Customer Segmentation

This workflow implements basic customer segmentation with the k-Means clustering algorithm. Customer segmentation can help Sales and Marketing departments identify hidden patterns in customer behavior and preferences and define better expansion and retention strategies:

  • The sample data provided represents customer transaction and spending behaviors, as well as customer demographics. Raw data is joined, partitioned (existing vs. new customers), and preprocessed (missing value handling, outlier detection, and normalization).

  • Using the Elbow method, the best number of clusters (k) can be estimated visually. Next, the k-Means algorithm segments the existing customers, and the resulting clusters are assigned to the new customers.

  • The resulting clusters can be visualized and inspected in an interactive view, and their quality assessed with a scoring metric (the Silhouette coefficient). The clustered new data can then be exported for further processing or reporting.

Clustering

  1. k-Means

  2. Assign clusters to new customers
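Outside the workflow, the two clustering steps can be sketched in plain Python. This is a minimal illustration of Lloyd's k-Means algorithm, not the implementation behind the k-Means or Cluster Assigner nodes. The function names, the toy data, and the deterministic farthest-first initialization are all assumptions made here for reproducibility. The function also returns the within-cluster sum of squares (WCSS), the quantity usually plotted against k in an Elbow chart.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def init_centroids(points, k):
    """Farthest-first initialization (an assumption of this sketch)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels, wcss)."""
    centroids = init_centroids(points, k)
    labels = assign_clusters(points, centroids)
    for _ in range(max_iter):
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Assignment step: stop once no point changes cluster.
        new_labels = assign_clusters(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    wcss = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
    return centroids, labels, wcss

# Hypothetical, already normalized feature vectors for existing customers.
existing = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
            [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
centroids, labels, wcss = k_means(existing, k=2)

# New customers receive the cluster of the nearest fitted centroid.
new_customers = [[1.1, 0.9], [4.9, 5.2]]
new_labels = assign_clusters(new_customers, centroids)

# WCSS per candidate k: the values an Elbow chart would display.
wcss_by_k = {k: k_means(existing, k)[2] for k in range(1, 5)}
```

New customers are not used to fit the centroids here; they are only mapped onto clusters learned from the existing customers, which matches the order of the two steps listed above.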

Data Pre-Processing

  1. Data partitioning (current vs. new customers to cluster)

  2. Missing value handling

  3. Outlier detection

  4. Normalization (z-score)

  5. Dimensionality reduction (PCA) for cluster visualization

  6. Elbow method to determine the value of k
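As a rough illustration of steps 2 to 4 on a single numeric column, the sketch below uses only the Python standard library. The concrete strategies are assumptions, since the workflow description does not state them: missing values are replaced by the median, outliers are rows outside the 1.5 × IQR fences and are dropped, and the z-score uses the population standard deviation. The column values are made up.

```python
from statistics import mean, median, pstdev, quantiles

def impute_median(column):
    """Replace None entries with the median of the observed values."""
    fill = median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

def iqr_bounds(column, factor=1.5):
    """Return the fences Q1 - factor*IQR and Q3 + factor*IQR."""
    q1, _, q3 = quantiles(column, n=4)
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr

def z_score(column, mu, sigma):
    """Standardize values with a given mean and standard deviation."""
    return [(v - mu) / sigma for v in column]

# Hypothetical spend column for existing customers (None = missing value).
spend = [120.0, None, 95.0, 110.0, 105.0, 2000.0, 100.0]

filled = impute_median(spend)
low, high = iqr_bounds(filled)
kept = [v for v in filled if low <= v <= high]  # drops the 2000.0 outlier

# Fit the normalization parameters on existing customers only ...
mu, sigma = mean(kept), pstdev(kept)
z = z_score(kept, mu, sigma)

# ... and reuse them for new customers so both share one scale.
new_z = z_score([130.0], mu, sigma)
```

Reusing `mu` and `sigma` for the new customers keeps their features on the same scale as the data the clusters were fitted on.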

Cluster Visualization & Evaluation

  1. Denormalization

  2. Cluster Visualization

  3. Cluster quality evaluation (Silhouette coefficient)

  4. Save results
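The evaluation steps can be illustrated the same way. The sketch below computes the mean Silhouette coefficient from its textbook definition and inverts a z-score normalization; it is not the code behind the Silhouette Coefficient or Denormalizer nodes, and the function names and toy data are assumptions. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b).

```python
from math import dist
from statistics import mean

def silhouette(points, labels):
    """Mean Silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # dist(p, p) is 0, so dividing by len(own) - 1 excludes p itself.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(mean(dist(p, q) for q in other)
                for key, other in clusters.items() if key != label)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return mean(scores)

def denormalize(column, mu, sigma):
    """Invert a z-score normalization: x = z * sigma + mu."""
    return [z * sigma + mu for z in column]

# Hypothetical normalized points forming two well-separated groups.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
          [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
good = silhouette(points, [0, 0, 0, 1, 1, 1])
bad = silhouette(points, [0, 1, 0, 1, 0, 1])

# Map normalized values back to original units for readable reporting.
original_units = denormalize([0.0, 1.0, -1.0], mu=100.0, sigma=10.0)
```

Values close to 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned points.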


Data Reading

  1. Customer transaction and spending behavior

  2. Customer demographics
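A small sketch of how two such tables might be joined and then partitioned, using only the Python standard library. Everything specific in it is hypothetical: the column names, the `customer_id` join key, the `is_new` flag used for partitioning, and the sample rows. The workflow description does not state how the join or the split is configured.

```python
import csv
import io

# Hypothetical stand-ins for the two input files.
transactions_csv = """customer_id,total_spend,n_orders
1,120.0,4
2,95.5,2
3,310.0,9
"""
demographics_csv = """customer_id,age,is_new
1,34,no
2,51,no
3,27,yes
"""

def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def inner_join(left, right, key):
    """Keep rows whose key appears in both tables and merge their columns."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

joined = inner_join(read_rows(transactions_csv),
                    read_rows(demographics_csv),
                    "customer_id")

# Partition on the assumed flag: existing customers are used to fit the
# clusters, new customers are only assigned to them later.
existing = [row for row in joined if row["is_new"] == "no"]
new = [row for row in joined if row["is_new"] == "yes"]
```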

Missing value handling, outlier detection, normalization
Apply data processing
Top: Existing Customers; Bottom: New Customers
Join and partition
Customer transactions
Excel Reader
To new customer data
Cluster Assigner
DBSCAN
Customer demographics
CSV Reader
Scorer
Numeric Distances
k=4
k-Means
Denormalizer
Visualize clusters
Find number of clusters (k) by visual inspection
Elbow Method
Missing value handling, outlier detection, normalization
Data processing
Check cluster quality
Silhouette Coefficient
Save clustered new data
Excel Writer
