
version1

1. DATA PREPROCESSING AND EXPLORATORY ANALYSIS

2. FEATURE ENGINEERING

3. CLUSTERING

MEA 2025/2026 — Group Project — NEW Super Markets International

Group: GROUP BW

Members:

Engagement lens

Customer-value lens

Product-mix lens

Color each cluster for further visualisation
Color Manager
We reverse the scaling
Denormalizer
Segments profile
GroupBy
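
The segment profile step can be pictured as a per-cluster average of each feature. The following is a minimal sketch in plain Python, not the workflow's actual GroupBy configuration: the rows are made-up values, and the choice of the mean as the aggregation is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration; the real table comes from the workflow.
rows = [
    {"cluster": "cluster_0", "Monetary": 100.0, "Recency": 10.0},
    {"cluster": "cluster_0", "Monetary": 300.0, "Recency": 30.0},
    {"cluster": "cluster_1", "Monetary": 50.0, "Recency": 80.0},
]

def profile(rows, key, features):
    """Group rows by `key` and average each feature within a group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        label: {f: mean(r[f] for r in members) for f in features}
        for label, members in groups.items()
    }

segments = profile(rows, "cluster", ["Monetary", "Recency"])
```

Each entry of `segments` then holds one row of the profile table, keyed by cluster label.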
Percentage_Canned
Math Formula
Renames cluster_2
String Replacer
Percentage_Beverages
Math Formula
Percentage_Frozen
Math Formula
Renames cluster_0
String Replacer
We reverse the scaling
Denormalizer
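
Reversing the scaling matters because cluster centres computed on normalized data are hard to read in business terms. The sketch below assumes min-max normalization (the workflow does not state which method the Normalizer uses) and uses made-up income values.

```python
def fit_min_max(values):
    """Return the (min, max) pair needed to scale and later unscale a column."""
    return min(values), max(values)

def normalize(values, lo, hi):
    # Map each value to [0, 1]; a constant column maps to 0.0.
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def denormalize(scaled, lo, hi):
    # Invert the scaling so profiles read in the original units.
    return [s * (hi - lo) + lo for s in scaled]

income = [20000.0, 50000.0, 80000.0]  # illustrative values only
lo, hi = fit_min_max(income)
scaled = normalize(income, lo, hi)
restored = denormalize(scaled, lo, hi)
```

The key point is that the same `lo` and `hi` fitted before clustering must be reused when reversing, otherwise the restored values drift.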
Renames cluster_1
String Replacer
Percentage_Perishables
Math Formula
Confirms cluster separation
Distance Matrix Calculate
Normalizer
Renames cluster_0
String Replacer
Segments profile
GroupBy
Renames cluster_2
String Replacer
Renames cluster_1
String Replacer
Bar Chart
Bar Chart
Bar Chart
Column Filter
Silhouette Coefficient
Silhouette Coefficient
Column Filter
Silhouette Coefficient
Silhouette Coefficient
Silhouette Coefficient
Numerical profile: mean, std, missing counts per variable
Statistics
Strong skewness detected in Recency
Histogram
We uploaded the dataset
Excel Reader
Table Manipulator
Monetary vs. Income
Scatter Plot
Replace the 0 in Education with the most frequent value
String Replacer
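
Replacing an invalid code with the most frequent value is a mode imputation. A minimal sketch follows, assuming Education is stored as text and that "0" is the only invalid code; the category labels are invented for illustration.

```python
from collections import Counter

def replace_with_mode(values, invalid="0"):
    """Replace every `invalid` entry with the most frequent valid value."""
    valid = [v for v in values if v != invalid]
    most_common = Counter(valid).most_common(1)[0][0]
    return [most_common if v == invalid else v for v in values]

education = ["Graduate", "0", "Graduate", "Master", "0"]  # hypothetical labels
cleaned = replace_with_mode(education)
```

The mode is computed from the valid entries only, so the invalid code can never be chosen as its own replacement.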
Silhouette Coefficient
Column Filter
CUSTID column was corrected
RowID
Column Filter
Correlation Insights
Linear Correlation
Missing values handled in Income, Gender and Marital Status
Missing Value
k-Means with k=4
k-Means
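
For readers unfamiliar with what the k-Means node does for a given k, here is a compact sketch of Lloyd's algorithm in plain Python. It is an illustration, not the node's implementation: centroids start at the first k points for reproducibility, and the two-dimensional points are made up.

```python
from math import dist

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; centroids start at the first k points (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Illustrative, already-scaled points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
labels, centroids = k_means(points, k=2)
```

Running the same function with k=3, k=4 and k=5 gives the candidate partitions that a quality score can then compare.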
Histogram
k-Means with k=5
k-Means
Cap values greater than 100 in Internet
Rule Engine
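
The capping rule amounts to clipping at an upper bound. A minimal sketch, using made-up Internet values and assuming that values at or below 100 are left untouched:

```python
def cap_percentage(value, upper=100.0):
    """Clip a percentage at `upper`; smaller values pass through unchanged."""
    return min(value, upper)

internet = [12.0, 100.0, 135.0]  # illustrative values only
capped = [cap_percentage(v) for v in internet]
```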
k-Means with k=5
k-Means
Bar Chart
k-Means with k=3
k-Means
Numeric Outliers
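
One common way to flag numeric outliers is the interquartile-range rule. The sketch below assumes that rule with the conventional multiplier of 1.5 and only flags values; how the workflow treats flagged rows is not stated, and the data is made up.

```python
from statistics import quantiles

def iqr_outliers(values, multiplier=1.5):
    """Return the values lying outside [Q1 - m*IQR, Q3 + m*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [v for v in values if v < lower or v > upper]

monetary = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 500.0]  # illustrative values
outliers = iqr_outliers(monetary)
```

Quartile definitions differ slightly between tools, so the exact bounds may not match another implementation on small samples.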
k-Means with k=3
k-Means
k-Means with k=4
k-Means
Table Manipulator
Shows us that Gender contains missing values
Bar Chart
Helps us detect a missing value in Marital Status
Bar Chart
Helps us detect a 0 in Education
Bar Chart
k-Means with k=4
k-Means
String Replacer
k-Means with k=3
k-Means
Table View
Monetary is strongly right-skewed
Histogram
String Replacer
Color each cluster for further visualisation
Color Manager
Numeric Outliers
k-Means with k=5
k-Means
Internet values above 100 detected (impossible, since it is a percentage)
Histogram
String Replacer
Statistics
Silhouette Coefficient
String Replacer
We reverse the scaling
Denormalizer
Column Filter
Color each cluster for further visualisation
Color Manager
String Replacer
Confirms cluster separation
Distance Matrix Calculate
Missing Value
Silhouette Coefficient
Income_Share
Math Formula
Renames cluster_0
String Replacer
Statistics
Top overall score
Silhouette Coefficient
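
To make the comparison of scores concrete, the mean silhouette can be computed as below. This is a plain-Python sketch using Euclidean distance, which is an assumption about the setup; the points and both labelings are made up to contrast a good and a poor partition.

```python
from math import dist
from statistics import mean

def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q in members)
            for other, members in clusters.items()
            if other != l
        )
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return mean(scores)

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups nearby points
bad = silhouette_score(points, [0, 1, 0, 1])   # splits nearby points apart
```

Scores near 1 indicate compact, well-separated clusters, while negative scores suggest points sit closer to another cluster than to their own, so the k with the highest mean score is the natural candidate.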
Excel Writer
Avg_Visit
Math Formula
Segments profile
GroupBy
Statistics
Percentage_Others
Math Formula
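
The Percentage_* features express each product category as a share of a customer's total spend. A minimal sketch follows, assuming the total is the sum of the five categories; the input column names and amounts are hypothetical.

```python
def percentage_share(category_spend, total_spend):
    """Share of total spend in percent; 0.0 when nothing was spent."""
    if total_spend == 0:
        return 0.0
    return 100.0 * category_spend / total_spend

# Hypothetical spend per category for one customer.
customer = {
    "Canned": 50.0,
    "Beverages": 30.0,
    "Frozen": 20.0,
    "Perishables": 80.0,
    "Others": 20.0,
}
total = sum(customer.values())
shares = {
    f"Percentage_{name}": percentage_share(spend, total)
    for name, spend in customer.items()
}
```

Working with shares rather than raw amounts keeps the product-mix view independent of how much a customer spends overall.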
Renames cluster_2
String Replacer
Renames cluster_1
String Replacer
Math Formula (Multi Column)
Confirms cluster separation
Distance Matrix Calculate
Bar Chart
