
Basic Customer Segmentation Use Case with weka--Telecom dataset

Workflow

Basic Customer Segmentation
Tags: clustering, k-Means, customer segmentation, Manhattan distance, screeplot
My folder: D:\data\OneDrive\Documents\knime-workspace\Example Workflows\Customer Intelligence\Customer Segmentation
Last amended: 30th Dec, 2019

Clustering of churn data leading to three clear clusters.
Dataset: https://www.kaggle.com/becksddf/churn-in-telecoms-dataset (Kaggle)

Workflow annotations

Columns considered (total: 8):
1. DayMins
2. DayCharge
3. EveMins
4. Evecharge
5. IntlMins
6. IntlCalls
7. IntlCharge
8. CustServeCalls

Use a Column Filter to select just these eight columns and remove the 'Churn' column as well. Cluster with 3 clusters and 99 iterations.

Distance: try both Euclidean and Manhattan distances. With Manhattan, one can include more variables in the set: also include Intl Plan and Vmail Plan, as well as Eve Calls.

Normalization: try all three normalization methods to get clusters.

Scree plot: number of clusters (x) vs. sum of within-cluster squared distances (y, Euclidean). For every x (say 2), develop a k-means model using the SimpleKMeans node and note down 'Sum of squared errors' from the model output; this is y. Enter each pair (x, y) into the Table Creator node. Then get the next (x, y), and so on.

Do the selected data have some structure? Yes. Compare them with random data.
Do parallel plots show any structure when data is random? No.

Node labels

Reading ContractData.csv
Reading CallsData.xls
Join two streams
Normalize numerical cols [0,1]
Convert churn to String
Select 8 + churn column
To select 2500 pts for plotting
Day Mins vs eve charge: see clear clusters
Day Mins vs eve charge vs Intl charges: rotate and see clear clusters
With Manhattan one can be more liberal in selection of columns
Try both Eucl & Manhattan
Random data: column c4 is target; normalize all but column c4; transform c4 to string; no pattern

Node types shown on the canvas

File Reader, Joiner, Normalizer (PMML), Excel Reader (XLS) (deprecated), Number To String, Scatter Plot (local), Color Manager, Shape Manager, Scatter Plot, Shuffle, 3D Scatter Plot (Plotly), Column Filter, SimpleKMeans (3.7), Weka Cluster Assigner (3.7), Number To String, Table Creator, Line Plot, Normalizer, 2D/3D Scatterplot, Parallel Coordinates Plot, Random Data Generator, Normalizer, Parallel Coordinates Plot, Number To String
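The scree-plot procedure from the annotations (normalize to [0,1], fit k-means for each number of clusters x, record the sum of squared errors y) can also be sketched outside KNIME. The Python below is a minimal illustration under stated assumptions, not the workflow's actual implementation: it uses a plain Lloyd-style k-means with Euclidean distance written from scratch rather than Weka's SimpleKMeans, and it runs on synthetic three-blob data rather than the telecom dataset. The function names (min_max_normalize, kmeans_sse, scree_points) are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(rows, k, iterations=99, seed=0):
    """Fit plain k-means and return the within-cluster sum of squared errors."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iterations):
        # Assignment step: each row joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda i: sq_dist(row, centroids[i]))
            clusters[nearest].append(row)
        # Update step: centroid = column-wise mean; empty clusters keep theirs.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(row, c) for c in centroids) for row in rows)


def scree_points(rows, max_k=6):
    """Return the (x, y) pairs for the scree plot: x = k, y = SSE."""
    return [(k, kmeans_sse(rows, k)) for k in range(1, max_k + 1)]


# Synthetic stand-in data (an assumption): three well-separated 2D blobs.
rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
data = [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
        for cx, cy in centers for _ in range(50)]

points = scree_points(min_max_normalize(data))
for k, sse in points:
    print(k, round(sse, 3))
```

Plotting the printed pairs as a line chart gives the scree plot; the "elbow" where the curve flattens suggests the number of clusters. With random starting centroids the curve is not guaranteed to decrease at every step, so in practice one would repeat each fit with several seeds.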

Download

Get this workflow from the following link: Download

Nodes

Basic Customer Segmentation Use Case with weka--Telecom dataset consists of the following 25 node(s):

Plugins

Basic Customer Segmentation Use Case with weka--Telecom dataset contains nodes provided by the following 7 plugin(s):