
Basic Customer Segmentation Use Case with weka--Telecom dataset

Basic Customer Segmentation
Clustering of churn data leading to three clear clusters.
Last amended: 30th Dec, 2019
Dataset: https://www.kaggle.com/becksddf/churn-in-telecoms-dataset (Kaggle)

Columns considered: Total: 8
==============
1. DayMins
2. DayCharge
3. EveMins
4. Evecharge
5. IntlMins
6. IntlCalls
7. IntlCharge
8. CustServeCalls
===============
3 clusters and 99 iterations.

Manhattan distance: when the distance metric is Manhattan, also include Intl Plan and Vmail Plan, as well as Eve Calls.

Try all three normalization methods to get clusters. Try with both Euclidean and Manhattan distances. With Manhattan, one can include more variables in the set.

Use the Column Filter to select just the eight columns listed above. Remove the 'Churn' column also. (For plotting: select 8 + churn column.)

Scree Plot
No. of clusters (x) vs sum of within-cluster squared distances (y, Euclidean). For every x (say 2), develop a k-means model using the SimpleKMeans node and, from the model output, note down 'Sum of squared errors'; this is 'y'. Enter each pair (x, y) into the Table Creator node. Then get the next (x, y), and so on.

Do selected data have some structure? Yes. Compare it with random data.
Do parallel plots show any structure when data is random? No.

Workflow nodes (node type: label)
File Reader (deprecated): Reading ContractData.csv
Joiner (deprecated): Join two streams
Normalizer (PMML): Normalize numerical cols [0,1]
Excel Reader (XLS) (deprecated): Reading CallsData.xls
Number To String: Convert churn to String
Scatter Plot (local): Node 145
Color Manager: Node 146
Shape Manager: Node 148
Scatter Plot: Day Mins vs eve charge. See clear clusters
Shuffle: To select 2500 pts for plotting
MISSING 3D Scatter Plot (Plotly): Day Mins vs eve charge vs Intl charges. Rotate and see clear clusters
Column Filter: With Manhattan one can be more liberal in selection of columns
MISSING SimpleKMeans (3.7): Try both Eucl & Manhattan
MISSING Weka Cluster Assigner (3.7): Node 158
Number To String: Node 159
Table Creator: Node 162
Line Plot: Node 163
Normalizer: Node 164
MISSING 2D/3D Scatterplot: Node 165
Parallel Coordinates Plot: Do selected data have some structure? Yes. Compare it with random data
MISSING Random Data Generator: column c4 is target
Normalizer: normalize all but column c4
Parallel Coordinates Plot: No pattern
Number To String: Transform c4 to string
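The scree-plot procedure described in the annotations (normalize to [0,1], fit k-means for each number of clusters, note the sum of squared errors) can also be sketched outside KNIME. The snippet below is only an illustration under stated assumptions: it uses synthetic two-column data rather than the telecom dataset, a plain Lloyd's k-means with Euclidean distance as a stand-in for the SimpleKMeans node, and the hypothetical helper names `min_max_normalize` and `kmeans_sse`. It does not reproduce Weka's implementation or its Manhattan option.

```python
import random


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(rows, k, max_iter=99, seed=0):
    """Run plain Lloyd's k-means and return the within-cluster sum of squared errors."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct rows picked at random.
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: sq_euclidean(r, centroids[j]))
            clusters[nearest].append(r)
        # New centroid = column-wise mean; an empty cluster keeps its old centroid.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_euclidean(r, c) for c in centroids) for r in rows)


# Synthetic stand-in data: three blobs in two dimensions (not the telecom data).
rng = random.Random(42)
centers = [(10, 10), (50, 80), (90, 20)]
data = [[cx + rng.gauss(0, 3), cy + rng.gauss(0, 3)] for cx, cy in centers for _ in range(50)]

normalized = min_max_normalize(data)

# One (x, y) pair per cluster count, as one would enter into the table by hand.
scree = [(k, kmeans_sse(normalized, k)) for k in range(1, 7)]
for k, sse in scree:
    print(k, round(sse, 4))
```

Plotting the printed pairs as a line chart gives the scree plot; the "elbow" where the curve flattens suggests a reasonable number of clusters. Because the starting centroids are random, the values depend on `seed`, and a single run per k may land in a local optimum.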
