Icon

01_​Training_​a_​Churn_​Predictor_​Solution

Training a Churn Predictor
Data ReadingRead the following data tables from the datafolder and check the output tables to getfamiliar with the data CallsData.xlsContractData.csvTip 1: Drag&drop the files from the KNIMEExplorer to the canvas and check the previewin the configuration windowTip 2: Activate the checkbox "Table containscolumn headers ..." for the CallsData.xls tableCallsData.xlsActivate the checkbox "Table contains columnnames..." Pre-Processing1. Join the two data tables based on the columns "Area Code" and"Phone" using a Joiner node2. Change the data type of the columns Churn and Area Codefrom integer to string using the Number To String node3. Add a color to each row based on the column churn with theColor Manager node4. Create a training and a test set with Partitioning node Model Training and Evaluation1. Create a training and a test set with Partitioning nodeRecommended settings:- Select "Relative" and set it to 80 %- Select "Stratified sampling" and use the column "Churn"2. Train a decision tree with the Decision Tree Learner node to predict the which customers are likelyto churn by using the training set.Recommended settings:- Connect the upper output port of the Partitioning node with the Learner node- Select the column "Churn" as Class Column3. Apply the trained decision tree model to the test set (lower output port of the Partitioning node)using the decisoin tree predictor node.4. Evalutate the performance of the trained model using the Scorer node. Which customers are happier with their contract? Frequent phone users orinfrequent users?1. Use the Math Formula node to calculate the total number of minutesRecommended settings: use the following expression in the Math Formula node and append a newcolumn with the column name Mins Total$Day Mins$+$Eve Mins$+$Night Mins$+$Intl Mins$2. Use the Auto-Binner node to create 10 Bins Recommended settings:Include only the column “Min Total”, set the Number of bins to 10, and select Midpoints for the BinNaming option3. Use the Pivoting node to find out how many customers in a bin churned.Recommended settings:- Groups: Include Min Total [Binned]- Pivots: Include Churn- Manuel Aggregation: Any Column with aggregation method count4. Plot your results in a percentage area chart using the Stacked Area Chart nodeRecommended settings:- Column for x-axis: Mins Total [Binned]- Include all columns- Select Percentage-Area-Chart in the second tab (General Plot Options) Calls DataContractDatabased ofArea Code and Phonearea codeand churn ->StringColor by Churn80%vs. 20%class = ChurnNode 109Node 110Node 117Node 119Node 120Node 123 Excel Reader (XLS) File Reader Joiner Number To String Color Manager Partitioning DecisionTree Learner Decision TreePredictor Scorer Math Formula Auto-Binner Pivoting Stacked Area Chart Data ReadingRead the following data tables from the datafolder and check the output tables to getfamiliar with the data CallsData.xlsContractData.csvTip 1: Drag&drop the files from the KNIMEExplorer to the canvas and check the previewin the configuration windowTip 2: Activate the checkbox "Table containscolumn headers ..." for the CallsData.xls tableCallsData.xlsActivate the checkbox "Table contains columnnames..." Pre-Processing1. Join the two data tables based on the columns "Area Code" and"Phone" using a Joiner node2. Change the data type of the columns Churn and Area Codefrom integer to string using the Number To String node3. Add a color to each row based on the column churn with theColor Manager node4. Create a training and a test set with Partitioning node Model Training and Evaluation1. Create a training and a test set with Partitioning nodeRecommended settings:- Select "Relative" and set it to 80 %- Select "Stratified sampling" and use the column "Churn"2. Train a decision tree with the Decision Tree Learner node to predict the which customers are likelyto churn by using the training set.Recommended settings:- Connect the upper output port of the Partitioning node with the Learner node- Select the column "Churn" as Class Column3. Apply the trained decision tree model to the test set (lower output port of the Partitioning node)using the decisoin tree predictor node.4. Evalutate the performance of the trained model using the Scorer node. Which customers are happier with their contract? Frequent phone users orinfrequent users?1. Use the Math Formula node to calculate the total number of minutesRecommended settings: use the following expression in the Math Formula node and append a newcolumn with the column name Mins Total$Day Mins$+$Eve Mins$+$Night Mins$+$Intl Mins$2. Use the Auto-Binner node to create 10 Bins Recommended settings:Include only the column “Min Total”, set the Number of bins to 10, and select Midpoints for the BinNaming option3. Use the Pivoting node to find out how many customers in a bin churned.Recommended settings:- Groups: Include Min Total [Binned]- Pivots: Include Churn- Manuel Aggregation: Any Column with aggregation method count4. Plot your results in a percentage area chart using the Stacked Area Chart nodeRecommended settings:- Column for x-axis: Mins Total [Binned]- Include all columns- Select Percentage-Area-Chart in the second tab (General Plot Options) Calls DataContractDatabased ofArea Code and Phonearea codeand churn ->StringColor by Churn80%vs. 20%class = ChurnNode 109Node 110Node 117Node 119Node 120Node 123 Excel Reader (XLS) File Reader Joiner Number To String Color Manager Partitioning DecisionTree Learner Decision TreePredictor Scorer Math Formula Auto-Binner Pivoting Stacked Area Chart

Nodes

Extensions

Links