
Pro_LDD_Training_Churn_Data

Training a Churn Predictor - Solution

Solution to exercise 13 for the KNIME Analytics Platform for Data Wranglers course
- Join data from different sources
- Apply color formatting using the Color Manager node
- Create a training and a test set by partitioning the data
- Train a decision tree model and evaluate its performance
- Calculate a feature using the Math Formula node
- Group data into bins
- Pivot and visualize data

URL: Churn Prediction https://www.knime.org/knime-applications/churn-prediction
URL: Slides (KNIME Analytics Platform for Data Wranglers) https://www.knime.com/form/material-download-registration

Data Reading

Read the following data tables from the data folder and check the output tables to get familiar with the data:
- CallsData.xls
- ContractData.csv

Pre-Processing

1. Join the two data tables based on the columns "Area Code" and "Phone" using a Joiner node and add the rows of the Newtable_pro table

2. Change the data type of the columns Churn and Area Code from integer to string using the Number To String node

3. Add a color to each row based on the column "Churn" with the Color Manager node

4. Use a Metanode
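
As a rough illustration of steps 1 and 2 outside KNIME, the sketch below joins two small tables on "Area Code" and "Phone" and then converts "Churn" and "Area Code" to strings. It assumes an inner join; the sample rows, the split of columns between the two tables, and the helper names `inner_join` and `number_to_string` are hypothetical and are not taken from the workflow.

```python
# Hypothetical sample rows; the real tables come from CallsData.xls and ContractData.csv.
calls = [
    {"Area Code": 415, "Phone": "382-4657", "Day Mins": 265.1},
    {"Area Code": 408, "Phone": "371-7191", "Day Mins": 161.6},
]
contracts = [
    {"Area Code": 415, "Phone": "382-4657", "Churn": 0},
    {"Area Code": 408, "Phone": "371-7191", "Churn": 1},
    {"Area Code": 510, "Phone": "358-1921", "Churn": 0},  # no matching call record
]


def inner_join(left, right, keys):
    """Inner join two lists of dicts on the given key columns."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


def number_to_string(rows, columns):
    """Return copies of the rows with the given columns converted to str."""
    return [
        {name: (str(value) if name in columns else value) for name, value in row.items()}
        for row in rows
    ]


joined = inner_join(calls, contracts, ["Area Code", "Phone"])
table = number_to_string(joined, {"Area Code", "Churn"})
print(table)
```

Converting "Churn" to a string matters later: the class column of a decision tree is treated as a nominal value, not as a number.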







Model Training and Evaluation

1. Create a training and a test set with the Partitioning node
Recommended settings:
- Select "Relative" and set it to 80 %
- Select "Stratified sampling" and use the column "Churn"

2. Train a decision tree with the Decision Tree Learner node to predict which customers are likely to churn, using the training set
Recommended settings:
- Connect the upper output port of the Partitioning node with the Learner node
- Select the column "Churn" as Class Column

3. Apply the trained decision tree model to the test set (lower output port of the Partitioning node) using the Decision Tree Predictor node

4. Evaluate the performance of the trained model using the Scorer node and the ROC Curve node
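
To make the partitioning and scoring steps more concrete, here is a minimal plain-Python sketch of an 80 % stratified split on "Churn", followed by the accuracy and confusion counts that a scoring step would report. It is not how the KNIME nodes are implemented; the function names, the fixed seed, and the sample data are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, class_column, train_fraction=0.8, seed=0):
    """Split rows into (train, test) so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[class_column]].append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's list is not shuffled
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs, i.e. the cells of a confusion matrix."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of rows whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical data: 20 churners ("1") and 80 non-churners ("0").
rows = [{"id": i, "Churn": "1" if i < 20 else "0"} for i in range(100)]
train, test = stratified_split(rows, "Churn")
print(len(train), len(test))

# Hypothetical predictions for four test rows.
actual = ["1", "0", "0", "1"]
predicted = ["1", "0", "1", "1"]
print(accuracy(actual, predicted), confusion_counts(actual, predicted))
```

Stratifying on "Churn" keeps the share of churners the same in both sets, which is useful when one class is much rarer than the other.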
Optional: Which customers are happier with their contract? Frequent phone users or infrequent users?

1. Use the Math Formula node to calculate the total number of minutes
Recommended settings: use the following expression in the Math Formula node and append a new column with the column name Mins Total
$Day Mins$+$Eve Mins$+$Night Mins$+$Intl Mins$

2. Use the Auto-Binner node to create 10 bins
Recommended settings: include only the column "Mins Total", set the Number of bins to 10, and select Midpoints for the Bin Naming option

3. Use the Pivoting node to find out how many customers in a bin churned
Recommended settings:
- Groups: Include Mins Total [Binned]
- Pivots: Include Churn
- Manual Aggregation: Any column with aggregation method count

4. Plot your results in a percentage area chart using the Stacked Area Chart node
Recommended settings:
- Column for x-axis: Mins Total [Binned]
- Include all columns
- Select Percentage-Area-Chart in the second tab (General Plot Options)
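
The optional steps can be sketched in plain Python as follows: sum the four minute columns, assign each total to one of 10 bins named by its midpoint, and count customers per bin and churn value. The sketch assumes equal-width bins; the helper names and the four sample rows are hypothetical.

```python
from collections import Counter

MINUTE_COLUMNS = ["Day Mins", "Eve Mins", "Night Mins", "Intl Mins"]


def add_total(rows):
    """Append a "Mins Total" column holding the sum of the four minute columns."""
    return [{**r, "Mins Total": sum(r[c] for c in MINUTE_COLUMNS)} for r in rows]


def bin_midpoints(values, n_bins=10):
    """Assign each value to one of n_bins equal-width bins, labeled by bin midpoint."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:  # all values identical: a single bin
        return [lo] * len(values)
    labels = []
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the maximum goes into the last bin
        labels.append(lo + (idx + 0.5) * width)
    return labels


def pivot_counts(bins, classes):
    """Count rows per bin (group) and class value (pivot)."""
    table = {}
    for b, c in zip(bins, classes):
        table.setdefault(b, Counter())[c] += 1
    return table


# Hypothetical customers; real values come from the joined table.
rows = [
    {"Day Mins": 100.0, "Eve Mins": 50.0, "Night Mins": 40.0, "Intl Mins": 10.0, "Churn": "0"},
    {"Day Mins": 300.0, "Eve Mins": 200.0, "Night Mins": 180.0, "Intl Mins": 20.0, "Churn": "1"},
    {"Day Mins": 120.0, "Eve Mins": 60.0, "Night Mins": 50.0, "Intl Mins": 10.0, "Churn": "0"},
    {"Day Mins": 280.0, "Eve Mins": 210.0, "Night Mins": 190.0, "Intl Mins": 10.0, "Churn": "0"},
]
with_total = add_total(rows)
totals = [r["Mins Total"] for r in with_total]
bins = bin_midpoints(totals)
table = pivot_counts(bins, [r["Churn"] for r in with_total])
print(table)
```

Dividing each count by the total of its bin gives the shares that a percentage area chart displays.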
DDO Laboratory - Churn Prediction

How to train a basic machine learning model for a churn prediction task using a Decision Tree algorithm.
CSV Reader
Excel Reader
Stacked Area Chart (JavaScript) (legacy)
Table Reader
ROC_Immagine
Image to Report (BIRT)
Pre-Processing
ROC Curve (JavaScript) (legacy)
Scorer
Table Partitioner
Decision Tree Learner
Immagine_AreaChart
Image to Report (BIRT)
Decision Tree Predictor
DatiScoring
Data to Report (BIRT)
Pivot
Math Formula
Binner
