
01. Building a Churn Predictor - Exercise

Building a Churn Predictor

This workflow is an example of how to train a basic machine learning model for a churn prediction task. In the optional steps, we integrate generative AI to create email drafts and to help us create a customized tree map visualization in JavaScript.

An example is provided with a small Kaggle dataset previously used in marketing research: https://www.kaggle.com/becksddf/churn-in-telecoms-dataset.

URL: Churn Prediction https://www.knime.org/knime-applications/churn-prediction

UTA Workshop - 10/03/25

Exercise 01 - Building a Churn Predictor

Learning objective: In this exercise, you'll learn how to build a simple churn predictor with a decision tree classifier. If you follow the optional steps, you'll also learn how to create custom visualizations in KNIME using JavaScript and how to prompt an LLM for a simple task.


Workflow description: This workflow is an example of how to train a basic machine learning model for a churn prediction task (Churn = 1). In the optional steps, we integrate generative AI to create email drafts and to generate a customized tree map visualization with JavaScript.


You'll find the instructions for the exercises in the yellow annotations.

Step 1. Read datasets.


  1. Read the CallsData.xls file from the data folder with the Excel Reader node.

  2. Read the ContractData.csv file from the data folder with the CSV Reader node.

Step 2. Data Preparation.


  1. With the Joiner node, inner join the two tables using attributes Area Code and Phone as matching criteria.

  2. Convert attribute Churn from number to string with the Number to String node.
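Outside KNIME, Steps 1 and 2 amount roughly to the pandas sketch below. It assumes only the file and column names given in the exercise (CallsData.xls, ContractData.csv, Area Code, Phone, Churn); the paths and variable names are illustrative.

  import pandas as pd

  # Step 1: read the two input files (assumed to sit in a local "data" folder).
  calls = pd.read_excel("data/CallsData.xls")
  contracts = pd.read_csv("data/ContractData.csv")

  # Step 2.1: inner join on Area Code and Phone, mirroring the Joiner node.
  joined = calls.merge(contracts, on=["Area Code", "Phone"], how="inner")

  # Step 2.2: convert Churn from number to string, mirroring the Number to String node.
  joined["Churn"] = joined["Churn"].astype(str)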

Step 3. Train and Evaluate a Decision Tree classifier.


  1. Partition the dataset with the Table Partitioner node such that 80% of it is reserved for training and the remaining 20% is reserved for testing. Set Sampling Strategy as Stratified and Group Column as Churn to guarantee that the class distributions for training and test are virtually the same.

  2. Create a Decision Tree classification model with the Decision Tree Learner node. Set Class column as Churn.

  3. Connect the output of the Decision Tree Learner node, and the test partition output from the Table Partitioner node, to the Decision Tree Predictor node.
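For reference, the stratified 80/20 split and the decision tree of Step 3 correspond roughly to the scikit-learn sketch below, continuing from the pandas sketch above. It assumes the feature columns are already numeric, which the KNIME nodes do not require.

  from sklearn.model_selection import train_test_split
  from sklearn.tree import DecisionTreeClassifier

  X = joined.drop(columns=["Churn"])   # feature columns (assumed numeric here)
  y = joined["Churn"]                  # class column

  # 80/20 split, stratified on Churn so both partitions keep the same class
  # distribution (Table Partitioner with Stratified sampling on Churn).
  X_train, X_test, y_train, y_test = train_test_split(
      X, y, train_size=0.8, stratify=y, random_state=42
  )

  # Decision Tree Learner + Decision Tree Predictor.
  tree = DecisionTreeClassifier(random_state=42)
  tree.fit(X_train, y_train)
  y_pred = tree.predict(X_test)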

Step 4. Model Evaluation.


  1. Use the Scorer node to evaluate the classification model.
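In the sketch above, the confusion matrix and accuracy reported by the Scorer node would roughly correspond to:

  from sklearn.metrics import accuracy_score, confusion_matrix

  print(confusion_matrix(y_test, y_pred))
  print("Accuracy:", accuracy_score(y_test, y_pred))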

Step 5. (Optional) Create a Tree Map Visualization


  1. Group phone numbers by state, area code, and predicted churn class with the GroupBy node. In tab Groups, select attributes State and Prediction (Churn). In tab Manual Aggregation, select attribute Phone and set Aggregation as Unique Count.

  2. With the Row Filter node, filter column Prediction (Churn) keeping only the values that are equal to 1.

  3. Rename column Unique count(Phone) to Count(Phone) and column Prediction (Churn) to Prediction_Churn with the Column Renamer node.

  4. Use the Generic ECharts View node to generate a tree map visualization for the top 10 states with the most predicted churns. If you do not want to write the JavaScript code yourself, have a look at the Tree Map Help metanode, or click Ask K-AI inside the node and let the copilot help you!
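Steps 5.1 to 5.3 correspond roughly to the pandas sketch below; the treemap itself is then drawn with JavaScript inside the Generic ECharts View node (Step 5.4). Here, predictions stands for the output table of the Decision Tree Predictor node (the test rows plus the Prediction (Churn) column); that name and anything else not taken from the exercise is an assumption.

  # Step 5.1: unique phone numbers per State and predicted class (GroupBy node).
  grouped = (
      predictions.groupby(["State", "Prediction (Churn)"])["Phone"]
      .nunique()
      .reset_index(name="Unique count(Phone)")
  )

  # Step 5.2: keep only the rows predicted to churn (Row Filter node).
  # Prediction (Churn) is a string here, because Churn was converted in Step 2.2.
  churned = grouped[grouped["Prediction (Churn)"] == "1"]

  # Step 5.3: rename the columns for the visualization (Column Renamer node).
  churned = churned.rename(columns={
      "Unique count(Phone)": "Count(Phone)",
      "Prediction (Churn)": "Prediction_Churn",
  })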

Step 6. Use an LLM to Create Messages to Customers Who Are Predicted to Churn


  1. Provide a username and password for the Get API Key component from http://tinyurl.com/KNIME-Agentic-Key. This will allow you to access OpenAI resources for this workshop free of charge.

  2. Connect the output of the Get API Key component (a flow variable) to the top left of the OpenAI Authenticator node. You'll see a red circle on top of the node, indicating its flow variable port, as you drag the output of the Get API Key component towards it.

  3. Once you're authenticated, select an LLM with the OpenAI LLM Selector node. Set Model ID as gpt-4.1-nano, Maximum response length (token) as 200, and Temperature as 0.2.

  4. In parallel, connect the output of the Decision Tree Predictor node to a Row Filter node, filtering column Prediction (Churn) so that it only keeps the values that are equal to 1.

  5. Send the data to an Expression node to create a prompt for the LLM. If you're having trouble creating the prompt, check the metanode Prompt Help.

  6. Connect the outputs of the OpenAI LLM Selector node and Expression node to the LLM Prompter node.
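Under the hood, the OpenAI Authenticator, OpenAI LLM Selector, and LLM Prompter nodes amount roughly to one chat-completion request per row. Below is a minimal sketch with the OpenAI Python SDK, using the settings from Step 6.3; the prompt text is hypothetical (the Prompt Help metanode contains the intended one).

  from openai import OpenAI

  client = OpenAI(api_key="...")  # the workshop key obtained via the Get API Key component

  def draft_email(row):
      # Hypothetical prompt; adapt it or use the one from the Prompt Help metanode.
      prompt = (
          f"Write a short, friendly retention email to a customer in {row['State']} "
          f"(area code {row['Area Code']}) who is predicted to cancel their phone plan."
      )
      response = client.chat.completions.create(
          model="gpt-4.1-nano",   # Model ID (Step 6.3)
          max_tokens=200,         # Maximum response length (token)
          temperature=0.2,
          messages=[{"role": "user", "content": prompt}],
      )
      return response.choices[0].message.content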

