Challenge 25 - Modeling Churn Predictions III

Description: In this challenge series, the goal is to predict which customers of a certain telecom company are going to churn (that is, going to cancel their contracts) based on attributes of their accounts. Here, the target class to be predicted is Churn (value 0 corresponds to customers that do not churn, and 1 corresponds to those who do).

After automatically picking a classification model for the task, you achieved an accuracy of about 95% on the test data, but the model does not perform uniformly for both classes. In fact, it is better at predicting when a customer will not churn (Churn = 0) than when they will (Churn = 1). This imbalance can be verified by looking at how precision and recall differ for the two classes, or by noting that Cohen's kappa is a bit lower than 80% despite the very high accuracy.

How can you preprocess and re-sample the training data in order to make the classification a bit more powerful for class Churn = 1?

Note 1: Need more help to understand the problem? Check this blog post out.

Note 2: This problem is hard: do not expect to see a major performance increase for class Churn = 1. Also, verifying whether the performance increase is statistically significant will not be trivial. Still... give this challenge your best try!

Workflow overview
Sections: DATA INPUT, TRAIN MODELS & PREDICTION, CHECK RESULTS
Nodes: CSV Reader (Read Training Data ~80%), CSV Reader (Read Testing Data ~20%), SMOTE (K = 5; oversample minority class, Churn = 1), AutoML, Workflow Executor, Scorer (JavaScript)
Annotations: H2O AutoML - StackedEnsemble; Acc: 96.1 %; Cohen's kappa: 83.4 %
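The workflow handles the re-sampling step with KNIME's SMOTE node (K = 5). For readers who want to see the idea outside KNIME, here is a minimal Python sketch of SMOTE-style oversampling. It is an illustration under assumptions, not the node's implementation: the function name `smote`, the toy rows, and the choice to balance the classes exactly are all made up for this example, and it assumes every feature is numeric (categorical attributes would need encoding first).

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority-class rows.

    Hypothetical helper for illustration: each synthetic row lies on the
    segment between a randomly chosen minority row and one of its k nearest
    minority neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up training rows with two numeric features; not the challenge data.
train_majority = [[float(x), float(x % 7)] for x in range(40)]  # Churn = 0
train_minority = [[50.0 + x, 10.0 + x] for x in range(8)]       # Churn = 1

# Oversample only the training split until both classes have the same size.
new_rows = smote(train_minority, len(train_majority) - len(train_minority), k=5)
balanced_minority = train_minority + new_rows
```

Only the training split is re-sampled here. The test split should keep its original class distribution, otherwise accuracy and Cohen's kappa would be measured on synthetic rows rather than on real customers.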
