Icon

KNIME_​challenge24_​solution

KNIME_challenge23_solution
Version 1: Standard, no tuning or data prep. AutoML Partition of dataset for testing modelaccuracy Partition of dataset meant for trainingmodel Challenge 24: Modeling Churn Predictions - Part 2Level: Easy to MediumDescription: Just like in last week’s challenge, a telecom company wants you to predict which customers are going to churn (that is, going to canceltheir contracts) based on attributes of their accounts. One of your colleagues said that she was able to achieve a bit over 95% accuracy for the test datawithout modifying the training data at all, and using all given attributes exactly as they are. Again, the target class to be predicted is Churn (value 0corresponds to customers that do not churn, and 1 corresponds to those who do). What model should you train over the training dataset to obtain thisaccuracy over the test dataset? Can this decision be automated? Note 1: A simple, automated solution to this challenge consists of 5 nodes. Note 2: Inthis challenge, do not change the statistical distribution of any attribute or class in the datasets, and use all available attributes. Note 3: Need morehelp to understand the problem? Check this blog post out.Author: Aline Bessa Version2: oversampling minority groups in training data Readchurn_problem_training_data.csvformatchurn as strformatchurn as strReadchurn_problem_test_data.csvpredictchurnlearn and predictchurnAccuracy = 95.05%Accuracy = 95.65%Node 48predictchurnlearn and predictchurn CSV Reader Table Manipulator Table Manipulator CSV Reader Workflow Executor AutoML Scorer (JavaScript) Scorer (JavaScript) SMOTE Workflow Executor AutoML Version 1: Standard, no tuning or data prep. AutoML Partition of dataset for testing modelaccuracy Partition of dataset meant for trainingmodel Challenge 24: Modeling Churn Predictions - Part 2Level: Easy to MediumDescription: Just like in last week’s challenge, a telecom company wants you to predict which customers are going to churn (that is, going to canceltheir contracts) based on attributes of their accounts. One of your colleagues said that she was able to achieve a bit over 95% accuracy for the test datawithout modifying the training data at all, and using all given attributes exactly as they are. Again, the target class to be predicted is Churn (value 0corresponds to customers that do not churn, and 1 corresponds to those who do). What model should you train over the training dataset to obtain thisaccuracy over the test dataset? Can this decision be automated? Note 1: A simple, automated solution to this challenge consists of 5 nodes. Note 2: Inthis challenge, do not change the statistical distribution of any attribute or class in the datasets, and use all available attributes. Note 3: Need morehelp to understand the problem? Check this blog post out.Author: Aline Bessa Version2: oversampling minority groups in training data Readchurn_problem_training_data.csvformatchurn as strformatchurn as strReadchurn_problem_test_data.csvpredictchurnlearn and predictchurnAccuracy = 95.05%Accuracy = 95.65%Node 48predictchurnlearn and predictchurn CSV Reader Table Manipulator Table Manipulator CSV Reader Workflow Executor AutoML Scorer (JavaScript) Scorer (JavaScript) SMOTE Workflow Executor AutoML

Nodes

Extensions

Links