Automation of Data Prep and Modeling



This example learns a churn prediction model. The data describes
customers and their calling behavior, including minutes spent on their
contract, support cases, and other types of interaction between customer
and provider. The objective is to build a model that identifies
customers who are likely to churn so that targeted campaigns can be
conducted.

Workflow annotations

Reading and Blending
 - contract data + churn
 - behavioral (calls) data

Pre-processing
 - Join contract data and behavioral data
 - Convert Churn values to String to be used as class in upcoming classification
 - Reserve 80% of the rows for model training and remaining for model testing
 - Use same number of data rows for both classes in testing

Train and Optimize
 - Missing value imputation modelling
 - Optimize Random Forest parameters
 - Optimize threshold

Binary Classification
 - Train model with optimized parameters
 - Capture branch to deploy

Score and Deploy
 - Evaluate predictions based on confusion matrix and ROC.

Class encoding
 - Churn = 0: customer remained with contract
 - Churn = 1: customer quit contract

Node labels: Reading ContractData.csv; Database URL and Credentials; Join the contract data and the behavioral data; Area code and churn are converted to String.; train set; test set; Parameter Optimization; optimized model; apply new threshold; Inspect Classifier; Confusion Matrix - Default vs. Optimized

Nodes shown in the workflow: DB Connector, DB Table Selector, DB Reader, File Reader, Joiner, Number To String, Domain Calculator, Sampling, Missing Value, Missing Value (Apply), Random Forest Learner, Random Forest Predictor, Rule Engine
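
For readers who want to see the partitioning and threshold steps outside KNIME, here is a minimal Python sketch using only the standard library. It is not the workflow itself. The function names, the "Churn" key, the candidate thresholds, and the choice of F1 as the criterion for picking a threshold are all assumptions made for illustration; the workflow annotations only say that 80% of the rows are reserved for training, that both classes get the same number of rows in testing, and that a threshold is optimized.

```python
import random


def split_train_test(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and reserve train_fraction of them for training."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def equal_size_sample(rows, label_key="Churn", seed=0):
    """Downsample so every class has as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    smallest = min(len(group) for group in by_class.values())
    sample = []
    for group in by_class.values():
        sample.extend(rng.sample(group, smallest))
    return sample


def confusion_matrix(labels, scores, threshold):
    """Return (tp, fp, fn, tn); a score >= threshold predicts churn ("1")."""
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        predicted = "1" if score >= threshold else "0"
        if predicted == "1" and label == "1":
            tp += 1
        elif predicted == "1":
            fp += 1
        elif label == "1":
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 for the churn class; 0.0 when there are no positives at all."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def best_threshold(labels, scores, candidates):
    """Pick the candidate threshold with the highest F1 (assumed criterion)."""
    def score_of(threshold):
        tp, fp, fn, _tn = confusion_matrix(labels, scores, threshold)
        return f1_score(tp, fp, fn)
    return max(candidates, key=score_of)


if __name__ == "__main__":
    # Toy rows; Churn is a string, mirroring the Number To String step.
    rows = [{"Churn": "1"} for _ in range(20)] + [{"Churn": "0"} for _ in range(80)]
    train, test = split_train_test(rows)
    print(len(train), len(test))
    print(len(equal_size_sample(rows)))
```

In the workflow the class probabilities would come from the Random Forest Predictor; here any list of scores between 0 and 1 can stand in for them when calling `best_threshold`.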
