Santander customer satisfaction using decision tree

Last amended: 1st January, 2020

Title: Santander Customer Satisfaction
Which customers are happy customers?

Problem Description:
From frontline support teams to C-suites, customer satisfaction is a key measure of success. Unhappy customers don't stick around. What's more, unhappy customers rarely voice their dissatisfaction before leaving.

Santander Bank is asking Kagglers to help them identify dissatisfied customers early in their relationship. Doing so would allow Santander to take proactive steps to improve a customer's happiness before it's too late.

In this problem, you'll work with hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience.

Ref: Kaggle: https://www.kaggle.com/c/santander-customer-satisfaction

Consider '1' as Positive:
1. Measure Recall, Precision and F1 scores.
2. Results with ADASYN are better than with the decision tree applied directly.
3. Results are better with MDL pruning and Gini index.
4. Do not try Gain Ratio in DT. It takes a very long time.

Workflow annotations:
Santander data from Kaggle
Remove ID column
Remove any constant columns
Discover multiple categorical columns
Transform all categorical variables to String
Transform TARGET for stratified sampling
Stratified sampling, train=40%
Min-max normalization; try z-score also
Balance data
Join ROWID-wise
Split data 60/40 or 70/30
Train model
Apply model
Color rows as per TARGET values
Get total counts of 1 and 0 in test data
Calculates statistic measures: mean, max, min, variance, median, etc.
Confusion matrix: actual values are horizontal, predicted are vertical
Node 43

Nodes used:
CSV Reader, Column Filter, Constant Value Column Filter, Number To String (twice), Row Sampling, Normalizer, SMOTE, ADASYN--Balancing imbalanced data, Joiner, Partitioning, Decision Tree Learner, Decision Tree Predictor, Scorer, Statistics, Color Manager, Bar Chart, Timer Info
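
The workflow itself is built from KNIME nodes, not code. As a rough illustration of three of the steps noted above (removing constant columns, min-max normalization, and scoring with '1' as the positive class), here is a minimal Python sketch. The function names, the column names `var_a` and `var_b`, and the toy labels are all hypothetical and are not taken from the workflow or the Kaggle data; the sketch only mirrors the general techniques, not the exact behaviour of the KNIME nodes.

```python
from typing import Dict, List, Sequence, Tuple


def drop_constant_columns(table: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Keep only columns that contain more than one distinct value."""
    return {name: values for name, values in table.items() if len(set(values)) > 1}


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def confusion_counts(
    actual: Sequence[int], predicted: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(
    actual: Sequence[int], predicted: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(actual, predicted, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy, made-up data: 'var_a' is constant and gets dropped.
    table = {"var_a": [1.0, 1.0, 1.0], "var_b": [0.0, 2.0, 4.0]}
    cleaned = drop_constant_columns(table)
    scaled = {name: min_max_normalize(col) for name, col in cleaned.items()}
    print(scaled)

    # Toy, made-up labels; 1 is the positive class.
    actual = [1, 0, 0, 1, 0, 1, 0, 0]
    predicted = [1, 0, 1, 0, 0, 1, 0, 0]
    print(confusion_counts(actual, predicted))
    print(precision_recall_f1(actual, predicted))
```

Because the TARGET classes are imbalanced, precision, recall and F1 for class '1' are more informative than plain accuracy, which is why the notes ask for those scores when comparing the balanced and unbalanced runs.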
