justknimeit-S2-09_Victor

In this challenge, your goal is to see which features are the most important in predicting the quality of wine. After doing this analysis, you should create a visualization that shows the features' importances in order.
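One way to read "importances in order" is as a ranking sorted from most to least important, drawn as horizontal bars. The sketch below is plain Python rather than KNIME. The function names are hypothetical, and the feature names and scores are placeholders, not results from this workflow.

```python
def rank_importances(importances):
    """Return (feature, score) pairs sorted from most to least important."""
    return sorted(importances.items(), key=lambda kv: kv[1], reverse=True)


def text_bar_chart(ranked, width=40):
    """Render ranked importances as horizontal text bars, longest first."""
    # Scale every bar against the largest score; fall back to 1.0 if it is 0.
    top = max(score for _, score in ranked) or 1.0
    lines = []
    for name, score in ranked:
        bar = "#" * round(width * score / top)
        lines.append(f"{name:<12} {bar} {score:.3f}")
    return "\n".join(lines)


# Placeholder scores for illustration only (not taken from the workflow).
example = {"feature_a": 0.12, "feature_b": 0.41, "feature_c": 0.27}
ranked = rank_importances(example)
print(text_bar_chart(ranked))
```

Sorting before plotting is the only step that matters here; the same ranked list could feed any bar chart view.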

Workflow annotations:

Graphical Exploratory Data Analysis (training set only!)
Data recoding: 4 classes
Data recoding: 3 classes
Random Forest training, prediction and features importance on 3-classes classification
Random Forest training, prediction and features importance on 4-classes classification

Scorer results, in the order they appear in the annotations:

Accuracy = 87.812%, Error = 12.188%, Cohen's Kappa = 0.458
Accuracy = 71.875%, Error = 28.125%, Cohen's Kappa = 0.537

Nodes and their labels, in workflow order:

CSV Reader: Import data
Partitioning: Train/Test split 80:20
Histogram: Histogram quality
Statistics: Features statistics
Rule Engine: Recode quality 3&4 as 4, recode quality 7&8 as 7 (TEST)
String To Number: Convert target string quality into Integer
Rule Engine: Recode quality 3&4 as 4, recode quality 7&8 as 7 (TRAIN)
Random Forest Learner: RF on train set
Random Forest Predictor: RF predictions on test set
Scorer: Score RF on 4 classes
Rule Engine: Recode quality 3&4 as "Low", 5&6 as "Medium", 7&8 as "High" (TEST)
Rule Engine: Recode quality 3&4 as "Low", 5&6 as "Medium", 7&8 as "High" (TRAIN)
Global Feature Importance: Compute global features importance
Capture Workflow End: Capture end for global FI
Capture Workflow Start: Capture start for global FI
Random Forest Learner: RF on train set
Random Forest Predictor: RF predictions on test set
Scorer: Score RF on 3 classes
Capture Workflow Start: Capture start for global FI
Capture Workflow End: Capture end for global FI
Global Feature Importance: Compute global features importance
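The two recoding schemes and the Scorer metrics named above can be sketched in plain Python. This is an illustration under stated assumptions, not the workflow's Rule Engine or Scorer configuration: the function names are hypothetical, the quality scores are assumed to be the integers 3 to 8, and the toy labels at the end are made up to exercise the functions.

```python
from collections import Counter


def recode_4_classes(quality):
    """Merge 3 and 4 into 4, and 7 and 8 into 7; keep 5 and 6 unchanged."""
    if quality in (3, 4):
        return 4
    if quality in (7, 8):
        return 7
    return quality


def recode_3_classes(quality):
    """Map 3-4 to "Low", 5-6 to "Medium", 7-8 to "High"."""
    if quality in (3, 4):
        return "Low"
    if quality in (5, 6):
        return "Medium"
    if quality in (7, 8):
        return "High"
    raise ValueError(f"unexpected quality score: {quality}")


def accuracy(actual, predicted):
    """Fraction of rows where the predicted class equals the actual class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def cohen_kappa(actual, predicted):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement from the marginal class frequencies of both columns.
    p_expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when only one class occurs
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy labels for illustration only (not data from the workflow).
actual = [recode_3_classes(q) for q in [3, 5, 6, 7, 5, 8]]
predicted = ["Medium", "Medium", "Medium", "High", "Medium", "Medium"]
print(f"accuracy = {accuracy(actual, predicted):.3f}")
print(f"error    = {1 - accuracy(actual, predicted):.3f}")
print(f"kappa    = {cohen_kappa(actual, predicted):.3f}")
```

Kappa is worth reading alongside accuracy here: when one class dominates, as a merged middle class tends to, a model can reach high accuracy while agreeing with the true labels only modestly beyond chance.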
