
justknimeit-S2-09_Victor 1-V2

justknimeit-S2-09_Victor

In this challenge, your goal is to see which features are the most important in predicting the quality of wine. After doing this analysis, you should create a visualization that shows the features' importances in order.
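The final step, showing the features' importances in order, can be sketched in plain Python. This is only an illustration, not the workflow's own Bar Chart node: the function names `rank_importances` and `text_bar_chart` are hypothetical, and the feature names and scores below are made-up placeholders, not results from the wine data.

```python
def rank_importances(importances):
    """Return (feature, importance) pairs sorted from most to least important."""
    return sorted(importances.items(), key=lambda kv: kv[1], reverse=True)


def text_bar_chart(ranked, width=40):
    """Render ranked importances as text bars scaled to the largest value."""
    top = max((value for _, value in ranked), default=0.0)
    lines = []
    for name, value in ranked:
        # Avoid division by zero when every importance is 0.
        bar_len = round(width * value / top) if top > 0 else 0
        lines.append(f"{name:<12} {'#' * bar_len} {value:.3f}")
    return lines


if __name__ == "__main__":
    # Placeholder scores for illustration only (NOT results from the workflow).
    placeholder = {"feature_a": 0.10, "feature_b": 0.45, "feature_c": 0.25}
    for line in text_bar_chart(rank_importances(placeholder)):
        print(line)
```

Sorting before plotting is what makes the chart readable as a ranking; any importance scores exported from the workflow could be passed in as a dictionary in the same way.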

Workflow annotations

Graphical Exploratory Data Analysis (training set only!)
Data recoding: 4 classes
Data recoding: 3 classes
Data recoding: quality as number (Double)
Random Forest training, prediction and features importance on 3-classes classification
Random Forest training, prediction and features importance on 4-classes classification
Random Forest training, prediction and features importance on regression task

Scorer results

Scorer results: Accuracy = 87.812%, Error = 12.188%, Cohen's Kappa = 0.458
Scorer results: Accuracy = 71.875%, Error = 28.125%, Cohen's Kappa = 0.537
Numeric Scorer results: R² = 0.47, MAE = 0.411, RMSE = 0.579

Workflow steps (node label: node type)

Import data: CSV Reader
Train/Test split 80:20: Partitioning
Histogram quality: Histogram
Features statistics: Statistics
Correlation matrix: Linear Correlation

Recode quality 3&4 as 4, recode quality 7&8 as 7 (TEST): Rule Engine
Convert target string quality into Integer: String To Number
Recode quality 3&4 as 4, recode quality 7&8 as 7 (TRAIN): Rule Engine
RF on train set: Random Forest Learner
RF predictions on test set: Random Forest Predictor
Score RF on 4 classes: Scorer
Capture start for global FI: Capture Workflow Start
Capture end for global FI: Capture Workflow End
Compute global features importance: Global Feature Importance

Recode quality 3&4 as "Low", 5&6 as "Medium", 7&8 as "High" (TEST): Rule Engine
Recode quality 3&4 as "Low", 5&6 as "Medium", 7&8 as "High" (TRAIN): Rule Engine
RF on train set: Random Forest Learner
RF predictions on test set: Random Forest Predictor
Score RF on 3 classes: Scorer
Capture start for global FI: Capture Workflow Start
Capture end for global FI: Capture Workflow End
Compute global features importance: Global Feature Importance

Train: String To Number
Test: String To Number
RF on train set: Random Forest Learner (Regression)
RF predictions on test set: Random Forest Predictor (Regression)
Score RF on regression task: Numeric Scorer

Calculates Features Importance: Features Importance
Visualize features importance: Bar Chart
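The recoding rules and the classification metrics named in the workflow annotations can be sketched in Python. This is a minimal illustration and not the workflow's Rule Engine or Scorer implementation: the function names are hypothetical, and the code assumes quality is an integer between 3 and 8, as the recoding labels suggest.

```python
from collections import Counter


def recode_3_classes(quality):
    """Map quality 3&4 to "Low", 5&6 to "Medium", 7&8 to "High"."""
    if quality <= 4:
        return "Low"
    if quality <= 6:
        return "Medium"
    return "High"


def recode_4_classes(quality):
    """Map quality 3&4 to 4 and 7&8 to 7; 5 and 6 are left unchanged."""
    return min(max(quality, 4), 7)


def accuracy(actual, predicted):
    """Fraction of rows where the predicted class equals the actual class."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement from the marginal class frequencies.
    p_expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_expected == 1.0:
        # Degenerate case: both columns contain a single identical class.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)
```

Reporting Cohen's kappa next to accuracy is useful here because merging quality scores makes the classes imbalanced, and accuracy alone can look high when one class dominates.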
