Icon

06_​Random_​Forest - Solution

06_Random_Forest - Solution
Exercise Random Forest1) Read letter-recognition.csv. This dataset was downloaded from UC Irvine Machine Learning Repository.Here, we have an image recognition problem. Each image contains an alphabet letter that is described by variousmeasures. Col0 contains the target class (the letter). All other input features are measures of the image.2) Train a Random Forest model to predict the alphabet letter in column Col0- Partition the dataset into a training set (80%) and a test set (20%). Perform stratified sampling on the target column.- Train a Random Forest model on the training set to predict values in the target column. Train 5 trees with minimum nodesize 2.- Apply the trained model to the test set- Evaluate the accuracy of the model by scoring metrics for a classification model3) OPTIONAL: train a Random Forest with 100 trees, and compare the performances of the two models The Concatenate node compares the twomodels performances in one single table.As we can see, more trees lead to betteroverall performance. Read dataletter-recognition.csvTrain the modelto predict the letter(5 trees)Apply the modelto the test setTop: train set (80%)Bottom: test set (20%)Stratified samplingon target columnTrain the modelto predict the letter(100 trees)Evaluate modelApply the modelto the test setEvaluate modelCompare modelperformances File Reader Random ForestLearner Random ForestPredictor Partitioning Random ForestLearner Scorer (JavaScript) Random ForestPredictor Scorer (JavaScript) Concatenate Exercise Random Forest1) Read letter-recognition.csv. This dataset was downloaded from UC Irvine Machine Learning Repository.Here, we have an image recognition problem. Each image contains an alphabet letter that is described by variousmeasures. Col0 contains the target class (the letter). All other input features are measures of the image.2) Train a Random Forest model to predict the alphabet letter in column Col0- Partition the dataset into a training set (80%) and a test set (20%). Perform stratified sampling on the target column.- Train a Random Forest model on the training set to predict values in the target column. Train 5 trees with minimum nodesize 2.- Apply the trained model to the test set- Evaluate the accuracy of the model by scoring metrics for a classification model3) OPTIONAL: train a Random Forest with 100 trees, and compare the performances of the two models The Concatenate node compares the twomodels performances in one single table.As we can see, more trees lead to betteroverall performance. Read dataletter-recognition.csvTrain the modelto predict the letter(5 trees)Apply the modelto the test setTop: train set (80%)Bottom: test set (20%)Stratified samplingon target columnTrain the modelto predict the letter(100 trees)Evaluate modelApply the modelto the test setEvaluate modelCompare modelperformancesFile Reader Random ForestLearner Random ForestPredictor Partitioning Random ForestLearner Scorer (JavaScript) Random ForestPredictor Scorer (JavaScript) Concatenate

Nodes

Extensions

Links