03_Random_Forest_solution

Random Forest - solution

Introduction to Machine Learning Algorithms course - Session 2
Solution to exercise 3
- Train a random forest model
- Apply the model to the test set
- Evaluate the model performance with the Scorer node
- Perform parameter optimization



Exercise: Random Forest

1) Train a Random Forest model to predict the overall condition of a house (high/low) (Random Forest Learner node)
- Select the "rank" column as the target column
- Leave other settings to their defaults
2) Use the trained model to predict the rank of the houses in the test set (Random Forest Predictor node)
3) Evaluate the accuracy of the random forest model (Scorer node)
- Select "rank" as the actual column and "Prediction (rank)" as the predicted column
- What is the accuracy of the model?

Optional: Build a parameter optimization loop to find the best settings for the parameters tree depth and number of models.
1) Use a Parameter Optimization Loop Start node to define the possible values for the tree depth and the number of models
- Connect the variable port to the Random Forest Learner node
- Use the created flow variables to overwrite the corresponding settings in the Random Forest Learner node
2) Use the Parameter Optimization Loop End node to define the accuracy as the objective function
- Which settings lead to the model with the highest accuracy?
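To make step 3 concrete: accuracy is the share of rows where the actual column equals the predicted column. The sketch below is plain Python, not KNIME, and only illustrates that calculation. The function name `score` and the five example rows are made up for illustration; they are not taken from the Ames housing data or from the workflow's results.

```python
from collections import Counter


def score(actual, predicted):
    """Return (confusion counts, accuracy) for two equally long label lists.

    Hypothetical helper for illustration; it is not the KNIME Scorer node.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one row is required")

    # Count each (actual, predicted) pair, i.e. the cells of a confusion matrix.
    confusion = Counter(zip(actual, predicted))

    # Rows on the diagonal (actual == predicted) are the correct predictions.
    correct = sum(n for (a, p), n in confusion.items() if a == p)
    return confusion, correct / len(actual)


# Made-up example rows standing in for "rank" and "Prediction (rank)".
actual = ["high", "low", "high", "low", "low"]
predicted = ["high", "low", "low", "low", "high"]

confusion, accuracy = score(actual, predicted)
print(f"accuracy = {accuracy:.2f}")
for (a, p), n in sorted(confusion.items()):
    print(f"actual={a:<4} predicted={p:<4} rows={n}")
```

In this made-up example, three of the five rows match, so the accuracy is 0.60.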
Workflow: Classification: Random Forest
- CSV Reader (Read AmesHousing.csv)
- Preprocessing
- Random Forest Learner
- Random Forest Predictor
- Scorer
Optional: Parameter optimization loop
- Parameter Optimization Loop Start
- Parameter Optimization Loop End
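The optional parameter optimization loop can be pictured as an exhaustive search: try every combination of tree depth and number of models, compute the objective for each, and keep the best one. The sketch below shows that idea in plain Python under stated assumptions. The names `brute_force_search`, `toy_objective`, `tree_depth`, and `n_models` are hypothetical, the value ranges are arbitrary, and the objective is a placeholder formula rather than a trained model, so its numbers say nothing about the real workflow.

```python
from itertools import product


def brute_force_search(objective, param_grid):
    """Evaluate objective(**params) for every combination in param_grid.

    Returns (best_params, best_score), maximizing the objective.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = objective(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_objective(tree_depth, n_models):
    """Placeholder objective, NOT a real accuracy measurement.

    In an actual run, this is where a model would be trained with the
    given settings and its accuracy on held-out data returned.
    """
    return 0.9 - 0.01 * abs(tree_depth - 8) - 0.0001 * abs(n_models - 150)


# Arbitrary example ranges for the two parameters (assumptions).
grid = {
    "tree_depth": range(2, 13, 2),   # 2, 4, ..., 12
    "n_models": range(50, 301, 50),  # 50, 100, ..., 300
}

best_params, best_score = brute_force_search(toy_objective, grid)
print(best_params, round(best_score, 4))
```

Replacing `toy_objective` with a function that trains and scores a model would turn this into a real search. The cost grows with the product of the two ranges, which is worth keeping in mind when choosing the start, stop, and step values.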
