04 Advanced Machine Learning

Advanced Machine Learning - Exercise

This workflow is a hands-on exercise from the L2-DS Introduction to KNIME Analytics Platform for Data Scientists - Advanced course.

Task 1: Random Forest (Regression)
1. Execute the workflow. It reads and preprocesses the data about Airbnb listings in New York City, NY in 2019.
2. Partition the data into a training set (70%) and a test set (30%). Apply random sampling.
3. Train a Random Forest (Regression) model to predict the price column.
4. Apply the model to the test set.
5. Evaluate the model's performance with the Numeric Scorer node.

Task 2: Parameter Optimization Loop
1. Start with a Parameter Optimization Loop Start node. Create a parameter for the number of trees with start value=50, end value=150, and increment=10. It is an integer.
2. Overwrite the number of models setting in the Random Forest Learner (Regression) node with this parameter.
3. Transform the numeric scoring metrics into flow variables. Use the Table Column to Variable node.
4. End with a Parameter Optimization Loop End node. Use MAPE as the objective value.

Workflow nodes: CSV Reader, Interactive Data Cleaning, Partitioning, Random Forest Learner (Regression), Random Forest Predictor (Regression), Numeric Scorer, Parameter Optimization Loop Start, Table Column to Variable, Parameter Optimization Loop End

Node labels: Read AB_NYC_2019 data; Replace 0 price by neighborhood average; Predict price; R2 and error metrics; Control nr of models; Minimize MAPE
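For readers who want to see the logic of Task 2 outside the KNIME canvas, the parameter sweep can be sketched in plain Python. This is only an illustration under stated assumptions, not how the KNIME nodes are implemented: it assumes the end value of 150 is included in the sweep, that every candidate is tried (an exhaustive search), and that MAPE is expressed as a fraction. The function names and the `toy_objective` stand-in are hypothetical; in the real workflow, the objective for each tree count would come from training the Random Forest, predicting on the test set, and scoring the result.

```python
from typing import Callable, Iterable, Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (0.25 means 25%)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    # The ratio is undefined when an actual value is 0, which is one reason
    # zero prices have to be replaced before scoring.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return total / len(actual)


def tree_counts(start: int = 50, stop: int = 150, step: int = 10) -> list:
    """Candidate values for the number of trees (assumes stop is inclusive)."""
    return list(range(start, stop + 1, step))


def minimize_over(
    candidates: Iterable[int],
    objective: Callable[[int], float],
) -> Tuple[int, float]:
    """Evaluate every candidate and return the one with the lowest objective."""
    best_value, best_score = None, float("inf")
    for value in candidates:
        score = objective(value)
        if score < best_score:
            best_value, best_score = value, score
    if best_value is None:
        raise ValueError("no candidates were supplied")
    return best_value, best_score


def toy_objective(n_trees: int) -> float:
    """Hypothetical stand-in for 'train, predict, score'; not real results."""
    return 0.30 + abs(n_trees - 110) / 1000


best_n_trees, best_mape = minimize_over(tree_counts(), toy_objective)
```

The sweep covers 11 candidate values, so the loop body (train, predict, score) would run 11 times. Replacing `toy_objective` with a function that fits a model and returns `mape(actual_prices, predicted_prices)` on the test set would mirror the loop more closely.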
