04 Advanced Machine Learning

Date and Time and Databases - Exercise

This workflow is a hands-on exercise from the L2-DS Introduction to KNIME Analytics Platform for Data Scientists - Advanced course.

Task 1: Random Forest (Regression)
1. Execute the workflow. It reads and preprocesses the data about Airbnb listings in New York City, NY in 2019.
2. Partition the data into a training set (70%) and a test set (30%). Apply random sampling.
3. Train a Random Forest (Regression) model to predict the price column.
4. Apply the model to the test set.
5. Evaluate the model's performance with the Numeric Scorer node.

Task 2: Parameter Optimization Loop
1. Start with a Parameter Optimization Loop Start node. Create an integer parameter for the number of trees with start value=50, end value=150, and increment=10.
2. Overwrite the number of models setting in the Random Forest Learner (Regression) node with this parameter.
3. Transform the numeric scoring metrics into flow variables. Use the Table Column to Variable node.
4. End with a Parameter Optimization Loop End node. Use MAPE as the objective value.

Workflow nodes: CSV Reader (reads the AB_NYC_2019 data), Interactive Data Cleaning (replaces 0 price by neighborhood average), Partitioning, Random Forest Learner (Regression), Random Forest Predictor (Regression), Numeric Scorer, Parameter Optimization Loop Start, Table Column to Variable, Parameter Optimization Loop End.
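
For readers who want to see the logic of the two tasks outside KNIME, the following is a minimal Python sketch, not the workflow's actual implementation. It covers a 70/30 random partition, the MAPE metric (returned here as a fraction), and a loop over the number of trees from 50 to 150 in steps of 10 that keeps the value with the lowest MAPE. It assumes the loop evaluates every value exhaustively. The names `partition`, `mape`, `optimize`, and `toy_objective` are hypothetical, and `toy_objective` is a made-up stand-in for training and scoring a random forest, used only so the sketch runs without the Airbnb data.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple


def partition(rows: Sequence, train_fraction: float = 0.7,
              seed: int = 0) -> Tuple[List, List]:
    """Randomly split rows into a training part and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # random sampling, reproducible via seed
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction.

    Undefined when an actual value is 0, so zero prices have to be
    handled before scoring.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def optimize(start: int, stop: int, step: int,
             objective: Callable[[int], float]) -> Tuple[int, Dict[int, float]]:
    """Evaluate the objective for start, start+step, ..., stop (inclusive)
    and return the parameter value with the smallest score plus all scores."""
    scores = {n: objective(n) for n in range(start, stop + 1, step)}
    best = min(scores, key=scores.get)  # MAPE is an error, so lower is better
    return best, scores


def toy_objective(n_trees: int) -> float:
    """Hypothetical stand-in for 'train with n_trees, predict, compute MAPE'."""
    return abs(n_trees - 120) / 100


train, test = partition(range(10))                           # 70% / 30%
best_n_trees, scores = optimize(50, 150, 10, toy_objective)  # 11 candidate values
```

In a real run, `toy_objective` would be replaced by a function that trains a regression forest with `n_trees` trees on the training part, predicts the price for the test part, and returns `mape(actual_prices, predicted_prices)`.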
