02_Regression_Tree_exercise

Regression Tree - exercise

Introduction to Machine Learning Algorithms course - Session 2
Exercise 2
- Partition the data into training and test sets
- Train a Regression Tree model
- Apply the trained model to the test set
- Handle missing values
- Evaluate the model performance with the Numeric Scorer

URL: Slides (Introduction to ML Algorithms course) https://www.knime.com/form/material-download-registration

Session 2 - Regression Models, Ensemble Models, & Logistic Regression

Exercise 02 Regression Tree

Learning objective: In this exercise you'll learn how to predict the price of a house in Ames (Iowa, USA) given a number of features such as size, neighborhood, and heating.


Workflow description: This workflow uses a dataset that describes the sale of individual residential properties in Ames, Iowa from 2006 to 2010. One of the columns is the overall condition ranking, with values between 1 and 10.


You'll find the instructions to the exercises in the yellow annotations.

Step 1. Partitioning

Add a Partitioning node to the CSV Reader output port:

  • Top port should have 70 % of the rows

  • Draw the rows randomly
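To see what the random 70/30 split does outside of KNIME, here is a minimal Python sketch. It is not the Partitioning node's implementation; the function name `partition`, the fixed seed, and the toy rows are assumptions made for illustration.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into (train, test) without replacement.

    Hypothetical helper: train_fraction plays the role of the
    "top port" percentage, and the seed only makes the draw repeatable.
    """
    shuffled = list(rows)                    # copy, so the input stays untouched
    random.Random(seed).shuffle(shuffled)    # random draw of the rows
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))                      # stand-in for 100 table rows
train, test = partition(rows)
print(len(train), len(test))
```

Every row ends up in exactly one of the two partitions, so the test set contains only rows the model has never seen during training.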


Step 2. Simple Regression Tree Learner

Add a Simple Regression Tree Learner node to the top output port of the Partitioning node:

  • Select the price column as the target column to be learned

  • Execute the node and open its decision tree view. Which column is used for the first split, at the root of the tree?
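The column at the root of a regression tree is the one whose best split reduces the prediction error the most. The sketch below illustrates that idea with a plain sum-of-squared-errors criterion. It is a simplified illustration, not the learner node's actual code, and the column names and values in `toy` are made up rather than taken from the housing dataset.

```python
def sse(values):
    """Sum of squared errors of the values around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target, features):
    """Return (feature, threshold, error) for the split with the lowest total SSE."""
    best = None
    for feature in features:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            error = sse(left) + sse(right)
            if best is None or error < best[2]:
                best = (feature, threshold, error)
    return best


# Made-up toy rows (hypothetical column names, not the real dataset)
toy = [
    {"quality": 3, "area": 900, "price": 100},
    {"quality": 4, "area": 2000, "price": 110},
    {"quality": 8, "area": 1000, "price": 300},
    {"quality": 9, "area": 2100, "price": 320},
]
root = best_split(toy, "price", ["quality", "area"])
print(root)
```

In this toy data the prices separate cleanly by `quality` but not by `area`, so the root split is placed on `quality`. The tree view in KNIME lets you check which column plays that role for the real data.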


Step 3. Simple Regression Tree Predictor

Add a Simple Regression Tree Predictor node:

  • Predict the test set (the remaining 30 % of the rows) by connecting the remaining unconnected output ports to the predictor node
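Applying a trained tree to a test row means following the splits from the root down to a leaf and returning the value stored there. Below is a minimal sketch under stated assumptions: the nested-dictionary tree layout and its numbers are hypothetical, and returning `None` when a needed feature is missing is a simplification chosen here, not a description of how the predictor node treats missing values.

```python
def predict(node, row):
    """Walk from the root to a leaf and return the leaf's value.

    Returns None when the row lacks a feature needed for a split
    (an assumption of this sketch).
    """
    while "value" not in node:               # inner nodes carry a split
        x = row.get(node["feature"])
        if x is None:
            return None                      # no prediction possible
        node = node["left"] if x <= node["threshold"] else node["right"]
    return node["value"]                     # leaf: mean target of its training rows


# Hypothetical one-split tree
tree = {
    "feature": "quality",
    "threshold": 6.0,
    "left": {"value": 105.0},
    "right": {"value": 310.0},
}

predictions = [predict(tree, row) for row in ({"quality": 7}, {"quality": 5}, {})]
print(predictions)
```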


Data Preparation

Step 4. Check Missing Values

Remove rows with a missing prediction using the Missing Value node
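Rows without a prediction cannot be scored, so they are filtered out before evaluation. The same idea in a short Python sketch, where the helper names and the `"Prediction (price)"` column name are assumptions for illustration:

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_missing(rows, column):
    """Keep only the rows that have a usable value in the given column."""
    return [row for row in rows if not is_missing(row.get(column))]


# Made-up predictor output; the column name is hypothetical
scored = [
    {"price": 100, "Prediction (price)": 105.0},
    {"price": 300, "Prediction (price)": None},
    {"price": 320, "Prediction (price)": float("nan")},
]
clean = drop_missing(scored, "Prediction (price)")
print(len(clean))
```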


Step 5. Model evaluation

Add a Numeric Scorer node to the Regression Predictor output port:

  • Reference Column: the column you learned

  • Predicted Column: the new column created by the predictor node
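The scorer compares the reference column with the predicted column and summarizes the errors in a few statistics. The following sketch computes common ones (R², mean absolute error, mean squared error, root mean squared error, and mean signed difference) by hand. The function name, the toy numbers, and the sign convention for the signed difference (predicted minus actual) are assumptions of this example.

```python
import math


def numeric_scores(actual, predicted):
    """Compute common regression error statistics for two equal-length lists."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]   # predicted minus actual
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)                  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    mse = ss_res / n
    return {
        "r2": 1 - ss_res / ss_tot,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "mean_signed_difference": sum(errors) / n,
    }


# Made-up values for illustration
scores = numeric_scores([100, 200, 300], [110, 190, 300])
print(scores)
```

An R² close to 1 means the predictions explain most of the variation in the reference column, while the error statistics are expressed in the units of the price (or its square, for the mean squared error).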


