
P2.1.8b- Polynomial Regression - Normalisation

Regression Models
Practical 2.1.8b - Polynomial Regression

Learning objective: In this exercise you'll learn how to predict the price of a house in Ames (Iowa, USA) from a number of features, such as size, neighborhood, and heating.


Workflow description: This workflow uses a dataset that describes the sale of individual residential properties in Ames, Iowa from 2006 to 2010. One of the columns is the overall condition ranking, with values between 1 and 10.


You'll find the instructions to the exercises in the yellow annotations.

Step 1. Exploratory analysis: Use the 'Statistics' and 'Statistics View' nodes to explore the data. What are your main observations? Consider what to do with missing values, and use the 'Missing Value' and 'Missing Value (Apply)' nodes to handle them.
Step 2. Partitioning: Delete records with missing values first. Then add a Partitioning node to the CSV Reader output port. The top port should receive 70% of the rows, drawn randomly.
Step 3. Polynomial Regression Learner: Add a Polynomial Regression Learner node to the top output port of the Partitioning node. Select the price column as the column to be learned, then execute the node.
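If it helps to see what Steps 1 to 3 do to the data, here is a minimal Python sketch of the same idea. It is not what the KNIME nodes run internally. It assumes min-max normalisation, a single feature, and a degree-2 polynomial; the rows are synthetic (not the Ames dataset), and every function name is made up for illustration. It uses only the standard library so it runs anywhere.

```python
import random

def partition(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into a training list and a test list."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def normalise(values, lo, hi):
    """Min-max normalisation using the given (training) bounds."""
    return [(v - lo) / (hi - lo) for v in values]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_polynomial(xs, ys, degree=2):
    """Least-squares coefficients [c0, c1, ..., c_degree] via the normal equations."""
    powers = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    ata = [[sum(row[i] * row[j] for row in powers) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(powers, ys)) for i in range(k)]
    return solve(ata, aty)

def predict(coeffs, xs):
    """Evaluate the fitted polynomial at each x."""
    return [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]

# Synthetic (area, price) rows, NOT the Ames data; None marks a missing value.
rows = [(a, 50000 + 100 * a + 0.01 * a * a) for a in range(800, 2800, 100)]
rows += [(1500, None), (None, 200000.0)]

# Delete records with missing values, then draw 70% of the rows at random.
complete = [r for r in rows if None not in r]
train, test = partition(complete)

# Learn the normalisation bounds on the training rows only, then reuse them
# for the test rows (the role of an "(Apply)" node in the workflow).
x_lo, x_hi = min(a for a, _ in train), max(a for a, _ in train)
y_lo, y_hi = min(p for _, p in train), max(p for _, p in train)
x_train = normalise([a for a, _ in train], x_lo, x_hi)
y_train = normalise([p for _, p in train], y_lo, y_hi)
x_test = normalise([a for a, _ in test], x_lo, x_hi)

coeffs = fit_polynomial(x_train, y_train, degree=2)
y_test_pred = predict(coeffs, x_test)  # still on the normalised scale
```

Computing the bounds on the training rows and reusing them for the test rows keeps information about the test set out of the model, which is why the sketch does not normalise the two sets independently.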
Data Preparation

Step 4. Regression Predictor: Add a Regression Predictor node. Predict the test set (the remaining 30% of the rows) by connecting the remaining unconnected output ports.
Step 5. Model evaluation: Apply denormalisation to return the normalised values to their original scale.
Step 6. Model evaluation: Add a Numeric Scorer node to the Regression Predictor output port. Reference Column: the column you learned. Predicted Column: the new column created by the predictor node.
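To make Steps 5 and 6 concrete, here is a small Python sketch of denormalising predictions and scoring them against the reference column. It assumes min-max normalisation was used; the bounds, prices, and function names are hypothetical, and the statistics shown (R², mean absolute error, root mean squared error) are common choices for a numeric scorer rather than a description of the node's exact output.

```python
import math

def denormalise(values, lo, hi):
    """Invert min-max normalisation: map values back to the original scale."""
    return [v * (hi - lo) + lo for v in values]

def numeric_scores(actual, predicted):
    """Compare a reference column with a predicted column."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }

# Hypothetical values: bounds learned on the training prices, reference prices
# from the test set, and predictions still on the normalised scale.
price_lo, price_hi = 100000.0, 300000.0
actual_prices = [120000.0, 180000.0, 250000.0, 300000.0]
predicted_norm = [0.15, 0.35, 0.70, 1.05]

predicted_prices = denormalise(predicted_norm, price_lo, price_hi)
scores = numeric_scores(actual_prices, predicted_prices)
print(scores)
```

Scoring after denormalisation means the error statistics are expressed in the original price units, which is easier to interpret than an error on the 0 to 1 scale.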
Polynomial Regression Learner
Joiner
Regression Predictor
CSV Reader
Column Filter
Statistics
Denormalizer
Missing Value
Denormalizer
Numeric Scorer
Normalizer
Statistics View
Normalizer (Apply)
70% training 30% testing
Table Partitioner
Column Renamer
Column Renamer
Missing Value (Apply)
Column Filter
Linear Correlation
