Ames housing linear regression

Linear Regression - solution

Introduction to Machine Learning Algorithms course - Session 2
Solution to exercise 1
- Partition data into training and test set
- Train a linear regression model
- Apply the trained model to the test set
- Handle missing values
- Evaluate the model performance with the Numeric Scorer node
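
For readers who want to follow the exercise steps above outside KNIME, here is a minimal Python sketch using only the standard library. It is not the workflow itself. The data is synthetic, the single feature (living area) and the function names are hypothetical, and the model is a one-feature least-squares fit, whereas the workflow may use several features.

```python
import math
import random


def make_synthetic_houses(n, seed=0):
    """Hypothetical stand-in for the housing data: (living_area, sale_price) rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        area = rng.uniform(50.0, 300.0)
        price = 20000.0 + 1000.0 * area + rng.gauss(0.0, 10000.0)
        rows.append((area, price))
    return rows


def partition(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit_line(rows):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, area):
    """Return None when the feature is missing, mirroring a missing prediction."""
    if area is None:
        return None
    slope, intercept = model
    return slope * area + intercept


def score(pairs):
    """R^2, mean absolute error and RMSE over (actual, predicted) pairs."""
    actual = [a for a, _ in pairs]
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in pairs)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mae = sum(abs(a - p) for a, p in pairs) / len(pairs)
    return {"r2": 1.0 - ss_res / ss_tot, "mae": mae, "rmse": math.sqrt(ss_res / len(pairs))}


rows = make_synthetic_houses(200)
train, test = partition(rows)            # 70% training, 30% test
model = fit_line(train)                  # train on the training partition only
test = test + [(None, 150000.0)]         # simulate a test row with a missing feature
pairs = [(price, predict(model, area)) for area, price in test]
pairs = [(a, p) for a, p in pairs if p is not None]  # drop missing predictions
metrics = score(pairs)
print(metrics)
```

Scoring only on the held-out 30% keeps the reported error an estimate of performance on unseen houses rather than a measure of how well the line fits the training data.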


Housing Price Prediction with a Linear Regression Model

In this example we will predict the price of a house in Ames (Iowa, USA) given a number of features such as size and quality.

1. Load the data with the CSV Reader node
2. Plot the target vs. other variables with the Scatter Plot node
3. Partition the data into training (70%) and testing (30%) partitions
4. Train a linear regression model with the Linear Regression Learner node
5. Generate predictions with the Regression Predictor node
6. Remove missing predictions with the Missing Value node
7. Assess the model performance with the Numeric Scorer node

Model 1: Using the original data without any treatment
Model 2: Outliers are removed

Nodes used in the workflow: CSV Reader, Data Explorer, Scatter Plot, Box Plot, Numeric Outliers, Partitioning, Linear Regression Learner, Regression Predictor, Missing Value, Numeric Scorer
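
The outlier removal that distinguishes Model 2 from Model 1 can be illustrated with a short Python sketch. It assumes the common interquartile-range rule with a multiplier of 1.5. The workflow description does not state which settings the Numeric Outliers node uses, and the column name `SalePrice` and the function names below are hypothetical.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def iqr_bounds(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR (k is an assumed setting)."""
    ordered = sorted(values)
    q1 = quantile(ordered, 0.25)
    q3 = quantile(ordered, 0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies within the fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


# Toy rows: nine ordinary values and one extreme value.
rows = [{"SalePrice": float(v)} for v in [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]]
cleaned = remove_outliers(rows, "SalePrice")
print(len(rows), "->", len(cleaned))
```

A linear regression is sensitive to extreme values because squared errors weight them heavily, which is why comparing the scores of Model 1 and Model 2 is informative.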
