
08 Regression Model - Solution

Solution to an exercise for training a model for numeric prediction.

Train and apply a linear regression model. Evaluate the performance with numeric scoring metrics.

CHECK YOUR ANSWERS:
a. The model explains about 20% of the variance of the weekly working hours (R-squared).
b. The mean absolute error of the model is about 8 hours.



Exercise: Linear Regression and Numeric Scoring Metrics

1) Read the adult_joined.table file by executing the Table Reader and Missing Value nodes.
2) Partition the data into a training set (75 %) and a test set (25 %). Draw randomly.
3) Train a linear regression model on the training set to predict the weekly working hours. Use all other columns but the "ID" column for the prediction.
4) Apply the model to the test set.
5) Evaluate the performance of the linear regression model with the Numeric Scorer node. Which proportion of the variance of the weekly working hours does the model explain? How many hours is the mean absolute error of the model?

The proportion of the variance explained is represented by the R^2 metric, here about 20 %. The mean absolute error metric reports the average error in hours, here about 8 hours.

Workflow annotations:
Table Reader: Read data adult_joined.table
Missing Value
Partitioning: Random sampling. Top: train set (75%). Bottom: test set (25%).
Linear Regression Learner: Train the model to predict hours per week
Regression Predictor: Apply the model to the test set
Numeric Scorer: Evaluate model performance
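The steps of the exercise can also be sketched outside KNIME. The following is a minimal Python illustration, not the workflow's implementation: it uses made-up data with a single predictor (a hypothetical age column) instead of adult_joined.table, and the helper names `partition`, `fit_line`, `r_squared`, and `mean_absolute_error` are assumptions introduced here. The two metrics follow their standard textbook definitions; the source does not say exactly how the Numeric Scorer node computes them.

```python
import random


def partition(rows, train_fraction=0.75, seed=0):
    """Shuffle the rows and split them into (train, test), drawing randomly."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_line(rows):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the unit of the target (here hours)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up data for illustration only: (age, weekly working hours).
rng = random.Random(42)
rows = []
for _ in range(400):
    age = rng.randint(18, 65)
    hours = 30 + 0.2 * age + rng.gauss(0, 10)
    rows.append((age, hours))

train, test = partition(rows)                          # step 2: 75 % / 25 %
intercept, slope = fit_line(train)                     # step 3: train
actual = [y for _, y in test]
predicted = [intercept + slope * x for x, _ in test]   # step 4: apply

# Step 5: evaluate on the test set.
print(f"R^2: {r_squared(actual, predicted):.2f}")
print(f"MAE: {mean_absolute_error(actual, predicted):.2f} hours")
```

Because the data here is synthetic, the printed values will not match the 20 % and 8 hours reported for the actual workflow.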
