
08 Regression Model

08 Regression Model - Solution
Exercise: Linear Regression and Numeric Scoring Metrics

1) Read the adult_joined.table file by executing the Table Reader and Missing Value nodes.
2) Partition the data into a training set (75 %) and a test set (25 %). Draw randomly.
3) Train a linear regression model on the training set to predict the weekly working hours. Use all other columns but the "ID" column for the prediction.
4) Apply the model to the test set.
5) Evaluate the performance of the linear regression model with the Numeric Scorer node. Which proportion of the variance of the weekly working hours does the model explain? How many hours is the mean absolute error of the model?

The proportion of the variance explained is represented by the R^2 metric, here about 20 %. The mean absolute error metric reports the average error in hours, here about 8 hours.

NOTE: due to random partitioning, these values might change slightly at every execution.

Workflow nodes and annotations:

Table Reader: Read data adult_joined.table
Missing Value
Partitioning: Random sampling. Top: train set (75%). Bottom: test set (25%)
Linear Regression Learner: Train the model to predict hours per week
Regression Predictor: Apply the model to the test set
Numeric Scorer: Evaluate model performance
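
For readers who want to see what the Partitioning, Linear Regression Learner, Regression Predictor and Numeric Scorer steps compute, here is a minimal Python sketch. It is not the KNIME implementation: it uses synthetic data rather than adult_joined.table, a single hypothetical predictor (age) instead of all columns, and the usual textbook definitions of R^2 and mean absolute error, which are assumed to match what the Numeric Scorer reports. All function and variable names are illustrative.

```python
import random


def train_test_split(rows, train_fraction=0.75, seed=42):
    """Randomly partition rows into a training list and a test list."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(actual, predicted):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the units of the target (hours)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic stand-in data (an assumption, not the adult data set):
# weekly hours depend weakly on age, plus a lot of noise.
rng = random.Random(0)
rows = []
for _ in range(1000):
    age = rng.uniform(20, 65)
    hours = 30 + 0.2 * age + rng.gauss(0, 10)
    rows.append((age, hours))

# Step 2: random 75 % / 25 % partitioning.
train, test = train_test_split(rows)

# Step 3: train on the training set only.
intercept, slope = fit_simple_linear_regression(
    [r[0] for r in train], [r[1] for r in train]
)

# Step 4: apply the model to the test set.
actual = [r[1] for r in test]
predicted = [intercept + slope * r[0] for r in test]

# Step 5: score the predictions.
r2 = r_squared(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"R^2: {r2:.3f}")
print(f"Mean absolute error: {mae:.2f} hours")
```

Changing the seed passed to train_test_split changes which rows land in each partition, so the two scores shift a little from run to run, which is the effect the NOTE above describes.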
