LinearRegression-LegoSet

Linear Regression to calculate Price of a Lego set

The workflow trains a Linear Regression model to predict the price of a Lego set based on its features. It includes the following steps:
- Loading the Lego dataset and cleaning it by handling missing values and removing outliers
- Removing collinearity between independent variables
- Partitioning the dataset into training and test sets
- Training a Linear Regression model and predicting the price of the Lego sets in the test data
- Calculating the accuracy metrics of the model
- Plotting a residual plot and a histogram to check the Linear Regression assumptions of homoscedasticity (constant variance) and normally distributed errors
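For readers who want to follow the partition, train, predict and score steps outside of the workflow, here is a minimal Python sketch. It is not the workflow itself: the data is synthetic, the feature names (`pieces`, `rating`), the 80/20 split and the helper names (`solve`, `fit_ols`, `predict`, `score`) are all assumptions made for illustration, and the fit uses plain ordinary least squares via the normal equations.

```python
import math
import random


def solve(a, b):
    """Solve the square system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, coef_1, ...]."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant column for the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, rows):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]


def score(actual, predicted):
    """Return R^2, MAE and RMSE together with the residuals (actual - predicted)."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    metrics = {
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }
    return metrics, residuals


# Synthetic stand-in for the cleaned Lego table (assumption, not the real data).
rng = random.Random(0)
rows = [(rng.uniform(50, 2000), rng.uniform(1, 5)) for _ in range(200)]
y = [5.0 + 0.09 * pieces + 2.0 * rating + rng.gauss(0, 3) for pieces, rating in rows]

# Random 80/20 partition into training and test rows (split ratio is an assumption).
idx = list(range(len(rows)))
rng.shuffle(idx)
cut = int(0.8 * len(idx))
train, test = idx[:cut], idx[cut:]

beta = fit_ols([rows[i] for i in train], [y[i] for i in train])
pred = predict(beta, [rows[i] for i in test])
metrics, residuals = score([y[i] for i in test], pred)
print(beta)
print(metrics)
```

The `residuals` list is what the residual plot (residual against predicted value) and the histogram would be drawn from: a roughly even band around zero suggests constant variance, and a bell-shaped histogram suggests normally distributed errors.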

Workflow annotations

Linear Regression on Lego Dataset
- Load the Lego Dataset. The task is to predict the Price of Legos.
- Clean the features
- Remove multicollinearity
- Partition the data into training and testing data
- Train the model with train data and predict the List_Price for test data
- Calculate the metrics for the model
- Plot a Residual Plot for the model

Workflow stages
- Data Preprocessing and Cleaning
- Calculating correlation between independent features and removing the features that are correlated to each other to remove multicollinearity
- Selecting independent features and partitioning the dataset into training and testing data
- Training the model with train data and predicting the target value for test data
- Visualizing the Residual Error

Nodes and their annotations
- File Reader
- Category To Number: Map categorical values to numerical
- Numeric Outliers: Remove outliers
- Missing Value: Fill missing values
- Rank Correlation: Spearman's collinearity coefficients
- Correlation Filter: Filter columns with collinearity above 0.45
- Column Filter: Filter out the columns with high collinearity
- Partitioning: Upper partition is training data, lower partition is testing data
- Linear Regression Learner
- Regression Predictor
- Numeric Scorer: Calculating the metrics for the model
- Math Formula: Residual Error
- Scatter Plot: Residual Plot
- Histogram: Normal distribution for Residual Error
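The collinearity step (Spearman's rank correlation followed by a filter at 0.45) can be illustrated with a small, dependency-free Python sketch. The column names and values below are hypothetical, and the greedy "keep a column only if it is not strongly correlated with one already kept" rule is a simplification chosen for readability; it is not claimed to match how the Correlation Filter node selects columns internally.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def filter_correlated(columns, threshold=0.45):
    """Greedy filter: keep a column only if |rho| with every kept column is <= threshold."""
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns (names and values are made up for illustration).
columns = {
    "piece_count": [100, 250, 400, 800, 1200, 2000],
    "num_minifigs": [1, 2, 3, 5, 8, 12],          # rises with piece_count
    "star_rating": [4.5, 3.9, 4.8, 4.1, 4.6, 4.0],
}

print(filter_correlated(columns))  # ['piece_count', 'star_rating']
```

Here `num_minifigs` is dropped because it increases monotonically with `piece_count` (rho of 1), while `star_rating` stays because its rank correlation with `piece_count` is close to zero.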
