04 Analyze Data by Training a Linear Regression

<p><strong>Analyze Data: Training a Linear Regression</strong></p><p>This workflow is an example of how to <strong>train and evaluate a basic machine learning model</strong> for a house price prediction task.</p><p>In this case, we train and apply a <strong>Linear Regression</strong> algorithm. However, the <em>Learner-Predictor</em> construct is common to all supervised algorithms.</p>

URL: KNIME Learning Center https://www.knime.com/learning
URL: KNIME Cheat Sheet: Building a KNIME workflow for beginners https://www.knime.com/cheat-sheets/building-knime-workflow-beginners
URL: KNIME Cheat Sheet: Machine learning with KNIME Analytics Platform https://www.knime.com/files/machine-learning-with-knime.pdf
URL: YouTube: Logistic Regression Learner and Predictor https://youtu.be/cS-4trTu_EA?si=aqZaPi7FqO_BlZVt
URL: YouTube: Behind the Scenes of Logistic Regression https://youtu.be/ozIDnbdFRS8?si=YJKRofGfFaowTzM7
URL: Webinar: KNIME101: Machine Learning for Beginners with KNIME https://www.knime.com/events/knime101-machine-learning-beginners-knime

Pre-processing (data preparation)

Partition the data into training set (80%) and test set (20%).
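For readers who want to see what such a split does outside of KNIME, here is a minimal Python sketch. It assumes the rows are drawn at random (the sampling mode configured in the workflow is not stated here), and the function name `partition`, the seed, and the toy rows are all hypothetical.

```python
import random


def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy rows: (living area in square feet, sale price)
houses = [(850 + 50 * i, 100_000 + 5_000 * i) for i in range(10)]
train, test = partition(houses)
print(len(train), len(test))
```

The test rows are held back and never shown to the learner, so the evaluation reflects how the model behaves on unseen houses.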

Model evaluation

Apply the trained Linear Regression to the test set with the "Regression Predictor" node. Evaluate the predictions using the "Numeric Scorer" node.

Model training


Train the Linear Regression with the "Linear Regression Learner" node.

Read data

The data contains various attributes of different houses along with their sale prices.

How to train a Linear Regression model?

Step 1: Drag the "Linear Regression Learner" node into the workflow and click on it to open the configuration window.

Step 2: Select "SalePrice" as the "Target" column.

Step 3: Execute the node. Inspect the view of the regression coefficients (click the magnifier icon in the node action bar).
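To make the idea of regression coefficients concrete, the following Python sketch fits an ordinary least squares line with a single predictor. It is an illustration only, not the node's implementation: the learner in the workflow can use many input columns, while this sketch uses one, and the function names and toy numbers are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, xs):
    """Apply the fitted line to new x values."""
    return [intercept + slope * x for x in xs]


# Hypothetical training data: living area (sq ft) and sale price
living_area = [1000, 1500, 2000, 2500]
sale_price = [150_000, 200_000, 250_000, 300_000]

intercept, slope = fit_simple_linear_regression(living_area, sale_price)
print(intercept, slope)
print(predict(intercept, slope, [1800]))
```

The slope is the coefficient for the predictor: the estimated change in price for each additional unit of living area. The coefficient view in the node shows one such value per input column, plus the intercept.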

How to evaluate a Linear Regression model?

Step 1: Drag the "Regression Predictor" node into the workflow.

Step 2: Connect the model output of the "Linear Regression Learner" node to Port 0 of the "Regression Predictor" node, and the test set to Port 1. Execute the node.

Step 3: Connect the output containing the predictions to the "Missing Value" node to remove rows with missing predictions. Then connect its output to the "Numeric Scorer" node to evaluate the model on several error measures.
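The evaluation steps can be sketched in Python as follows. This is a hedged illustration, not the nodes' implementation: it assumes a missing prediction is represented as `None`, and it computes three common regression measures (mean absolute error, root mean squared error, and R²). The function names and the toy values are hypothetical.

```python
import math


def drop_missing(actual, predicted):
    """Keep only the rows that have a prediction."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    return [a for a, _ in pairs], [p for _, p in pairs]


def score(actual, predicted):
    """Return MAE, RMSE, and R^2 for paired actual and predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("need non-empty lists of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": math.sqrt(ss_res / n), "r2": r2}


# Hypothetical test set: one row has no prediction
actual_prices = [200_000, 250_000, 300_000, 180_000]
predicted_prices = [210_000, 240_000, None, 180_000]

kept_actual, kept_predicted = drop_missing(actual_prices, predicted_prices)
metrics = score(kept_actual, kept_predicted)
print(metrics)
```

Rows without a prediction are removed first because the error measures are undefined for them. An R² close to 1 and small error values indicate that the predictions track the actual prices closely.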

Workflow complete!

Keep the momentum going by exploring Just KNIME It! on the Hub to challenge yourself and see how these nodes can be integrated into more complex workflows and use cases.

Target: SalePrice
Linear Regression Learner
Port 0: Train Set (70%) Port 1: Test Set (30%)
Table Partitioner
Apply trained Linear Regression
Regression Predictor
Read AmesHousing_simple.csv
CSV Reader
Evaluate model quality and performance
Numeric Scorer
Remove data without prediction
Missing Value