
4. Clustering and Regression


URL: KNIME Beginner's Luck (Book Homepage) https://www.knime.com/knimepress/beginners-luck

Workflow: Clustering and Regression


This workflow reads the two datasets (training and test set) created in the workflow "1. Data Preparation" and shows a few more machine learning algorithms:

  1. A linear regression to predict the number of hours/week given all other attribute values (Linear Regression Learner and Regression Predictor nodes).

  2. A k-Means clustering to detect patterns in the dataset by grouping together the most similar data rows.
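As a rough illustration of what the regression step in item 1 computes, here is a minimal sketch of ordinary least squares in plain Python. It is not the KNIME implementation: the function names `fit_linear_regression` and `predict`, the two numeric attributes, and all data values are made up for this example, and only numeric attributes are handled.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    # Prepend a constant 1.0 to each row so that weights[0] is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * t for x, t in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("attributes are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(weights, row):
    """Apply the fitted weights to one row of attribute values."""
    return weights[0] + sum(w * float(v) for w, v in zip(weights[1:], row))


# Hypothetical training rows: (age, years of education) -> hours/week.
# The targets were generated as 10 + 0.5 * age + 1.0 * education.
training_rows = [(20, 10), (30, 12), (40, 16), (50, 9), (25, 14)]
training_hours = [30.0, 37.0, 46.0, 44.0, 36.5]

weights = fit_linear_regression(training_rows, training_hours)
predicted_hours = predict(weights, (35, 13))
```

Fitting and predicting are kept as separate functions to mirror the split between a learner node and a predictor node: the weights are estimated once on the training set and then applied to unseen rows.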

Remember that k-Means, like neural networks, needs normalized data. Otherwise, attributes with large numeric ranges dominate the distance computation.
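A minimal sketch of min-max normalization to [0,1], assuming a single numeric column. The function names and values are hypothetical and only illustrate the formula (x - min) / (max - min). The minimum and maximum are taken from the training data and reused for new data.

```python
def fit_min_max(values):
    """Return the (min, max) pair that defines the normalization."""
    return min(values), max(values)


def normalize(values, low, high):
    """Map values linearly so that low -> 0.0 and high -> 1.0."""
    if high == low:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical hours/week values.
training_values = [20, 40, 60]
low, high = fit_min_max(training_values)
normalized_training = normalize(training_values, low, high)

# New data is scaled with the training min and max, so values outside
# the training range fall outside [0, 1].
normalized_new = normalize([80], low, high)
```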

The Statistics node produces general statistical measures for each data column individually, including a rough histogram.
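To give a feel for this kind of per-column summary, here is a small sketch using Python's standard `statistics` module. The choice of measures, the bin count, and the sample values are assumptions for illustration, not a description of the node's exact output.

```python
import statistics


def describe(values):
    """A handful of common summary measures for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),
        "median": statistics.median(values),
    }


def rough_histogram(values, bins=4):
    """Count how many values fall into each of `bins` equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # avoid a zero width for constant columns
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical hours/week values.
hours = [20, 40, 40, 45, 60]
summary = describe(hours)
histogram = rough_histogram(hours)
```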

Reading data

Training and test set created in workflow "1. Data Preparation".

Training machine learning models

Applying trained models

Preprocessing

The k-Means algorithm needs normalized data

Calculating statistics

adult_test_set.csv
CSV Reader
Run predictions from model
Regression Predictor
adult_training_set.csv
CSV Reader
General stats
Statistics
3 clusters with Euclidean distance
k-Means (deprecated)
Predict hours/week on all remaining attributes
Linear Regression Learner
assign new data to clusters
Cluster Assigner
Min-Max-Normalization in [0,1]
Normalizer
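
The clustering and cluster assignment steps listed above can be sketched in plain Python as follows. This is a simplified version of the standard k-Means iteration (Lloyd's algorithm) with 3 clusters and Euclidean distance, not the KNIME node's code. The function names, the choice of the first k rows as initial centers, and the already normalized sample points are assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_cluster(point, centers):
    """Index of the nearest center; this is how new rows get a cluster."""
    return min(range(len(centers)), key=lambda i: euclidean(point, centers[i]))


def k_means(points, k=3, max_iterations=100):
    # Assumption: the first k rows serve as initial centers (deterministic).
    centers = [list(p) for p in points[:k]]
    for _ in range(max_iterations):
        # Assignment step: group every point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign_cluster(p, centers)].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # no center moved, so the clustering has converged
        centers = new_centers
    return centers


# Hypothetical rows, already normalized to [0, 1].
training_points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0),
                   (0.1, 0.0), (0.9, 1.0), (0.1, 1.0)]
centers = k_means(training_points, k=3)

# Assign unseen rows to the clusters learned on the training data.
new_assignments = [assign_cluster(p, centers)
                   for p in [(0.2, 0.1), (0.8, 0.9), (0.0, 0.8)]]
```

Because the centers are learned once and then reused by `assign_cluster`, new data must be normalized with the same parameters as the training data before assignment.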
