Building a Simple Classifier

Predict the income group from the demographic attributes in the adult data set (census data).

URL: KNIME's Learning page https://www.knime.com/learning

Simple Model Training for Classification

This workflow demonstrates how a simple classifier is built and applied to new data.

Find more information on KNIME’s Learning page at http://www.knime.com/learning (courses, tutorials, cheatsheets, books, and more)

Task

Predict the income group from the demographic attributes of the adult data set (census data).

Data Partitioning

Create two separate partitions from the original data set: a training set (80%) and a test set (20%). The test set deliberately consists of unseen data that is not used for training.
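
Outside KNIME, the same random 80/20 split can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the node's implementation; the function name `partition`, the fixed seed, and the stand-in rows are assumptions made for the example.

```python
import random

def partition(rows, train_fraction=0.8, seed=42):
    """Randomly split rows into (train, test) with no overlap."""
    shuffled = list(rows)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible draw
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

rows = list(range(100))                     # stand-in for 100 table rows
train, test = partition(rows)
print(len(train), len(test))
```

Every row ends up in exactly one of the two partitions, so the test rows stay unseen during training.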

Train a Model

The Decision Tree Learner node builds a decision tree. Other Learner nodes train other models. Most Learner nodes output a PMML model (blue square output port).

Data Reading

Read the adult data set file. Each row describes one person, with demographic information and the income group. The file is located in TheData/Basics/.
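
As a rough equivalent of reading such a file in code, the sketch below parses CSV text with Python's standard `csv` module. The two data rows are made up for illustration, and the column names (`age`, `hours-per-week`, `income`) are assumptions rather than a confirmed description of the file.

```python
import csv
import io

# Made-up stand-in for the file contents; the column names are assumed.
sample = """age,hours-per-week,income
39,40,<=50K
52,45,>50K
"""

# One dict per person, keyed by column name.
rows = list(csv.DictReader(io.StringIO(sample)))
ages = [int(r["age"]) for r in rows]
incomes = [r["income"] for r in rows]
print(len(rows), ages, incomes)
```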

Apply the Model

Predictor nodes apply a specific model to a data set and append the model predictions.
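
The split between a Learner (which produces a model) and a Predictor (which applies it) can be sketched with a one-level decision tree, a so-called decision stump. This is a deliberately simplified stand-in and not the algorithm used by the Decision Tree Learner node; the function names, the Gini-based split criterion, and the toy data are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def majority(labels):
    """Most frequent label in the list."""
    return max(set(labels), key=labels.count)

def learn_stump(values, labels):
    """Learner: pick the threshold on one numeric feature that
    minimizes the weighted Gini impurity of the two branches."""
    best = None
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue  # a split must put rows on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t, majority(left), majority(right))
    if best is None:  # all feature values identical: predict the majority class
        m = majority(labels)
        return {"threshold": values[0], "left": m, "right": m}
    _, t, left_label, right_label = best
    return {"threshold": t, "left": left_label, "right": right_label}

def predict_stump(model, values):
    """Predictor: apply the learned model to new feature values."""
    return [model["left"] if v <= model["threshold"] else model["right"]
            for v in values]

# Made-up toy data: weekly working hours and an income label.
hours = [20, 25, 30, 38, 45, 50, 55, 60]
income = ["<=50K", "<=50K", "<=50K", "<=50K", ">50K", ">50K", ">50K", ">50K"]

model = learn_stump(hours, income)           # training step
predictions = predict_stump(model, [22, 58]) # application step
print(model, predictions)
```

Keeping the model as a plain data object mirrors the way a trained model travels from a Learner node to a Predictor node through a model port.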

Score the Model

Compute a confusion matrix between real and predicted class values and calculate the related accuracy measures.
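
To make the scoring step concrete, here is a minimal sketch of a confusion matrix and the accuracy derived from it. The label values and the five example predictions are made up; the helper names are hypothetical and do not correspond to anything in KNIME.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of rows where the prediction matches the real class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Made-up real and predicted classes for five rows.
actual    = ["<=50K", "<=50K", ">50K", ">50K",  "<=50K"]
predicted = ["<=50K", ">50K",  ">50K", "<=50K", "<=50K"]

matrix = confusion_matrix(actual, predicted)
acc = accuracy(actual, predicted)
for (a, p), count in sorted(matrix.items()):
    print(f"actual={a:6} predicted={p:6} count={count}")
print("accuracy:", acc)  # 3 of 5 rows are correct, so 0.6
```

The diagonal entries of the matrix (actual equals predicted) are the correct predictions; accuracy is their sum divided by the number of rows.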

Descriptive Statistics

Calculate the statistical properties of the data set attributes.
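
For a single numeric attribute, the kind of summary meant here can be sketched with Python's `statistics` module. The age values are made up, and the selection of measures is an assumption; the Statistics node may report a different or larger set.

```python
import statistics

ages = [25, 38, 28, 44, 18, 34]  # made-up values for one numeric attribute

summary = {
    "min": min(ages),
    "max": max(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "std_dev": statistics.stdev(ages),  # sample standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value}")
```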

Visualize

Create an interactive scatter plot.

Decision Tree Predictor: Apply decision tree model to test set
Table Partitioner: Random drawing, 80% to upper port, 20% to lower port
Decision Tree Learner: Train to predict class "income"
Statistics: Stats and exploratory histograms in View
CSV Reader: Reading adult.csv
Scorer: Compute model accuracy
Scatter Plot: Age vs. number-hours