
Building a Simple Classifier

Simple Model Training for Classification
This workflow demonstrates how a simple classifier is built and applied to new data. It also illustrates the use of KNIME's hiliting capabilities, which allow interactive views to be connected within the same workflow.

Task: Predict the income group from demographic attributes of the adult data set (census data).

Find more information on KNIME's Learning Hub at http://www.knime.org/learning-hub (tutorials, videos, white papers, many more workflows).

Data Reading
Read the adult data set file. There is one row for each person, plus demographic info and the income group. The file is located in TheData/Basics/
Node: File Reader (reading adult.csv)

Graphical Properties
Assign colors by income group.
Node: Color Manager (red for income "<=50K", blue for income ">50K")

Data Partitioning
Create two separate partitions from the original data set: training set (80%) and test set (20%).
Node: Partitioning (random drawing; 80% at the upper port, 20% at the lower port)

Train a Model
This node builds a decision tree. Other Learner nodes train other models. Most Learner nodes output a PMML model (blue square output port).
Node: Decision Tree Learner (train to predict class "income")

Apply the Model
Predictor nodes apply a specific model to a data set and append the model predictions.
Node: Decision Tree Predictor (apply decision tree model to test set)

Score the Model
Compute a confusion matrix between real and predicted class values and calculate the related accuracy measures.
Node: Scorer (confusion matrix, accuracy measures)

Visualize
Create an interactive scatter plot.
Node: Scatter Plot (age vs. number-hours, color-coded by income)

Descriptive Statistics
Calculate the statistical properties of the data set attributes.
Node: Statistics (stats and exploratory histograms in the view)

Interactive Table
Display a table of the entire data.
Node: Interactive Table (local) (show entire data as table)

Try this: KNIME's Interactive Visualizations
1) Execute the workflow
2) Open the Scorer node view
3) Hilite a cell in the confusion matrix
4) Open the Interactive Table view
5) Select "Hilite"->"Filter"->"Show Hilited Only"
This shows only the misclassified data rows.
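
For readers who want to see the same sequence outside KNIME, the partition, train, apply, and score steps can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are invented toy data rather than adult.csv, a one-split decision stump stands in for the Decision Tree Learner node, and all function names are hypothetical and not part of any KNIME API.

```python
import random
from collections import Counter

# Toy rows invented for illustration only; they are NOT taken from adult.csv.
TOY_ROWS = [{"age": a, "income": ">50K" if a >= 40 else "<=50K"}
            for a in range(20, 60, 2)]


def partition(rows, train_fraction=0.8, seed=0):
    """Randomly split rows into (training set, test set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def learn_stump(train, feature, target):
    """Depth-1 decision tree: choose the threshold on one numeric
    feature that misclassifies the fewest training rows."""
    best = None
    for threshold in sorted({r[feature] for r in train}):
        left = [r[target] for r in train if r[feature] <= threshold]
        right = [r[target] for r in train if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(v != left_label for v in left)
                  + sum(v != right_label for v in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return {"feature": feature, "threshold": threshold,
            "left": left_label, "right": right_label}


def predict(model, row):
    """Apply the stump to one row and return the predicted class."""
    if row[model["feature"]] <= model["threshold"]:
        return model["left"]
    return model["right"]


def confusion_matrix(actual, predicted):
    """Count (real class, predicted class) pairs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of rows on the diagonal of the confusion matrix."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


# Partition 80/20, train on the training set, score on the test set.
train, test = partition(TOY_ROWS, train_fraction=0.8, seed=0)
model = learn_stump(train, feature="age", target="income")
matrix = confusion_matrix([r["income"] for r in test],
                          [predict(model, r) for r in test])
score = accuracy(matrix)
print(model, dict(matrix), score)
```

The off-diagonal entries of the matrix correspond to the misclassified rows that the hiliting steps above isolate in the Interactive Table view.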
