
Building a Simple Classifier

Simple Model Training for Classification
This workflow demonstrates how a simple classifier is built and applied to new data. It also illustrates the use of KNIME's hiliting capabilities, which allow interactive views to be connected within the same workflow.

Task: Predict the income group from demographic attributes of the adult data set (census data).

Find more information on KNIME's Learning Hub at http://www.knime.org/learning-hub (tutorials, videos, white papers, many more workflows).

Data Reading (File Reader)
Read the adult data set file, adult.csv. There is one row for each person, plus demographic info and the income group. The file is located in TheData/Basics/

Graphical Properties (Color Manager)
Assign colors by income group: red for income "<=50K", blue for income ">50K".

Data Partitioning (Partitioning)
Create two separate partitions from the original data set by random drawing: training set (80%, upper port) and test set (20%, lower port).

Train a Model (Decision Tree Learner)
This node builds a decision tree, trained to predict the class "income". Other Learner nodes train other models. Most Learner nodes output a PMML model (blue square output port).

Apply the Model (Decision Tree Predictor)
Predictor nodes apply a specific model to a data set and append the model predictions. Here the decision tree model is applied to the test set.

Score the Model (Scorer)
Compute a confusion matrix between real and predicted class values and calculate the related accuracy measures.

Visualize (Scatter Plot)
Create an interactive scatter plot of age vs. number-hours, color-coded by income.

Descriptive Statistics (Statistics)
Calculate the statistical properties of the data set attributes. The node view shows the statistics and exploratory histograms.

Interactive Table (Interactive Table (local))
Display a table of the entire data, the training set, or the test set.

Try this: KNIME's Interactive Visualizations
1) Execute the workflow
2) Open the Scorer node view
3) Hilite a cell in the confusion matrix
4) Open the Interactive Table view
5) Select "Hilite" -> "Filter" -> "Show Hilited Only"
The table then shows only the rows counted in the hilited cell. Hiliting an off-diagonal cell shows only the misclassified data rows.
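For readers who want to see the partitioning and scoring logic outside KNIME, the following is a minimal sketch in plain Python. It is not how the KNIME nodes are implemented. The function names and the toy records are hypothetical, the records are made up rather than taken from adult.csv, and a majority-class baseline stands in for the decision tree so that the example stays self-contained.

```python
import random
from collections import Counter


def partition(rows, train_fraction=0.8, seed=0):
    """Randomly split rows into (train, test), mimicking an 80/20 random draw."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Fraction of rows on the diagonal of the confusion matrix."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


def misclassified(rows, actual, predicted):
    """Rows whose predicted label differs from the actual label."""
    return [r for r, a, p in zip(rows, actual, predicted) if a != p]


# Toy, made-up records: (age, hours per week, income group).
toy_rows = [
    (25, 40, "<=50K"), (38, 50, ">50K"), (52, 45, ">50K"), (19, 20, "<=50K"),
    (44, 60, ">50K"), (31, 38, "<=50K"), (60, 30, "<=50K"), (47, 55, ">50K"),
    (23, 35, "<=50K"), (36, 42, "<=50K"),
]

train, test = partition(toy_rows)

# Stand-in for the learner: always predict the most frequent training label.
majority = Counter(label for *_, label in train).most_common(1)[0][0]

actual = [label for *_, label in test]
predicted = [majority for _ in test]

matrix = confusion_matrix(actual, predicted)
print("accuracy:", accuracy(matrix))
print("misclassified rows:", misclassified(test, actual, predicted))
```

The `misclassified` helper corresponds loosely to hiliting the off-diagonal cells of the confusion matrix and filtering the table to the hilited rows.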
