02_KNIME_Workshop_simple_ml_reset

Workshop example adapted from the official KNIME "Simple Model Training for Classification" example.

1. Run the flow - go through all data science steps
2. Prepare for advanced data science - clear the flow of text, data exploration and visualisation
3. Add models and choose the best performing one (AUC as primary metric) - note example performance and keep the best one
   a. adapt partitioning and predictor - stratified, seed, individual probabilities
   b. add xgboost and javascript views extensions
4. Expand flow with feature optimisation loop - simple forward feature selection
5. Expand flow with hyperparameter optimisation loop - simple stepwise (brute force) for one hyperparameter
   a. add optimisation extension
6. Rebuild model and cross-validate the drafted and the final one.

Data Reading
Read the adult data set file. There is one row for each person, plus demographic info and the income group. The file is located in TheData/Basics/

Graphical Properties
Assign colors by income group.

Data Partitioning
Create two separate partitions from the original data set: training set (80%) and test set (20%).

Train a Model
This node builds a decision tree. Other Learner nodes train other models. Most Learner nodes output a PMML model (blue square output port).

Apply the Model
Predictor nodes apply a specific model to a data set and append the model predictions.

Score the Model
Compute a confusion matrix between real and predicted class values and calculate the related accuracy measures.

Visualize
Create interactive scatter plot.

Descriptive Statistics
Calculate the statistical properties of the data set attributes.

Interactive Table
Display table of the entire data

Port labels: training set, test set

Try this: KNIME's Interactive Visualizations
1) Execute the workflow
2) Open the Scorer node view
3) Hilite a cell in the confusion matrix
4) Open the Interactive Table view
5) Select "Hilite"->"Filter"->"Show Hilited Only"
This shows only the misclassified data rows.
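Step 3a asks for a stratified partition with a fixed seed and for scoring by AUC on individual class probabilities. Outside KNIME, the idea can be sketched roughly as below. This is a minimal illustration in plain Python, not what the Partitioning or ROC Curve nodes do internally; the function names `stratified_split` and `auc`, the seed value 42, and the toy rows are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, train_fraction=0.8, seed=42):
    """Split rows into (train, test) while keeping the class ratio in both parts."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    train, test = [], []
    for label in sorted(by_class):  # sorted so the result does not depend on input order of classes
        members = by_class[label][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def auc(labels, scores, positive):
    """Area under the ROC curve as the share of (positive, negative) pairs
    ranked correctly by score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative row")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy rows standing in for adult.csv: 25 of 100 rows are ">50K".
rows = [{"age": 20 + i % 40, "income": ">50K" if i % 4 == 0 else "<=50K"}
        for i in range(100)]
train, test = stratified_split(rows, lambda r: r["income"])
print(len(train), len(test))  # 80 20

# Made-up probabilities for the ">50K" class, one per row.
print(auc([">50K", "<=50K", ">50K", "<=50K"], [0.9, 0.2, 0.6, 0.6], ">50K"))  # 0.875
```

Because the split is done per class, both partitions keep the 1:3 ratio of the toy data, and re-running with the same seed returns the same rows. That is what makes model comparisons in step 3 repeatable.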
Node labels in the workflow:

File Reader: Reading adult.csv
Color Manager: Red for income "<=50K", Blue for income ">50K"
Decision Tree Predictor: Apply decision tree model to test set
Partitioning: Random drawing, 80% upper port, 20% lower port
Scorer: Confusion matrix, accuracy measures
Interactive Table (local): Show entire data as table
Statistics: Stats and exploratory histograms in View
Decision Tree Learner: Train to predict class "income"
Scatter Plot: Age vs. number-hours, color-coded by income

Other nodes in the workflow: Scorer (JavaScript), ROC Curve, Gradient Boosted Trees Learner, Gradient Boosted Trees Predictor, Random Forest Learner, Random Forest Predictor, Column Appender, Feature Selection Loop Start (1:1), Feature Selection Loop End, Feature Selection Filter, Table Row to Variable, Column Filter, Parameter Optimization Loop Start, Parameter Optimization Loop End, X-Partitioner, X-Aggregator
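The Feature Selection Loop and Parameter Optimization Loop nodes correspond to steps 4 and 5: simple forward feature selection and a stepwise (brute force) search over one hyperparameter. The sketch below shows both ideas in plain Python under stated assumptions. The scoring functions are hypothetical stand-ins for "train a model and return its AUC", the feature gains are invented, and this version of forward selection stops as soon as no candidate improves the score, which may differ from how the KNIME loop nodes iterate.

```python
def forward_feature_selection(features, score_fn):
    """Greedy forward selection: repeatedly add the single feature that
    raises score_fn the most; stop when no remaining feature improves it."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    while remaining:
        candidates = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda c: c[0])
        if top_score <= best_score:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


def stepwise_search(start, stop, step, score_fn):
    """Brute-force search: score every grid value from start to stop
    (inclusive) and return the best value with its score."""
    n_steps = int((stop - start) / step + 1e-9) + 1
    best_value, best_score = None, float("-inf")
    for i in range(n_steps):
        value = start + i * step
        score = score_fn(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


# Hypothetical scoring function: in a real flow this would train a model
# on the given columns and return its AUC on held-out data.
GAINS = {"age": 0.10, "education-num": 0.08, "hours-per-week": 0.05}


def toy_feature_score(subset):
    # Columns without a listed gain slightly hurt the score.
    return 0.5 + sum(GAINS.get(f, -0.01) for f in subset)


columns = ["fnlwgt", "hours-per-week", "age", "education-num"]
selected, score = forward_feature_selection(columns, toy_feature_score)
print(selected)  # ['age', 'education-num', 'hours-per-week']

# Hypothetical score that peaks at a tree depth of 6.
best_depth, best = stepwise_search(2, 10, 2, lambda depth: -(depth - 6) ** 2)
print(best_depth)  # 6
```

Both functions take the scoring function as a parameter, so the same loops can wrap any learner and metric. Both are exhaustive within their scope: forward selection scores every remaining feature in each round, and the stepwise search scores every grid value.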
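The X-Partitioner and X-Aggregator nodes carry out the cross-validation in step 6: the data is split into folds, a model is trained on all folds but one, and the held-out predictions are collected across iterations. A minimal sketch of that loop in plain Python follows. The helper names, the 5 folds, the seed, and the majority-class "model" are assumptions chosen to keep the example self-contained; they are not the configuration of the workshop flow.

```python
import random
from collections import Counter


def k_fold_indices(n_rows, k=5, seed=42):
    """Shuffle row indices once and deal them into k disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, train_fn, predict_fn, k=5, seed=42):
    """Return one out-of-fold prediction per row, in the original row order."""
    predictions = [None] * len(rows)
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = train_fn([rows[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        for i in fold:
            predictions[i] = predict_fn(model, rows[i])
    return predictions


# Hypothetical baseline "model": always predict the most frequent training label.
def train_majority(train_rows, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, row):
    return model


# Toy data: 15 rows labelled "<=50K" and 5 rows labelled ">50K".
rows = list(range(20))
labels = ["<=50K"] * 15 + [">50K"] * 5
predictions = cross_validate(rows, labels, train_majority, predict_majority)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(accuracy)  # 0.75
```

Every row is predicted exactly once by a model that never saw it during training, so the aggregated score uses all rows, unlike a single 80/20 partition. Running the same loop for the drafted and the final model gives two comparable estimates.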
