
04. Data Mining


Activity II: Logistic Regression

Train a model that predicts whether a wine is red or white based on some chemical features.

- Open the output table of the Normalizer node to get familiar with the dataset.
- Partition the wine data into a training and a test set (80% for training, with Stratified Sampling on the column color).
- Train a logistic regression model on the training set with Target Column = color.
- Apply the model to the test set.
- Evaluate the quality of the model with the Scorer (JavaScript) node or the Scorer node.

Activity I: Decision Trees

- Read the table Customer.table saved in the workflow group data in the KNIME Explorer.
- Partition the customer data into a training and a test set (80% for training, with Stratified Sampling on the column Target).
- Train a decision tree on the training set with Class Column = Target and ReturnLastPrediction for the no true child strategy. (Hint: Remember, you can change the no true child strategy in the second tab, PMML Settings.)
- Apply the model to the test set.
- Evaluate the quality of the model with the Scorer (JavaScript) node or the Scorer node.
- Try out different settings, e.g. with and without reduced error pruning, and see whether you can improve the model accuracy.
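The partitioning step shared by both activities (80% for training, stratified on the class column) can be sketched outside KNIME in plain Python. This is only an illustration of the idea, not how the Partitioning node is implemented; the function name `stratified_split`, the seed, and the toy wine rows are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, train_fraction=0.8, seed=42):
    """Split rows into (train, test) so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random draw within the class
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical toy data: 75 white and 25 red wines.
wines = [{"id": i, "color": "white" if i < 75 else "red"} for i in range(100)]
train, test = stratified_split(wines, lambda row: row["color"])

print(len(train), len(test))  # 80 20
```

Because the split is done per class, the red/white ratio in the test set matches the ratio in the full table, which keeps the evaluation representative when one class is much rarer than the other.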
- Normalizer (PMML): z-normalize the data with 0 mean and standard deviation of 1
- Table Reader (deprecated): Read the wine.table dataset
- Table Reader: Customer.table
- Partitioning: Relative 80%/20%, Stratified Sampling - Target
- Decision Tree Learner: Best configuration so far - Min number records per node - 3
- Decision Tree Predictor: Predictor
- Scorer (JavaScript): Overall Accuracy - 86.42%, Kappa - 0.728
- Logistic Regression Learner: Learner
- Logistic Regression Predictor: Predictor
- Partitioning: Relative 80%/20%, Stratified Sampling - Color
- Scorer (JavaScript): Overall Accuracy - 99.38%, Kappa - 0.983
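The z-normalization and the two scores reported by the Scorer nodes (overall accuracy and Cohen's kappa) can be written out in a few lines of Python. This is a minimal sketch under stated assumptions: the population standard deviation is used here, which may differ from the estimator the Normalizer node applies, and the function names and toy labels are made up for illustration.

```python
import statistics
from collections import Counter


def z_normalize(values):
    """Rescale values to mean 0 and standard deviation 1 (population std is an assumption)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def accuracy_and_kappa(actual, predicted):
    """Return (overall accuracy, Cohen's kappa) for two equally long label lists."""
    n = len(actual)
    # Observed agreement: share of rows where the prediction matches the true class.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Agreement expected by chance, from the class frequencies on both sides.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical toy labels: one red wine is misclassified as white.
actual = ["red"] * 5 + ["white"] * 5
predicted = ["red"] * 4 + ["white"] * 6
accuracy, kappa = accuracy_and_kappa(actual, predicted)
print(round(accuracy, 3), round(kappa, 3))  # 0.9 0.8

scaled = z_normalize([1.0, 2.0, 3.0, 4.0, 5.0])
```

Kappa corrects accuracy for the agreement that would occur by chance, so it is the more informative of the two numbers when the classes are imbalanced.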
