Wine data classification - solution

Wine data classification exercise

1. Reading the data set
 - Read the file "wine.table" with the Table Reader node
2. Explore the data
 - Use the Color Manager node to assign colors to the different classes of the target variable Type
 - Use the Data Explorer node to examine statistics and distributions of the features
 - Use the Scatter Plot node to plot various attributes
3. Partitioning
 - Use the Partitioning node to split the data set into training (70%) and testing (30%) data sets
 - Use stratified sampling
4. Normalization
 - For the training data, normalize numerical features to the range [0,1] with the Normalizer node
 - Apply the normalization from the training data to the testing data with the Normalizer (Apply) node
5. Train and apply a decision tree classification model
 - Train a decision tree model with the Decision Tree Learner node
 - Apply the trained model to the testing data with the Decision Tree Predictor node
 - Make sure to output class probabilities
 - Evaluate the model performance with the Scorer (JavaScript) node
 - Plot the ROC curve
 - Adjust parameters of the Decision Tree Learner to improve the classifier performance
6. Train and apply a naive Bayes classification model
 - Train a naive Bayes model with the Naive Bayes Learner node
 - Apply the trained model to the testing data with the Naive Bayes Predictor node
 - Make sure to output class probabilities
 - Evaluate the model performance with the Scorer (JavaScript) node
 - Plot the ROC curve
7. kNN classification model
 - Apply the kNN classification model with the K Nearest Neighbor node
 - Set k=5. Use the training data set as the model.
 - Make sure to output class probabilities
 - Evaluate the model performance with the Scorer (JavaScript) node
 - Plot the ROC curve
 - Adjust parameters of the K Nearest Neighbor node to improve the classifier performance

Wine data
 - Chemical properties of 178 wines are examined, resulting in 13 numerical features.
 - There are 3 different types of wines in this data set, described by the column Type.
 - The goal of this analysis is to classify these wines based on their features.
Source: https://archive.ics.uci.edu/ml/datasets/wine

Workflow steps and nodes
 - Reading the data table: Table Reader
 - Assigning different colors to different classes of Type: Color Manager
 - Converting Type from integer to string: Number To String
 - Examining data: Data Explorer, Scatter Plot
 - Training - 70%, Testing - 30%: Partitioning
 - Normalizing the training data: Normalizer
 - Applying the normalization from the training data to the testing data: Normalizer (Apply)
 - Training decision tree model: Decision Tree Learner
 - Applying the trained model to the testing data: Decision Tree Predictor
 - Model evaluation: Scorer (JavaScript), ROC Curve
 - Prediction based on kNN model: K Nearest Neighbor
 - Model evaluation: Scorer (JavaScript), ROC Curve
 - Training naive Bayes model: Naive Bayes Learner
 - Applying the trained model: Naive Bayes Predictor
 - Model evaluation: Scorer (JavaScript), ROC Curve
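
For readers who want to see what the partitioning, normalization, and kNN steps of the exercise compute, here is a minimal Python sketch using only the standard library. It is not how the KNIME nodes are implemented. The function names (stratified_split, fit_min_max, apply_min_max, knn_predict_proba), the Euclidean distance, and the toy data are all assumptions made for illustration; the toy rows are synthetic and are not values from the wine data set.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split row indices so each class keeps roughly the same train/test ratio."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return train_idx, test_idx


def fit_min_max(rows):
    """Learn per-column (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, params):
    """Rescale rows with previously learned parameters.

    Testing values can fall slightly outside [0, 1], because the
    parameters come from the training data alone.
    """
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, params)]
        for row in rows
    ]


def knn_predict_proba(train_x, train_y, query, k=5):
    """Class probabilities as the share of each class among the k nearest rows."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}


# Hypothetical toy data: 3 classes, 10 rows each, 2 features.
rng = random.Random(42)
centers = {"1": (1.0, 10.0), "2": (5.0, 20.0), "3": (9.0, 30.0)}
rows, labels = [], []
for label, (cx, cy) in centers.items():
    for _ in range(10):
        rows.append([cx + rng.uniform(-1, 1), cy + rng.uniform(-2, 2)])
        labels.append(label)

# 70/30 stratified partitioning, then normalization fitted on training data only.
train_idx, test_idx = stratified_split(labels)
params = fit_min_max([rows[i] for i in train_idx])
train_x = apply_min_max([rows[i] for i in train_idx], params)
test_x = apply_min_max([rows[i] for i in test_idx], params)
train_y = [labels[i] for i in train_idx]
test_y = [labels[i] for i in test_idx]

# kNN with k=5: the training set itself acts as the model.
probabilities = [knn_predict_proba(train_x, train_y, q, k=5) for q in test_x]
predictions = [max(p, key=p.get) for p in probabilities]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"train={len(train_x)} test={len(test_x)} accuracy={accuracy:.2f}")
```

The point of the fit/apply split is the same as the Normalizer and Normalizer (Apply) pair in the workflow: the testing data never influences the scaling parameters, so the evaluation is not biased by information from the test rows.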
