Adult_Classification

Different model results

- Decision Tree: overall accuracy 85.88%
- Random Forest: overall accuracy 86.52%
- Logistic Regression Learner: overall accuracy 85.09%
- k-Nearest-Neighbor: overall accuracy 78.81%
- Naive Bayes: overall accuracy 83.24%
- XGBoost Tree Ensemble Learner: overall accuracy 87.01%
- XGBoost Linear Model Learner: overall accuracy 84.55%

Different model results with cross-validation

- Cross-validation Decision Tree Learner: overall accuracy 84.75%
- Cross-validation Random Forest: overall accuracy 86.15%
- Cross-validation Logistic Regression Learner: overall accuracy 84.71%
- Random Forest after cross-validation: overall accuracy 86.31%
- Optimized parameters + trained on the entire dataset
- Cross-validation test with two partitioners and Random Forest: overall accuracy 84.77-86.06%

Assignment sub-tasks (all completed; rough code sketches follow below):

1. Conduct data pre-processing and feature selection.
2. Try at least three different machine learning classifiers.
3. Compare the classifiers by using a 10-fold cross-validation based on two different criteria (e.g. accuracy and AUC for the classification task).
4. Choose one of the machine learning classifiers from sub-task 2 that you think is best. Optimize one of the parameters of that classifier based on the criteria in sub-task 3.
5. With the selected machine learning classifier and optimized parameters, train the classifier on the entire dataset and save the predictor model.
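The workflow itself implements sub-tasks 1-3 with KNIME nodes (Missing Value, Rule-based Row Filter, Column Filter, Normalizer, X-Partitioner/X-Aggregator and the learner/predictor pairs listed under Nodes). As a rough, non-authoritative illustration of the same steps in code, the scikit-learn sketch below runs a 10-fold cross-validation of three of the classifiers and reports accuracy and AUC. The file name, the target column and the preprocessing choices are assumptions for illustration, not settings read from the workflow.

```python
# Minimal sketch of sub-tasks 1-3 outside KNIME, using scikit-learn.
# The file name, target column ("income") and preprocessing choices are
# assumptions for illustration, not settings read from the workflow.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("adult.csv")            # hypothetical path to the Adult dataset
df = df.replace("?", pd.NA).dropna()     # rough stand-in for Missing Value / Rule-based Row Filter
y = (df["income"].str.strip() == ">50K").astype(int)
X = df.drop(columns=["income"])

prep = ColumnTransformer([
    ("num", MinMaxScaler(), X.select_dtypes("number").columns),                   # Normalizer stand-in
    ("cat", OneHotEncoder(handle_unknown="ignore"), X.select_dtypes("object").columns),
])

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
for name, clf in models.items():
    pipe = Pipeline([("prep", prep), ("clf", clf)])
    scores = cross_validate(pipe, X, y, cv=10, scoring=["accuracy", "roc_auc"])
    print(f"{name}: accuracy={scores['test_accuracy'].mean():.4f} "
          f"AUC={scores['test_roc_auc'].mean():.4f}")
```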
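Random Forest was the strongest of the three cross-validated learners, and sub-task 4 asks for tuning one of its parameters. Which parameter the workflow actually optimizes is not stated here, so the grid below over the number of trees, scored with AUC, is an assumption; `prep`, `X` and `y` are reused from the previous sketch.

```python
# Sketch of sub-task 4: tune one Random Forest parameter with 10-fold CV.
# The tuned parameter (number of trees) and its value grid are assumptions.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipe = Pipeline([("prep", prep), ("clf", RandomForestClassifier(random_state=0))])
grid = GridSearchCV(
    pipe,
    param_grid={"clf__n_estimators": [50, 100, 200, 400]},
    scoring="roc_auc",   # selection criterion; accuracy could be reported alongside
    cv=10,
    refit=True,          # refit the best setting on the full dataset
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 4))
```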
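The "Optimized parameters + trained on the entire dataset" step (sub-task 5) would typically be done in KNIME by running the learner node on the full table and writing the model out with a model writer node. A hedged code equivalent, reusing `grid` from the tuning sketch, could look like this; the output file name is hypothetical.

```python
# Sketch of sub-task 5: keep the model refit on the entire dataset and save it.
import joblib

final_model = grid.best_estimator_                    # already refit on all rows because refit=True
joblib.dump(final_model, "adult_income_rf.joblib")    # hypothetical file name

# Later reuse of the saved predictor:
# model = joblib.load("adult_income_rf.joblib")
# predictions = model.predict(new_data)
```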
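The best single result above comes from the XGBoost Tree Ensemble Learner (87.01%), with the XGBoost Linear Model Learner clearly behind it. A rough equivalent using the xgboost Python package is sketched below: the booster argument switches between the tree ensemble and the linear model, and the hyperparameter values are assumptions, not the workflow's settings.

```python
# Rough equivalent of the two XGBoost nodes: booster="gbtree" corresponds to the
# Tree Ensemble Learner, booster="gblinear" to the Linear Model Learner.
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

for booster in ("gbtree", "gblinear"):
    pipe = Pipeline([
        ("prep", prep),   # same preprocessing as in the first sketch
        ("clf", XGBClassifier(booster=booster, n_estimators=200,
                              learning_rate=0.1, eval_metric="logloss")),
    ])
    scores = cross_validate(pipe, X, y, cv=10, scoring=["accuracy", "roc_auc"])
    print(f"{booster}: accuracy={scores['test_accuracy'].mean():.4f} "
          f"AUC={scores['test_roc_auc'].mean():.4f}")
```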

Nodes

The workflow canvas (node numbers and most annotations omitted; one Partitioning node is annotated "85% training data") combines the following KNIME nodes: CSV Reader, Rule-based Row Filter, Missing Value, Column Filter, Normalizer, Normalizer (Apply), Partitioning, X-Partitioner, X-Aggregator, Decision Tree Learner and Predictor, Random Forest Learner and Predictor, Logistic Regression Learner and Predictor, K Nearest Neighbor, Naive Bayes Learner and Predictor, XGBoost Tree Ensemble Learner, XGBoost Linear Model Learner, XGBoost Predictor, Scorer (JavaScript), and ROC Curve (listed as missing in this export).
