
Classification

Decision Tree Model:

Accuracy: 82.2%

Cohen's kappa: 50%
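Accuracy and Cohen's kappa are both derived from the confusion matrix. Kappa subtracts the agreement expected by chance, which is why it comes out lower than accuracy for every model listed here. The following is a minimal sketch of the standard definition, not the Scorer node's actual implementation; the function name and the example counts are hypothetical.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of rows with true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    n_classes = len(confusion)
    # Observed agreement: share of rows on the diagonal (this is the accuracy).
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    # Note: undefined (division by zero) if expected agreement is exactly 1.
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, not taken from the workflow's Scorer output.
example = [[80, 10],
           [10, 20]]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

With an imbalanced class distribution, as in this toy matrix, a model can reach a high accuracy while kappa stays moderate.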

Random Forest Model:

Accuracy: 87%

Cohen's kappa: 61.3%

Also used for feature selection: the importance of each feature was calculated from the number of splits it was used in, as a proxy for the information gain the feature provides.
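Counting splits per feature across the trees of a forest can be sketched as follows. This is an illustration only: the nested-dict tree representation, the function names, and the toy trees are hypothetical and do not reflect how the Random Forest Learner node stores its model.

```python
from collections import Counter


def count_splits(node, counts):
    """Recursively count how often each feature is used as a split in one tree."""
    if "feature" not in node:  # leaf node: nothing to count
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)


def split_count_importance(forest):
    """Rank features by the total number of splits across all trees."""
    counts = Counter()
    for tree in forest:
        count_splits(tree, counts)
    return counts.most_common()


# Two tiny hypothetical trees; the structure is illustrative only.
leaf = {"label": "A"}
forest = [
    {"feature": "hours_per_week", "left": leaf,
     "right": {"feature": "fnlwgt", "left": leaf, "right": leaf}},
    {"feature": "hours_per_week", "left": leaf, "right": leaf},
]
print(split_count_importance(forest))
```

Features that rarely or never appear as a split are candidates for removal.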

Best-performing model.

Tried optimizing with more models in the forest (200 instead of 100): this only led to a slight increase of Cohen's kappa to 61.5%; accuracy stayed the same.

Logistic Regression Model:

Accuracy: 85.2%

Cohen's kappa: 56.5%

Removed the two features with by far the lowest importance; accuracy stayed the same. Removing the third least important feature (hours per week) reduced accuracy by 0.2 percentage points, to 85%.
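The removal procedure described here amounts to a simple backward elimination: drop features from least to most important and stop as soon as accuracy falls. A rough sketch under stated assumptions follows; the function, the `evaluate` callback, the feature list (including "age"), and the importance values are all hypothetical, and the stand-in evaluator merely mimics the accuracies reported above.

```python
def drop_least_important(features, importance, evaluate, tolerance=0.0):
    """Remove features from least to most important while accuracy holds."""
    kept = list(features)
    baseline = evaluate(kept)
    for name in sorted(features, key=lambda f: importance[f]):
        candidate = [f for f in kept if f != name]
        if not candidate:
            break  # never remove the last remaining feature
        if baseline - evaluate(candidate) > tolerance:
            break  # removing this feature costs accuracy; stop here
        kept = candidate
    return kept


# Hypothetical importances (e.g. split counts) and a stand-in evaluator.
features = ["age", "hours_per_week", "gender", "fnlwgt"]
importance = {"age": 40, "hours_per_week": 12, "gender": 2, "fnlwgt": 1}


def evaluate(selected):
    # Stand-in for "train the model on these features and return its accuracy".
    return 0.852 if "hours_per_week" in selected else 0.850


print(drop_least_important(features, importance, evaluate))
```

In a real run, `evaluate` would retrain and score the model for each candidate feature set.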

kNN Model (k = 3, 5, 7):

Accuracy: 82.3%, 82.8%, 83.1%

Cohen's kappa: 48%, 49.3%, 50%
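To make the role of k concrete, here is a minimal k-nearest-neighbor sketch using Euclidean distance and a majority vote. It assumes numeric, already-normalized features; the function name and toy data are hypothetical and are not the workflow's dataset or the K Nearest Neighbor node's implementation.

```python
from collections import Counter
import math


def knn_predict(train_rows, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data with values already scaled to [0, 1].
rows = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.8, 0.9), (0.9, 0.8)]
labels = ["low", "low", "low", "high", "high"]

for k in (3, 5):
    print(k, knn_predict(rows, labels, (0.7, 0.7), k))
```

In this toy example the prediction flips between k=3 and k=5, because a larger k lets more distant points outvote the closest ones. Odd values of k avoid ties in a two-class problem.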

Used One-Hot Encoding to convert string columns to integer (0/1) columns.
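One-hot encoding replaces a string column with one 0/1 indicator column per distinct value, so that distance-based models can work with it. A minimal sketch, with a hypothetical function name and made-up column values:

```python
def one_hot_encode(values):
    """Turn a list of strings into 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    encoded = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, encoded


# Hypothetical column values for illustration.
categories, encoded = one_hot_encode(["red", "green", "red"])
print(categories)
print(encoded)
```

Each encoded row contains exactly one 1, in the position of that row's category.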

Workflow diagram (node labels and annotations):

Nodes: CSV Reader, Statistics, Box Plot, Bar Chart, Math Formula, Missing Value, Column Filter, Normalizer, One to Many, X-Partitioner, X-Aggregator, Decision Tree Learner, Decision Tree Predictor, Random Forest Learner, Random Forest Predictor, Logistic Regression Learner, Logistic Regression Predictor, K Nearest Neighbor, Scorer, ROC Curve.

Annotations: Replace missing values with mean/mode. Convert strings (One-Hot Encoding). Feature selection by calculating the importance of the different features. Feature reduction (remove fnlwgt, gender).
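The X-Partitioner and X-Aggregator nodes wrap each model in a cross-validation loop: the data is split into folds, each fold is predicted by a model trained on the remaining folds, and the predictions are collected for scoring. The sketch below illustrates that idea only. The number of folds, the random sampling, the function names, and the stand-in model are assumptions, since the workflow's node settings are not given here.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # random sampling, reproducible via seed
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(rows, labels, k, fit_predict):
    """Collect one prediction per row, each from a model that never saw that row."""
    predictions = [None] * len(rows)
    for train, test in k_fold_indices(len(rows), k):
        predicted = fit_predict(
            [rows[i] for i in train],
            [labels[i] for i in train],
            [rows[i] for i in test],
        )
        for i, label in zip(test, predicted):
            predictions[i] = label
    return predictions


# Hypothetical stand-in model: always predicts the most frequent training label.
def majority_model(train_rows, train_labels, test_rows):
    most_common = max(set(train_labels), key=train_labels.count)
    return [most_common] * len(test_rows)


labels = ["a"] * 8 + ["b"] * 2
preds = cross_validate(list(range(10)), labels, k=5, fit_predict=majority_model)
accuracy = sum(p == t for p, t in zip(preds, labels)) / len(labels)
print(accuracy)
```

The collected predictions play the role of the aggregated table that is passed on to the Scorer and ROC Curve nodes.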
