Dealing with class imbalance in KNIME

This workflow compares four ways of handling class imbalance when predicting credit card fraud. All four branches share the same preprocessing and the same model (XGBoost Tree Ensemble Learner, XGBoost Predictor, Scorer).

Shared preprocessing
CSV Reader: read in the credit card (CC) transactions data.
Data Explorer: explore the data.
Missing Value: remove rows with missing values, if any.
Partitioning: 80/20 train/test split.

Branch 1: No class imbalance handling
XGBoost Tree Ensemble Learner: learn to predict fraud with the training set.
XGBoost Predictor: apply the model on the test set.
Scorer: catching 81.37% of fraud cases.

Branch 2: Undersampling
Equal Size Sampling: select an equal amount of normal and fraud transactions.
XGBoost Tree Ensemble Learner: learn to predict fraud with the training set.
XGBoost Predictor: apply the model on the test set.
Scorer: catching 92.16% of fraud cases.

Branch 3: Oversampling by duplication
GroupBy: for the test data, how many more normal transactions do we have than fraud ones? 113729/390 = 291.6.
Duplicate fraud cases: using a loop, duplicate fraud cases 291 times.
XGBoost Tree Ensemble Learner: learn to predict fraud with the training set.
XGBoost Predictor: apply the model on the test set.
Scorer: catching 85.29% of fraud cases.

Branch 4: Oversampling by creation
SMOTE: create synthetic data.
XGBoost Tree Ensemble Learner: learn to predict fraud with the training set.
XGBoost Predictor: apply the model on the test set.
Scorer: catching 88.24% of fraud cases.
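For readers who want to see the resampling ideas outside KNIME, the following is a minimal Python sketch of the imbalance ratio, equal-size undersampling, oversampling by duplication, and the "fraud cases caught" figure (recall on the fraud class). It is not the workflow's implementation: the function names, the 0/1 label encoding, the toy data, and the reading of "duplicate 291 times" as "each fraud row appears 291 times in total" are all assumptions made for illustration.

```python
import random


def imbalance_ratio(labels, minority=1):
    """Number of majority rows per minority row."""
    n_minor = sum(1 for y in labels if y == minority)
    n_major = len(labels) - n_minor
    return n_major / n_minor


def undersample(rows, labels, minority=1, seed=0):
    """Keep all minority rows and an equally sized random sample of the rest."""
    rng = random.Random(seed)
    minor = [i for i, y in enumerate(labels) if y == minority]
    major = [i for i, y in enumerate(labels) if y != minority]
    keep = sorted(minor + rng.sample(major, len(minor)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def oversample_by_duplication(rows, labels, copies, minority=1):
    """Return the data with each minority row present `copies` times in total.

    Assumption: `copies` counts the original row as well as its duplicates.
    """
    out_rows, out_labels = [], []
    for row, y in zip(rows, labels):
        repeat = copies if y == minority else 1
        out_rows.extend([row] * repeat)
        out_labels.extend([y] * repeat)
    return out_rows, out_labels


def recall(actual, predicted, positive=1):
    """Share of actual positive cases that the model flagged as positive."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    total = sum(1 for a in actual if a == positive)
    return hits / total


# Hypothetical toy data: 20 normal transactions (0) and 2 fraud ones (1).
toy_rows = [[float(i)] for i in range(22)]
toy_labels = [0] * 20 + [1] * 2

# The ratio quoted in the workflow annotation, truncated to a whole number
# of duplicates.
workflow_ratio = 113729 / 390
workflow_copies = int(workflow_ratio)
```

Resampling is normally applied to the training partition only, so that the test set keeps the original class distribution and the reported percentages remain comparable across branches.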
