KNIME_ML_Project


Data Strategy and Balancing

Objective: Prepare the dataset for machine learning modeling through feature normalization, class balancing, and data partitioning.
Internal Nodes and Parameters:
- Normalizer: Applied Min-Max (0-1) scaling to all numerical features so that every feature shares the same range and none dominates model training through magnitude alone.
- SMOTE: Oversampled the minority class of the target variable "Biopsy" using k=5 nearest neighbors to reach a 50/50 balance, resulting in 803 records per class.
- Table Partitioner: Split the data 70% Training / 30% Test using stratified sampling on the target so that class proportions are preserved in both sets.
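
The three steps above can be sketched outside KNIME in plain Python. This is an illustrative sketch only, not the implementation used by the KNIME nodes: the function names, the toy data, the random seeds, and the string labels "0" and "1" are all assumptions made for the example.

```python
import math
import random

def min_max_normalize(rows):
    """Rescale each numeric column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (Euclidean distance)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (r for r in minority if r is not base),
            key=lambda r: math.dist(base, r),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def stratified_split(rows, labels, train_fraction=0.7, rng=None):
    """Split class by class so both parts keep the class proportions."""
    rng = rng or random.Random(0)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    train, test = [], []
    for label, members in by_class.items():
        members = members[:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train += [(r, label) for r in members[:cut]]
        test += [(r, label) for r in members[cut:]]
    return train, test

# Toy data invented for illustration: 12 majority rows, 6 minority rows.
data_rng = random.Random(42)
majority = [[data_rng.uniform(0, 10), data_rng.uniform(0, 100)] for _ in range(12)]
minority = [[data_rng.uniform(5, 15), data_rng.uniform(50, 150)] for _ in range(6)]

# Step 1: scale all features to [0, 1].
scaled = min_max_normalize(majority + minority)
maj_scaled, min_scaled = scaled[:12], scaled[12:]

# Step 2: oversample the minority class until both classes have the same size.
n_missing = len(maj_scaled) - len(min_scaled)
min_balanced = min_scaled + smote(min_scaled, n_new=n_missing, k=5)

# Step 3: stratified 70/30 split on the (string) class label.
rows = maj_scaled + min_balanced
labels = ["0"] * len(maj_scaled) + ["1"] * len(min_balanced)
train, test = stratified_split(rows, labels, train_fraction=0.7)
```

Because each synthetic row is an interpolation between two already-scaled minority rows, the generated values stay inside the 0-1 range produced by the normalization step.
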
Assumptions and Missingness:
- The input data is assumed to contain no missing values, because imputation was completed in the previous phase.
- The target variable (Biopsy) was converted to String format because the SMOTE node requires a nominal class column.
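
Both assumptions can be checked before the balancing step. The following is a minimal Python sketch, not part of the KNIME workflow: the helper names, the "Age" column, and the sample records are hypothetical and exist only for illustration.

```python
import math

def has_missing(rows):
    """Return True if any cell is None or a float NaN."""
    return any(
        v is None or (isinstance(v, float) and math.isnan(v))
        for row in rows
        for v in row.values()
    )

def target_to_string(rows, target="Biopsy"):
    """Return copies of the rows with the target column cast to str."""
    return [{**row, target: str(row[target])} for row in rows]

# Hypothetical records; the "Age" column and all values are invented.
records = [
    {"Age": 34.0, "Biopsy": 0},
    {"Age": 51.0, "Biopsy": 1},
]

# Fail early if the no-missing-values assumption does not hold.
if has_missing(records):
    raise ValueError("Impute missing values before normalization and SMOTE.")

# Cast the class column to a string so it can serve as a nominal label.
prepared = target_to_string(records)
```
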
