
modelling_B_ollist_final

Prepare the Dataset for Modeling

This section loads the CSV data, keeps only the relevant columns, and creates a new target/class column using rules. It then removes no-longer-needed fields, tidies a column name, converts numeric values to text categories where needed, and recalculates the domain metadata so later nodes can correctly recognize the possible values in each column. In short, it turns the raw file into a cleaner, model-ready table.
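
The preparation steps above can be sketched in plain Python. This is only an illustration of the sequence of nodes, not the workflow's actual configuration: the column names, the sample rows, the rule `score >= 0.5`, and the new name `annual_income` are all hypothetical, since the description does not list them.

```python
import csv
import io

# Hypothetical raw data; the real workflow reads a CSV file whose
# columns are not named in the description.
RAW_CSV = """id,age,income,score,notes
1,34,52000,0.81,a
2,51,31000,0.35,b
3,27,78000,0.64,c
"""

KEEP = ["age", "income", "score"]  # Column Filter: assumed relevant columns


def prepare(text):
    """Return (rows, domain) for a model-ready table built from CSV text."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):       # CSV Reader
        row = {name: rec[name] for name in KEEP}        # Column Filter
        # Rule Engine: assumed rule, score >= 0.5 means class 1
        target = 1 if float(row["score"]) >= 0.5 else 0
        del row["score"]                                # second Column Filter
        row["annual_income"] = row.pop("income")        # Column Renamer
        row["target"] = str(target)                     # Number to String
        rows.append(row)
    # Domain Calculator: collect the possible values of the nominal column
    domain = sorted({r["target"] for r in rows})
    return rows, domain


rows, domain = prepare(RAW_CSV)
```

Converting the target to a string and recomputing its domain mirrors why the workflow ends with Number to String and Domain Calculator: downstream classification nodes need a nominal column whose possible values are known.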

Train and Compare Models with Cross Validation

This section runs a cross-validation loop: the data is repeatedly split into training and test portions, two different models are trained (Random Forest and Logistic Regression), and both make predictions on the same test rows. Their prediction outputs are then combined side by side so each fold keeps both models’ results together. Finally, the loop aggregates all folds into one full prediction table and an error summary, giving an overall view of model performance across the entire dataset.
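
The loop structure described above can be sketched as follows. The two fit functions are deliberately simple placeholders standing in for the Random Forest and Logistic Regression learners; they are not those algorithms. The fold count, the seed, and the toy data are assumptions as well.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k disjoint test folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(rows, labels, learners, k=5, seed=0):
    """learners maps a name to fit(rows, labels) -> predict(row)."""
    table = []        # aggregated predictions, one record per data row
    fold_errors = []  # error rate per model for each fold
    for fold_no, test_idx in enumerate(k_fold_indices(len(rows), k, seed)):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        # Train every model on the same training portion.
        models = {name: fit(train_rows, train_labels)
                  for name, fit in learners.items()}
        wrong = {name: 0 for name in learners}
        for i in test_idx:
            # Both models score the same test row; results sit side by side.
            rec = {"row": i, "fold": fold_no, "actual": labels[i]}
            for name, predict in models.items():
                rec[name] = predict(rows[i])
                wrong[name] += rec[name] != labels[i]
            table.append(rec)
        fold_errors.append({n: wrong[n] / len(test_idx) for n in learners})
    return table, fold_errors


def fit_majority(train_rows, train_labels):
    """Placeholder model: always predicts the most frequent training label."""
    majority = max(set(train_labels), key=train_labels.count)
    return lambda row: majority


def fit_mean_threshold(train_rows, train_labels):
    """Placeholder model: predicts 1 when the value exceeds the training mean."""
    mean = sum(train_rows) / len(train_rows)
    return lambda row: 1 if row > mean else 0


values = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
labels = [0 if v < 0.5 else 1 for v in values]
table, fold_errors = cross_validate(
    values, labels, {"RF": fit_majority, "LR": fit_mean_threshold}, k=5)
```

Because every row lands in exactly one test fold, the aggregated table holds one out-of-sample prediction per model for each row of the dataset.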

Evaluate and Visualize Model Performance

This section uses the combined cross-validation predictions to assess how well the models classify the target. The Scorer nodes compute a confusion matrix and accuracy statistics, while the ROC Curve views show how well each model separates the two classes across different decision thresholds. In short, this block turns predictions into performance metrics and comparison visuals.
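
For readers who want to see what these metrics compute, here is a minimal sketch of a confusion matrix, accuracy, and a ROC curve with its area. The labels, predictions, and scores are made-up illustration data, and the functions are generic textbook definitions rather than the exact calculations inside the Scorer and ROC Curve nodes.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(a == positive and p == positive for a, p in pairs),
        "fp": sum(a != positive and p == positive for a, p in pairs),
        "fn": sum(a == positive and p != positive for a, p in pairs),
        "tn": sum(a != positive and p != positive for a, p in pairs),
    }


def accuracy(cm):
    """Share of rows whose predicted class equals the actual class."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def roc_points(actual, scores, positive=1):
    """(FPR, TPR) points from sweeping the threshold over distinct scores."""
    pos = sum(a == positive for a in actual)
    neg = len(actual) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(s >= t and a == positive for a, s in zip(actual, scores))
        fp = sum(s >= t and a != positive for a, s in zip(actual, scores))
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up example: true classes, one model's scores, and its class calls.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
predicted = [1, 1, 1, 1, 0, 0]

cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
area = auc(roc_points(actual, scores))
```

Running the same functions on each model's column of the aggregated prediction table gives directly comparable numbers, which is the purpose of having one Scorer and one ROC Curve per model.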

Convert Probabilities into Final Class Labels

This step turns model output into a final yes/no prediction. First, rules map each prediction score or probability to a predicted class. Then the results are evaluated with a confusion matrix and accuracy statistics so you can see how well those final classifications match the true target values.
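
A rule of this kind can be sketched as a single threshold comparison. The 0.5 cutoff, the "yes"/"no" labels, and the sample probabilities are assumptions for illustration; the description does not state the rules the workflow actually uses.

```python
def to_class(p_positive, threshold=0.5, positive="yes", negative="no"):
    """Map the probability of the positive class to a final class label."""
    return positive if p_positive >= threshold else negative


# Made-up probabilities and true labels.
probs = [0.91, 0.48, 0.50, 0.07]
actual = ["yes", "yes", "no", "no"]

predicted = [to_class(p) for p in probs]
matches = sum(a == p for a, p in zip(actual, predicted))
acc = matches / len(actual)

# A lower cutoff trades false negatives for false positives.
predicted_low = [to_class(p, threshold=0.4) for p in probs]
```

Moving the threshold changes which errors the final labels make, so the confusion matrix computed after this step depends on the cutoff as much as on the model.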

CSV Reader
Column Filter
Rule Engine
Rule Engine
Column Filter
Column Renamer
Number to String
Domain Calculator
Scorer
Random Forest Predictor
X-Partitioner
Logistic Regression Predictor
Random Forest Learner
Column Appender
X-Aggregator
Logistic Regression Learner
LR
ROC Curve
LR
Scorer
RF
ROC Curve
RF
Scorer
