Parameter Optimization

DISCLAIMER: This legacy component only works with a random forest model and with specific parameter names and ranges. If you want to train a different classification model, you can still use this component as a starting point for creating your own component. However, we recommend adopting the more flexible "Parameter Optimization (Table)" component (kni.me/c/A_91QC387NtvJ6g8).

---

This component optimizes the parameters of a random forest classification model that is applied to the input data. In the dialog, you can select the target column and the optimization strategy. The output is a table with one row that contains the values of the optimized parameters. The overall accuracy is used as the objective value and is also included in the output row.
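As a rough illustration of the objective value, overall accuracy is the fraction of rows whose predicted class equals the actual class. The sketch below is plain Python, not the component's internals: the function name `overall_accuracy` and the sample labels are made up for illustration, and how the component splits the data for scoring is not covered here.

```python
def overall_accuracy(actual, predicted):
    """Return the fraction of rows where the predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty table")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical target column values and model predictions.
actual = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no"]

print(overall_accuracy(actual, predicted))  # 4 of 5 rows match, prints 0.8
```

A parameter combination with a higher accuracy is considered better, so the optimization loop keeps the combination that maximizes this value.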

By default, the component optimizes the "number of models" and "maximum tree depth" parameters of a Random Forest model. You can change the model and the parameters to optimize. Instructions are given inside the component.

To train a model on the complete dataset, use a Random Forest Learner node, or the Learner node of any other model for which you optimized the parameters, and configure it using the best parameters. This model can then be deployed.

Options

Target Column
Select the target column. Only columns with nominal data can be selected.
Seed
A seed is used to get reproducible results. The results may vary for different seeds.
Parameter Optimization Strategy
Select the search strategy that should be used. There are four different strategies to choose from:
- Random Search: Hyperparameter combinations are randomly sampled.
- Bayesian Optimization (TPE): Tree-structured Parzen Estimators are used to learn which hyperparameter combinations are likely to improve the model’s performance.
- Brute Force: All possible hyperparameter combinations are evaluated. This strategy is also called "Grid Search".
- Hillclimbing: A random start combination is created and the direct neighbors are evaluated. The best combination among the neighbors is the start point for the next iteration. If no neighbor improves the model's performance, the loop terminates.
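To make the strategies above more concrete, here is a minimal Python sketch of three of them (Brute Force, Random Search, and Hillclimbing). It is not the component's implementation. The parameter ranges, the function names, and `toy_objective` (a stand-in for "train a random forest and return its accuracy") are all assumptions chosen for illustration, and a "direct neighbor" is assumed to differ in exactly one parameter by one step. TPE is omitted because it does not fit in a short example.

```python
import itertools
import random

# Hypothetical search space; the ranges are illustrative only.
SEARCH_SPACE = {
    "number_of_models": range(10, 101, 10),  # 10, 20, ..., 100
    "max_tree_depth": range(2, 11),          # 2, 3, ..., 10
}


def toy_objective(params):
    """Stand-in for model accuracy; peaks at number_of_models=70, max_tree_depth=6."""
    models_term = ((params["number_of_models"] - 70) / 100) ** 2
    depth_term = ((params["max_tree_depth"] - 6) / 10) ** 2
    return 1.0 - models_term - depth_term


def brute_force(space, objective):
    """Evaluate every combination in the grid and return the best one."""
    names = list(space)
    candidates = (dict(zip(names, combo))
                  for combo in itertools.product(*space.values()))
    return max(candidates, key=objective)


def random_search(space, objective, n_iter, seed):
    """Sample n_iter random combinations; the seed makes the run reproducible."""
    rng = random.Random(seed)
    samples = [{name: rng.choice(list(values)) for name, values in space.items()}
               for _ in range(n_iter)]
    return max(samples, key=objective)


def hill_climbing(space, objective, seed):
    """Start at a random grid point and move to the best neighbor until none improves."""
    rng = random.Random(seed)
    grid = {name: list(values) for name, values in space.items()}

    def to_params(position):
        return {name: grid[name][index] for name, index in position.items()}

    position = {name: rng.randrange(len(values)) for name, values in grid.items()}
    score = objective(to_params(position))
    while True:
        # Direct neighbors: change one parameter by one grid step.
        neighbors = [{**position, name: position[name] + step}
                     for name in grid for step in (-1, 1)
                     if 0 <= position[name] + step < len(grid[name])]
        best = max(neighbors, key=lambda p: objective(to_params(p)))
        best_score = objective(to_params(best))
        if best_score <= score:
            return to_params(position)  # no neighbor improves: terminate
        position, score = best, best_score


print(brute_force(SEARCH_SPACE, toy_objective))
print(random_search(SEARCH_SPACE, toy_objective, n_iter=20, seed=1))
print(hill_climbing(SEARCH_SPACE, toy_objective, seed=1))
```

On this smooth toy objective, Hillclimbing reaches the same optimum as Brute Force with far fewer evaluations. On a real accuracy surface with several local optima it can stop early, which is one reason to compare strategies and seeds.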

Input Ports

A table that contains at least one nominal column that can be used as target and one or several additional feature columns.

Output Ports

A table that contains one row with the best parameters found during the optimization process and the corresponding accuracy.
