AutoML

Options

Enable One Hot Encoding of String Columns:
If this box is checked, all columns of domain "String" (that is, categorical features) are one-hot encoded. The resulting Double columns replace all String columns during training. DISCLAIMER: For Neural Network and Deep Learning (Keras) models, this setting is necessary if you provide only String columns.
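As a rough illustration of what one-hot encoding does to a String column, here is a minimal pure-Python sketch. It is not the Component's implementation: the function name `one_hot_encode`, the `column=value` naming of the new columns, and the toy rows are all assumptions made for the example.

```python
def one_hot_encode(rows, string_columns):
    """Replace each string column with one 0.0/1.0 column per distinct value."""
    # Collect the distinct values of every string column, in sorted order
    categories = {
        col: sorted({row[col] for row in rows}) for col in string_columns
    }
    encoded = []
    for row in rows:
        # Keep the non-string columns unchanged
        new_row = {k: v for k, v in row.items() if k not in string_columns}
        # Add one Double-like column per (column, value) pair
        for col, values in categories.items():
            for value in values:
                new_row[f"{col}={value}"] = 1.0 if row[col] == value else 0.0
        encoded.append(new_row)
    return encoded


# Hypothetical toy data: one numeric column and one String column
rows = [
    {"age": 31, "color": "red"},
    {"age": 45, "color": "blue"},
    {"age": 27, "color": "red"},
]
encoded = one_hot_encode(rows, ["color"])
print(encoded[0])
```

The original `color` column disappears from the output and is replaced by one numeric column per distinct value, which is the behaviour the option describes.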
Activate Interactive View:
If selected, the Component creates an interactive view for browsing the trained models, ranked by the selected metric.
Feature Column Selection:
Select the columns that the model should use as input features during training. Excluded columns are discarded and are not used anywhere in the workflow. Accepted domains: Number (integer), Number (double), Number (long) and String.
Target Column:
Select the String column you want to predict.
Number of Folds in Cross Validation:
A k-fold cross validation is run during the parameter optimization phases. Enter the number of folds (k) here.
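To make the role of the fold count concrete, the sketch below splits row indices into k folds and uses each fold once for validation. It is only an illustration: the helper name `k_fold_indices` is hypothetical, and the Component may shuffle or stratify rows before splitting, which this sketch does not do.

```python
def k_fold_indices(n_rows, k):
    """Return k (training, validation) index pairs for k-fold cross validation."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    # Assign row i to fold i % k so fold sizes differ by at most one
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        # Every other fold is used for training in this round
        training = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((training, validation))
    return splits


splits = k_fold_indices(n_rows=10, k=5)
for training, validation in splits:
    print("train:", training, "validate:", validation)
```

Each row is used for validation exactly once, so a larger k means more training rounds per parameter combination.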
Size of Training Set Partition (%):
Enter the size of the training set as a percentage (%) of the input rows; these rows are used to train the models. The remaining rows (100% minus this value) form the test set. Stratified sampling on the target class is performed.
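The following sketch shows one way a stratified partition can be computed: each target class is split separately so that both partitions keep roughly the same class proportions. The function `stratified_split`, the fixed seed, and the toy rows are assumptions for the example, not the Component's actual logic.

```python
import random
from collections import defaultdict


def stratified_split(rows, target, train_percent, seed=0):
    """Split rows into (train, test), sampling each target class separately."""
    rng = random.Random(seed)
    # Group the rows by their target class
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[target]].append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's data is not reordered
        rng.shuffle(members)
        # Take train_percent of this class for training, the rest for testing
        cut = round(len(members) * train_percent / 100)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical toy data: 10 "yes" rows and 30 "no" rows
rows = [{"id": i, "label": "yes" if i % 4 == 0 else "no"} for i in range(40)]
train, test = stratified_split(rows, "label", train_percent=80)
print(len(train), len(test))
```

With an 80% training partition, both the training and the test set keep the original 1:3 ratio of "yes" to "no" rows.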
Maximum Amount of Unique Values in a Categorical Column:
Categorical columns with more unique values than this limit are removed. This setting prevents you from starting an endless training process because you forgot to remove columns such as RowIDs.
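A minimal sketch of such a cardinality check is shown below. The helper `drop_high_cardinality`, the column names, and the limit of 10 are illustrative assumptions only.

```python
def drop_high_cardinality(rows, string_columns, max_unique):
    """Return (kept, removed) column names based on their unique value count."""
    kept, removed = [], []
    for col in string_columns:
        n_unique = len({row[col] for row in rows})
        # Columns above the limit are dropped, all others are kept
        if n_unique > max_unique:
            removed.append(col)
        else:
            kept.append(col)
    return kept, removed


# Hypothetical toy data: an identifier-like column and a real category
colors = ["red", "green", "blue"]
rows = [{"row_id": f"Row{i}", "color": colors[i % 3]} for i in range(100)]
kept, removed = drop_high_cardinality(rows, ["row_id", "color"], max_unique=10)
print("kept:", kept, "removed:", removed)
```

The identifier-like column has 100 unique values and is removed, while the three-valued `color` column is kept. This matters most when one-hot encoding is enabled, since every unique value would otherwise become its own column.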
Models to Train:
Select which machine learning algorithms should be used in the AutoML process. H2O AutoML trains additional model types and ensembles; if it is selected, your machine might become slow for up to 2 minutes.
Metric for Auto Selection:
Select the performance metric that should be used to automatically select the best model and to tune the hyperparameters.
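The choice of metric can change which model is selected. The sketch below ranks a few models by a chosen metric; the model names, metric names, and scores are made-up illustrative values, and `select_best` is a hypothetical helper that assumes higher scores are better.

```python
def select_best(results, metric):
    """Return the name of the model with the highest score for the metric."""
    return max(results, key=lambda name: results[name][metric])


# Made-up scores for illustration only, not real benchmark results
results = {
    "Decision Tree": {"accuracy": 0.81, "auc": 0.84},
    "Logistic Regression": {"accuracy": 0.83, "auc": 0.88},
    "Random Forest": {"accuracy": 0.86, "auc": 0.87},
}
best_by_accuracy = select_best(results, "accuracy")
best_by_auc = select_best(results, "auc")
print(best_by_accuracy, "|", best_by_auc)
```

In this toy example the two metrics pick different models, which is why the metric should match the goal of the prediction task.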
Output Settings:
Select the output format of the captured workflow created by the Component. By "features" we mean the columns selected by the user in the Component configuration under "Feature Column Selection". By "prepared" we mean features converted from their raw format to the format required by the model or the user. Any extra column not recognized as a feature, such as an additional label or identifier, can still be provided to the captured workflow and is kept in its output regardless of what you select here.

Input Ports

This node has no input ports

Output Ports

This node has no output ports
