Parameter Optimization Loop Start

This node starts a parameter optimization loop. In the dialog you can enter several parameters, each with an interval and a step size. The loop varies these parameters following the selected search strategy. Each parameter is output as a flow variable. The parameters can then be used inside the loop body, either directly or by converting them into a data table with a Variable to Table node.
Currently four search strategies are available:

  • Brute Force: All possible parameter combinations (given the intervals and the step sizes) are checked and the best one is returned.
  • Hillclimbing: A random start combination is created and its direct neighbors (respecting the given intervals and step sizes) are evaluated. The best combination among the neighbors is the start point for the next iteration. If no neighbor improves the objective function, the loop terminates (see the sketch after this list).
  • Random Search: Parameter combinations are randomly chosen and evaluated. The specified start and stop values define the parameter space from which a parameter combination is randomly drawn. Additionally, an optional step size can be defined to restrict the possible parameter values. The loop terminates after a specified number of iterations or, if early stopping is activated, when the objective value has not improved for a specified number of rounds. Note that combinations are drawn with replacement: while duplicate parameter combinations are processed only once, each of them still counts as an iteration. Consequently, fewer loop iterations may actually be processed than defined.
  • Bayesian Optimization (TPE): This strategy consists of two phases. The first is a warm-up phase in which parameter combinations are randomly chosen and evaluated. Based on the scores of the warm-up rounds, the second phase tries to find promising parameter combinations, which are then evaluated. The strategy is based on an algorithm published by Bergstra et al. (see the link further down) and uses Tree-structured Parzen Estimation (TPE) in the second phase to find good parameter combinations. The specified start and stop values define the parameter space from which a parameter combination is randomly drawn. Additionally, an optional step size can be defined to restrict the possible parameter values. The loop terminates after a specified number of iterations. Note that combinations are drawn with replacement: while duplicate parameter combinations are processed only once, each of them still counts as an iteration. Consequently, fewer loop iterations may actually be processed than defined.
For details on the Bayesian optimization strategy, see Algorithms for Hyper-Parameter Optimization by Bergstra et al.
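The following is a minimal Python sketch of the Hillclimbing strategy, not the node's actual implementation. It assumes one inclusive [start, stop] interval and one positive step size per parameter; the score function is a hypothetical stand-in for one pass through the loop body, with higher scores taken to be better:

    import random

    def hill_climb(bounds, steps, score, seed=None):
        rng = random.Random(seed)
        # Random start combination, snapped to the step grid of each interval.
        current = tuple(lo + steps[i] * rng.randint(0, int((hi - lo) / steps[i]))
                        for i, (lo, hi) in enumerate(bounds))
        best = score(current)
        while True:
            # Direct neighbors: each parameter moved one step up or down,
            # clamped to its inclusive [start, stop] interval.
            neighbors = []
            for i, (lo, hi) in enumerate(bounds):
                for delta in (-steps[i], steps[i]):
                    cand = list(current)
                    cand[i] = min(max(cand[i] + delta, lo), hi)
                    neighbors.append(tuple(cand))
            scored = [(score(n), n) for n in neighbors]
            top_score, top = max(scored)
            if top_score <= best:        # no neighbor improves the objective
                return current, best     # -> the loop terminates
            current, best = top, top_score

For example, hill_climb([(0.0, 1.0)], [0.1], lambda p: -(p[0] - 0.3) ** 2, seed=42) climbs toward 0.3 on a 0.1-spaced grid.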

Options

Parameter name
The name of the parameter.
Start value
The interval start value (inclusive).
Stop value
The interval end value (inclusive).
Step size
The step size by which the value is increased after each iteration. Negative step sizes are possible if the start value is greater than the stop value.
Integer?
Check this if the parameter should be an integer. Otherwise it is a real number.
Search strategy
Select the search strategy that should be used (see above).
Use random seed
Check this option and supply a seed if the random number generator should use a fixed seed (only relevant for search strategies that make use of random values). Defining a seed makes the results reproducible. If this option is not selected, a new random seed is chosen each time the loop is re-run.
Enable step size
If enabled, a step size can be used to define a grid from which parameters are randomly sampled (see the sketch below).
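As an illustration only (not the node's actual implementation), drawing a value from such a grid can be sketched in Python as:

    import random

    def sample_from_grid(start, stop, step, rng=random):
        # Number of grid points in the inclusive interval [start, stop].
        n_points = int((stop - start) / step) + 1
        return start + step * rng.randrange(n_points)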
Max. number of iterations
Define the maximum number of iterations. The output table may contain fewer rows than the specified number if the same combination of parameters is randomly generated multiple times; each combination appears only once in the output table. Furthermore, if early stopping is enabled, fewer iterations than specified may be performed. The sketch below illustrates the counting behavior.
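A Python sketch of this counting behavior, with hypothetical bounds, steps, and score arguments and higher scores assumed better:

    import random

    def random_search(bounds, steps, score, max_iter, seed=None):
        rng = random.Random(seed)
        evaluated = {}
        for _ in range(max_iter):
            # Draw with replacement from the step grid of each interval.
            combo = tuple(lo + steps[i] * rng.randint(0, int((hi - lo) / steps[i]))
                          for i, (lo, hi) in enumerate(bounds))
            if combo not in evaluated:   # duplicates are evaluated only once...
                evaluated[combo] = score(combo)
            # ...but every draw consumes one iteration regardless.
        return max(evaluated, key=evaluated.get)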
Early stopping
Check this if the search should stop early when the objective value does not improve for a specified number of rounds. This check is based on a moving average whose window size equals the specified number of rounds. If the ratio of improvement falls below a specified tolerance, the search stops (see the sketch after the Tolerance option).
Number of rounds
The number of rounds used for early stopping. It defines after how many trials without improvement (or with less improvement than the specified tolerance) the search stops, and it also sets the size of the moving window.
Tolerance
The tolerance used for early stopping, which defines the threshold for the ratio of improvement. If the ratio is lower than the threshold, the search stops.
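One possible reading of this rule, sketched in Python: the moving average of the last rounds is compared against the window before it, and the search stops when the relative improvement falls below the tolerance. The node's exact formula may differ.

    def should_stop(scores, n_rounds, tolerance):
        # Not enough history for two full moving windows yet.
        if len(scores) < 2 * n_rounds:
            return False
        recent = sum(scores[-n_rounds:]) / n_rounds
        earlier = sum(scores[-2 * n_rounds:-n_rounds]) / n_rounds
        if earlier == 0:
            return False  # avoid division by zero in the ratio
        improvement_ratio = (recent - earlier) / abs(earlier)
        return improvement_ratio < tolerance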
Number of warm-up rounds
The number of warm-up rounds in which parameter combinations are randomly chosen and evaluated. After these rounds, the actual Bayesian optimization starts, in which new parameter combinations are chosen based on the past scores. If a duplicate parameter combination is drawn, an additional round is made, i.e., if 20 warm-up rounds are specified, the first 20 rows in the output will belong to the warm-up. However, a duplicate parameter combination during the warm-up phase still counts toward the maximum number of iterations (see the sketch below).
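A sketch of this counting rule in Python, where draw and score are hypothetical stand-ins for grid sampling and one pass through the loop body:

    def warm_up(draw, score, n_warmup, max_iter):
        seen, iterations = {}, 0
        while len(seen) < n_warmup and iterations < max_iter:
            combo = draw()          # hypothetical random draw from the grid
            iterations += 1         # a duplicate still consumes the budget...
            if combo not in seen:   # ...but triggers an extra warm-up round
                seen[combo] = score(combo)
        return seen, iterations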
Gamma
The value of gamma is used by the TPE to divide the already evaluated parameter combinations into a good and a bad group based on their score. It defines the fraction of parameter combinations that go to the good group and is often chosen to be in [0.15, 0.30]. A gamma value of 0.25 means that the best 25% of parameter combinations belong to the good distribution and the rest to the bad one. For each of these groups a probability density function is built using Parzen-window density estimation.
See the paper or one of many blogs for more information.
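A minimal Python sketch of the split, assuming one numeric parameter and that higher scores are better; scipy's gaussian_kde stands in for the Parzen-window estimate:

    import numpy as np
    from scipy.stats import gaussian_kde

    def split_and_estimate(values, scores, gamma=0.25):
        # Rank the evaluated parameter values by score, best first.
        order = np.argsort(scores)[::-1]
        n_good = max(2, int(gamma * len(values)))  # each KDE needs >= 2 points
        ranked = np.asarray(values, dtype=float)[order]
        good, bad = ranked[:n_good], ranked[n_good:]
        # One Parzen-window (kernel) density estimate per group; note the
        # bad group must also contain at least two distinct points.
        return gaussian_kde(good), gaussian_kde(bad)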
Number of candidates per round
The TPE tries to find the next parameter combination by maximizing the expected improvement. This is done by randomly drawing candidates from the parameter space. The number of candidates per round defines how many candidates are drawn. To maximize the expected improvement, the probability of each candidate under the good distribution divided by its probability under the bad distribution is maximized (the distributions are those described for the Gamma option). The candidate with the highest expected improvement becomes the next parameter combination to evaluate.
The higher the number of candidates, the more the algorithm exploits regions of the parameter space that already look good; a lower number leads to more exploration.
See the paper or one of many blogs for more information.
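A Python sketch of one selection round, with hypothetical evaluated values and a [0, 1] interval, using the density ratio as the proxy for expected improvement:

    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(42)

    # Hypothetical densities over already evaluated values of one parameter:
    good_kde = gaussian_kde([0.21, 0.25, 0.27])     # the gamma-fraction best
    bad_kde = gaussian_kde([0.05, 0.55, 0.8, 0.9])  # the rest

    # Draw candidates uniformly from the interval and keep the one with the
    # highest ratio of good to bad density.
    candidates = rng.uniform(0.0, 1.0, 50)
    ratio = good_kde(candidates) / np.maximum(bad_kde(candidates), 1e-12)
    next_combination = candidates[np.argmax(ratio)]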

Input Ports

This node has no input ports.

Output Ports

A parameter combination as flow variables

Views

This node has no views.
