Multiobjective Subset Selection (NSGA-II)

This node finds (near-)optimal fixed-size subsets of rows based on one or more criteria. It uses the NSGA-II algorithm to find an approximation of the set of non-dominated solutions, i.e. the Pareto front.
In the dialog you can choose which columns from the input table should be optimized and in which way. Each column represents an objective which, together with a function on that column (such as the sum of all values in the selected set or the average distance of all values), can be either minimized or maximized. By default each objective is maximized, so to minimize an objective, negate it.
The node runs until a certain number of individuals have been evaluated. You can also stop the search manually in the node's view.
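To make the notion of non-dominated solutions concrete, here is a minimal Python sketch. It is not the node's implementation; the function names `dominates` and `non_dominated` and the sample values are illustrative assumptions. Every objective is treated as maximized, so an objective that should be minimized (here a hypothetical cost) is entered with its sign flipped.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    All objectives are assumed to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective values per candidate subset: (sum of activity, -cost).
# Cost should be minimized, so it is negated to fit the maximize-everything convention.
candidates = [(10, -5), (8, -3), (7, -6), (10, -7)]
front = non_dominated(candidates)
print(front)
```

In this example `(7, -6)` and `(10, -7)` are both dominated by `(10, -5)`, so only two trade-off solutions remain.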


Basic settings

Number of rows
The number of rows each solution should contain.
Output nondominated solutions only
If this option is checked, only nondominated solutions (the Pareto front approximation) are output. Otherwise all examined solutions are output (which can be quite a few).
Enable hiliting
If selected, solutions in the output table can be hilited, and the rows they contain are then hilited in the input table as well.
Compute hypervolume
Enables the computation of the hypervolume enclosed by the Pareto front approximations. You need to provide a reference point that is dominated by all solutions. Enter the reference point in the text field, with its coordinates separated by spaces. The number and order of the coordinates must match the number and order of the objectives. Hypervolume computation is very expensive when more than two objectives are used, so an approximation algorithm is used.
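For two maximized objectives the hypervolume can be computed exactly with a simple sweep. The sketch below illustrates the idea under that assumption. It is not the node's code or its approximation algorithm, and `parse_reference_point` and `hypervolume_2d` are hypothetical names. The reference point is read from a space-separated string, as in the text field described above.

```python
from typing import Iterable, Tuple


def parse_reference_point(text: str) -> Tuple[float, ...]:
    """Parse space-separated coordinates, e.g. "0 0", in objective order."""
    return tuple(float(value) for value in text.split())


def hypervolume_2d(front: Iterable[Tuple[float, float]],
                   reference: Tuple[float, float]) -> float:
    """Area dominated by the front and bounded by the reference point.

    Assumes both objectives are maximized and that the reference point is
    dominated by every solution in the front.
    """
    ref_x, ref_y = reference
    area = 0.0
    best_y = ref_y
    # Sweep from the largest first objective downwards; each point adds the
    # horizontal strip between the best second objective seen so far and its own.
    for x, y in sorted(front, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            area += (x - ref_x) * (y - best_y)
            best_y = y
    return area


reference = parse_reference_point("0 0")
volume = hypervolume_2d([(4, 1), (3, 2), (1, 3)], reference)
print(volume)  # 8.0
```

A larger hypervolume means the front covers more of the objective space relative to the reference point, which is why its evolution is a useful progress indicator during the search.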


Column List
This list shows all available columns from the input table (and some static functions). The columns can be used together with a function to create an expression in the "Expression" field.
Flow Variable List
This list shows all available flow variables. They can be used similarly to column names or as constants in an expression.
This list shows all available functions. A description of a function appears on the right once it is selected. Note that certain functions only work with certain column types, e.g. distance matrices or numeric columns.
In this field you can build an arithmetic expression that is used as one objective function.

GA settings

Population size
The number of individuals in each population evolved by the GA.
Mutation probability
The probability that a newly generated solution is mutated.
Maximum individuals created
The maximum number of individuals that will be evaluated before the node stops automatically.
Gene representation
Select a gene representation for subsets here.
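As a rough illustration of how the mutation probability interacts with fixed-size subsets, the following Python sketch mutates a subset by swapping one selected row for an unselected one, so the subset size never changes. The representation (a sorted tuple of row indices) and the function `maybe_mutate` are assumptions made for this example. The node's actual gene representations and operators may differ.

```python
import random
from typing import Tuple


def maybe_mutate(subset: Tuple[int, ...], n_rows: int,
                 mutation_probability: float, rng: random.Random) -> Tuple[int, ...]:
    """With the given probability, swap one selected row index for an unselected one."""
    if rng.random() >= mutation_probability:
        return tuple(sorted(subset))  # no mutation this time
    selected = set(subset)
    unselected = [i for i in range(n_rows) if i not in selected]
    if not unselected:
        return tuple(sorted(selected))  # every row is already selected
    selected.remove(rng.choice(sorted(selected)))
    selected.add(rng.choice(unselected))
    return tuple(sorted(selected))


rng = random.Random(1)  # fixed seed so the example is repeatable
parent = (0, 1, 2)
child = maybe_mutate(parent, n_rows=10, mutation_probability=1.0, rng=rng)
print(parent, child)
```

Because the swap removes exactly one index and adds exactly one, the "Number of rows" constraint holds for every individual without any repair step.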

Input Ports

Data table with all rows to choose from when evaluating subsets.

Output Ports

A list of non-dominated solutions found during the search, together with their objective values and the rows from the input table that are contained in the solution.
This table contains the evolution of the hypervolume of the Pareto front.
A list of non-dominated solutions found during the search, together with their objective values and the rows from the input table that are contained in the solution. This model can be used with the Rowset Filter node to filter out certain rows from the input table.


Pareto front
Shows all non-dominated solutions found so far. This view is updated during the search. The search can be stopped from the view, in which case the solutions found so far are available at the output port. If hypervolume computation is enabled, the second tab shows the evolution of the hypervolume.



