Score Erosion

This node uses the Score Erosion algorithm to select subsets of items (rows) that
  • have a high overall score, and
  • are as diverse as possible.
It is an iterative process: it first selects the item with the highest score, reduces the scores of the remaining items based on their distance to the selected item, then selects the item that now has the highest score, and so on until the requested number of items has been selected. The erosion factor controls whether activity (high scores) is preferred over diversity or the other way around. Details about the algorithm are available in:

Maximum-Score Diversity Selection for Early Drug Discovery, Journal of Chemical Information and Modeling, vol. 51, no. 2, pp. 237-247, 2011; DOI: 10.1021/ci100426r.
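
The iterative process described above can be sketched in a few lines of Python. This is a minimal illustration, not the node's actual implementation: the function name score_erosion, the distance callback, and the exact update rule are assumptions. The sketch uses a product-style update (each remaining score is multiplied by the distance to the item just selected) and assumes that distances lie in [0, 1], that scores are non-negative, and that high scores are preferred. Raising the distance to the power of the erosion factor is likewise an assumption, chosen so that a factor of 0 disables erosion and larger factors penalize close neighbors more strongly.

```python
from typing import Callable, List, Sequence


def score_erosion(
    scores: Sequence[float],
    distance: Callable[[int, int], float],
    k: int,
    erosion_factor: float = 1.0,
) -> List[int]:
    """Greedy sketch: pick the best item, erode the rest, repeat.

    Hypothetical helper, not the node's code. Assumes distances in [0, 1],
    non-negative scores, and that a high score is preferred.
    """
    eroded = list(scores)                 # working copy; input stays untouched
    remaining = set(range(len(scores)))   # indices not selected yet
    selected: List[int] = []

    while remaining and len(selected) < k:
        # Highest eroded score wins; ties go to the lowest index.
        best = max(remaining, key=lambda i: (eroded[i], -i))
        selected.append(best)
        remaining.remove(best)

        # Assumed product-style update: items close to the selected one
        # (small distance) lose most of their score.
        for i in remaining:
            eroded[i] *= distance(i, best) ** erosion_factor

    return selected


# Toy data: four items on a line, two clusters; the left cluster scores higher.
positions = [0.0, 0.1, 0.9, 1.0]
scores = [1.0, 0.9, 0.5, 0.4]


def dist(i: int, j: int) -> float:
    return abs(positions[i] - positions[j])


top_by_score = score_erosion(scores, dist, k=2, erosion_factor=0.0)
diverse = score_erosion(scores, dist, k=2, erosion_factor=1.0)
print(top_by_score)  # [0, 1]
print(diverse)       # [0, 2]
```

With an erosion factor of 0 the sketch simply returns the two highest-scoring items, which sit next to each other. With a factor of 1, the second pick moves to the other cluster because the close neighbor's score has been eroded to almost nothing.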

An example of how to use this node can be found on the EXAMPLES server.

Options

Number of rows to select
Enter the number of rows to select (the subset size).
Score column
Select the column containing the scores, and specify whether a low or a high score is preferred.
Distance column
Select the column containing the distances between the items here.
Erosion factor
Select a value for the erosion factor here. High values favor diverse subsets, low values favor more active subsets.
Score update mode
The difference mode subtracts the distance to the selected item from all scores, whereas the product mode multiplies the scores by the distance.

Input Ports

The input table, containing at least one numeric column with scores for each row, and one distance column.

Output Ports

A table containing the selected rows together with their eroded scores.
A table containing information about the overall activity and diversity of the selected subset in each internal iteration.

Views

This node has no views.
