RDKit Diversity Picker

RDKit Nodes for Knime version 3.4.0.v201807311105 by NIBR

Picks diverse rows from an input table based on the Tanimoto distance between fingerprints. The picking is done using the MaxMin algorithm (Ashton, M. et al., Quant. Struct.-Act. Relat., 21 (2002), 598-604). The algorithm is quite fast, even for large datasets, but note that runtime increases rapidly with the number of rows to be picked.
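
To illustrate the idea, here is a minimal pure-Python sketch of MaxMin-style picking with Tanimoto distance. It is not the RDKit implementation used by the node: the function names `tanimoto_distance` and `maxmin_pick`, the toy fingerprints (sets of on-bit indices), and the random first pick are assumptions made for illustration, and the selection loop is deliberately naive.

```python
import random


def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two fingerprints given as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def maxmin_pick(fps, n_pick, seed=0):
    """Naive MaxMin: repeatedly add the row whose nearest picked row is farthest away."""
    if n_pick <= 0 or not fps:
        return []
    rng = random.Random(seed)
    picked = [rng.randrange(len(fps))]  # assumption: the first pick is random
    # min_dist[i] = distance from row i to its closest already-picked row
    min_dist = [tanimoto_distance(fp, fps[picked[0]]) for fp in fps]
    while len(picked) < min(n_pick, len(fps)):
        remaining = [i for i in range(len(fps)) if i not in picked]
        nxt = max(remaining, key=lambda i: min_dist[i])
        picked.append(nxt)
        # Only distances to the newly picked row can lower the minimum.
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return picked


# Hypothetical toy data: rows 0, 1 and 3 are similar; rows 2 and 4 are outliers.
fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {20, 21}]
picks = maxmin_pick(fps, 3, seed=0)
print(picks)
```

In this sketch every additional pick requires another pass over all rows, which gives a feel for why the runtime grows with the number of rows to be picked.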

Options

Molecule or fingerprint column (table 1)
The column containing the molecules or fingerprints to pick from. If a molecule column is selected, fingerprints are calculated automatically as Morgan fingerprints with radius 2 and a length of 2048 bits.
Molecule or fingerprint column to bias away from (table 2)
The column containing the molecules or fingerprints to bias away from. This option seeds the diversity pick: the molecules selected will be diverse with respect to these biasing molecules as well as to each other. If molecules are provided as input, their fingerprints are calculated automatically using the fingerprint settings of table 1. If table 1 contains fingerprints with unknown settings, this calculation will fail. In that case, either regenerate the fingerprints in table 1 with the RDKit Fingerprint node or select a compatible fingerprint column in table 2 instead of a molecule column.
Number to pick
Number of diverse rows to pick.
Random seed
Random number seed to use.
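
The effect of the "bias away from" option can be sketched in pure Python as follows. This is only an illustration of the seeding idea, not the node's implementation: `tanimoto_distance`, `maxmin_pick_biased`, and the toy fingerprints (sets of on-bit indices) are hypothetical, and ties are broken by taking the first candidate.

```python
def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two fingerprints given as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def maxmin_pick_biased(fps, bias_fps, n_pick):
    """Pick rows of fps that are far from bias_fps and far from each other."""
    # Seed the bookkeeping with each row's distance to its closest biasing fingerprint.
    min_dist = [
        min((tanimoto_distance(fp, b) for b in bias_fps), default=float("inf"))
        for fp in fps
    ]
    picked = []
    while len(picked) < min(n_pick, len(fps)):
        remaining = [i for i in range(len(fps)) if i not in picked]
        nxt = max(remaining, key=lambda i: min_dist[i])  # first candidate wins ties
        picked.append(nxt)
        # Rows close to the new pick become less attractive for later picks.
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return picked


# Hypothetical toy data: table 1 rows, and one table 2 fingerprint close to row 2.
fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {20, 21}]
bias_fps = [{7, 8, 9, 10}]
picks = maxmin_pick_biased(fps, bias_fps, 2)
print(picks)  # [0, 3]
```

Row 2 is skipped here because it is close to the biasing fingerprint, even though it is far from every other row in the first table.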

Input Ports

Table with either molecules or fingerprints for diversity picking
Table with either molecules or fingerprints to bias away from

Output Ports

The results of the diversity pick

Update Site

To use this node in KNIME, install RDKit Nodes for Knime from the following update site:
