Fast Tanimoto

Compares molecule fingerprints and lists the identifiers of molecules whose Tanimoto scores exceed a preset threshold. The node is optimized for processing larger sets of molecules (tested with 50,000 molecules; processing time is under 5 minutes on a 3 GHz Core i5). A molecule is never compared to itself.

Options

Molecule Identifier
String identifier of the molecule
Molecule Fingerprints
Fingerprint BitVector column as generated by the CDK/Fingerprint or RDKit/Fingerprint nodes
Tanimoto Threshold
Tanimoto similarity cutoff, range 0-1. The score is the fraction of set bits shared by both molecule fingerprints. Only molecules with a score above the threshold are listed in the output columns
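
As a rough illustration of the threshold option, the sketch below computes the Tanimoto coefficient in its usual form (bits set in both fingerprints divided by bits set in either) and applies a strict greater-than check. It is not the node's implementation: the function names, the use of Python sets of on-bit positions, and the score of 0 for two empty fingerprints are assumptions made for the example.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = len(bits_a | bits_b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as dissimilar.
        return 0.0
    return len(bits_a & bits_b) / union


def passes_threshold(bits_a, bits_b, threshold):
    """True only if the score is strictly above the threshold (hypothetical helper)."""
    return tanimoto(bits_a, bits_b) > threshold
```

With a strict comparison, a pair scoring exactly the threshold value is not listed.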

Input Ports

Table with string columns for molecule identifier and a BitVector fingerprint column

Output Ports

Input table with four appended columns: tanimoto.id (a copy of the string identifier of the molecule being compared), tanimoto.similars (list of identifiers of molecules with a Tanimoto similarity higher than the threshold), tanimoto.coefficients (the corresponding Tanimoto similarities of those molecules), and tanimoto.num.similars (the number of similar molecules found)
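
To make the shape of the output columns concrete, here is a minimal sketch that builds one row per molecule with the four tanimoto.* fields. It uses a plain all-pairs loop, not the node's optimized algorithm, and it assumes fingerprints are held as Python integers used as bit masks; the function name fast_tanimoto_rows is hypothetical.

```python
def fast_tanimoto_rows(ids, fingerprints, threshold):
    """Return one dict per molecule holding the four tanimoto.* output fields."""
    rows = []
    for i, (id_i, fp_i) in enumerate(zip(ids, fingerprints)):
        similars, coefficients = [], []
        for j, (id_j, fp_j) in enumerate(zip(ids, fingerprints)):
            if i == j:
                continue  # a molecule is never compared to itself
            union = bin(fp_i | fp_j).count("1")
            # Assumption: two empty fingerprints score 0.
            score = bin(fp_i & fp_j).count("1") / union if union else 0.0
            if score > threshold:
                similars.append(id_j)
                coefficients.append(score)
        rows.append({
            "tanimoto.id": id_i,
            "tanimoto.similars": similars,
            "tanimoto.coefficients": coefficients,
            "tanimoto.num.similars": len(similars),
        })
    return rows


rows = fast_tanimoto_rows(["m1", "m2", "m3"], [0b1110, 0b0111, 0b0001], 0.3)
```

In this toy input, m1 and m2 share two of four set bits (score 0.5), while m1 and m3 share none, so m3 does not appear in the tanimoto.similars list for m1.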

Views

This node has no views
