Fast Tanimoto

EU-OPENSCREEN-Node extension for KNIME Workbench version 1.2.0.v202006241503 by Martin Neuenschwander

Compares molecule fingerprints and, for each molecule, lists the identifiers of all molecules whose Tanimoto score is above a preset threshold. The node is optimized for processing larger sets of molecules (tested with 50,000 molecules; processing time is less than 5 minutes on a 3 GHz Core i5). A molecule is never compared with itself.


Molecule Identifier
String identifier of the molecule
Molecule Fingerprints
Fingerprint BitVector column as generated by the CDK/Fingerprint or RDKit/Fingerprint nodes
Tanimoto Threshold
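Minimum Tanimoto similarity, in the range 0 to 1. The Tanimoto score is the number of bits set in both molecule fingerprints divided by the number of bits set in at least one of them. Only molecules with a score above the threshold will be listed in the output columns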
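
As an illustration of how the score behind the threshold is usually computed, here is a minimal Python sketch of the standard Tanimoto coefficient on bit fingerprints. It is not the node's own code: the function name tanimoto, the use of plain Python integers as bit vectors, and the choice to score two empty fingerprints as 0 are all assumptions made for this example.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints held as Python ints.

    Bit i of the integer is 1 if bit i of the fingerprint is set.
    """
    shared = bin(a & b).count("1")   # bits set in both fingerprints
    union = bin(a | b).count("1")    # bits set in at least one fingerprint
    if union == 0:
        return 0.0                   # assumption: two empty fingerprints score 0
    return shared / union


# One shared bit out of three set bits in total.
score = tanimoto(0b1100, 0b0110)
print(score > 0.3)   # True: this pair would pass a threshold of 0.3
print(score > 0.5)   # False: it would be dropped at a threshold of 0.5
```

Because the comparison is strict, a pair whose score equals the threshold exactly is not listed.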

Input Ports

Table with a string column for the molecule identifier and a BitVector fingerprint column

Output Ports

Input table with four columns appended: tanimoto.id (a copy of the string identifier of the molecule being compared), tanimoto.similars (list of identifiers of molecules with a Tanimoto similarity higher than the threshold), tanimoto.coefficients (the corresponding Tanimoto similarities of those molecules), tanimoto.num.similars (the number of similar molecules found)
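
The shape of the output can be sketched with a naive all-against-all comparison in Python. This is only an illustration under stated assumptions: the node's optimized implementation is not described here and is certainly not this quadratic loop, and the names similar_molecules and tanimoto, the integer bit-vector representation, and the example data are hypothetical. Only the four column names are taken from the description above.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints held as Python ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def similar_molecules(ids, fingerprints, threshold):
    """Return one dict per input row, keyed by the node's output column names."""
    rows = []
    for i, (mol_id, fp) in enumerate(zip(ids, fingerprints)):
        similars, coefficients = [], []
        for j, (other_id, other_fp) in enumerate(zip(ids, fingerprints)):
            if i == j:
                continue  # a molecule is never compared with itself
            score = tanimoto(fp, other_fp)
            if score > threshold:  # strictly above the threshold
                similars.append(other_id)
                coefficients.append(score)
        rows.append({
            "tanimoto.id": mol_id,
            "tanimoto.similars": similars,
            "tanimoto.coefficients": coefficients,
            "tanimoto.num.similars": len(similars),
        })
    return rows


# Hypothetical three-molecule table with 4-bit fingerprints.
rows = similar_molecules(["A", "B", "C"], [0b1110, 0b0110, 0b0001], 0.5)
for row in rows:
    print(row["tanimoto.id"], row["tanimoto.similars"], row["tanimoto.num.similars"])
```

In this example A and B share two of three set bits and so list each other, while C has no bit in common with either and gets an empty list with a count of 0.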

To use this node in KNIME, install HTS Data Mining from the following update site:


A zipped version of the software site can be downloaded here.

