Generating chemical fingerprints

This short workflow demonstrates how to generate chemical fingerprints using different methods. It uses nodes from the RDKit and the Vernalis extensions, both of which are freely available. Both methods create hashed fingerprints. The resulting cardinality (how often 1 occurs in the bit string) can be used to calculate the fingerprint darkness. Increasing the bit string length reduces darkness.


This workflow snippet demonstrates the creation of chemical fingerprints from input molecules. Many nodes from different (license-bound) extensions available in KNIME can generate fingerprints, but here we use nodes from the RDKit and the Vernalis extensions, both of which are freely available. The Cardinality node is part of the Vernalis extension.

There are several methods available to generate different kinds of fingerprints. The number of bits that can be 1 is defined by the method used and can differ between methods. Both fingerprint types used here (Morgan and RDKit) are hashed fingerprints. You can find links with more information on these methods in the workflow description (right side).

The resulting Cardinality columns contain the number of occurrences of 1 in each fingerprint bit string. Note that these differ for each method. The higher the percentage of 1s in the bit string (i.e. how often 1 occurs in relation to the bit length), the "darker" the fingerprint. Increasing the bit length (using the same method) decreases fingerprint darkness, and therefore the probability of bit collisions. Bit collisions occur when the same bit is set by multiple patterns (i.e. substructures within the molecule).

Workflow steps:

Table Reader: Read in a file containing molecules in different formats.

RDKit Fingerprint (RDKit 1024 bits, RDKit 2048 bits, Morgan 1024 bits): Calculate fingerprints using different algorithms and numbers of bits.

Cardinality (one node per fingerprint): Inspect the different results. The appended cardinality column indicates how often "1" occurs within the bit string.
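The relationship between cardinality, darkness, and bit collisions described above can be illustrated with a small, library-free sketch. This is not how the RDKit Fingerprint or Cardinality nodes are implemented: the pattern strings are hypothetical placeholders for substructures, and SHA-256 with a modulo is only a stand-in for whatever hashing the real fingerprint methods use. The function names are also made up for this illustration.

```python
import hashlib


def bit_position(pattern: str, n_bits: int) -> int:
    """Map a pattern to a bit index in [0, n_bits).

    Assumption: SHA-256 plus modulo stands in for the real hashing scheme.
    """
    digest = hashlib.sha256(pattern.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def hashed_fingerprint(patterns, n_bits):
    """Return a list of 0/1 values; each distinct pattern sets one bit."""
    bits = [0] * n_bits
    for pattern in set(patterns):
        bits[bit_position(pattern, n_bits)] = 1
    return bits


def cardinality(bits):
    """Number of bits that are set to 1."""
    return sum(bits)


def darkness(bits):
    """Fraction of bits set to 1 (cardinality relative to bit length)."""
    return cardinality(bits) / len(bits)


def collisions(patterns, n_bits):
    """Number of distinct patterns that landed on an already-set bit."""
    fingerprint = hashed_fingerprint(patterns, n_bits)
    return len(set(patterns)) - cardinality(fingerprint)


# Hypothetical "substructure" labels standing in for one molecule's patterns.
patterns = [f"pattern-{i}" for i in range(200)]

for n_bits in (1024, 2048):
    fp = hashed_fingerprint(patterns, n_bits)
    print(
        f"{n_bits} bits: cardinality={cardinality(fp)}, "
        f"darkness={darkness(fp):.3f}, collisions={collisions(patterns, n_bits)}"
    )
```

With the same set of patterns, the longer bit string can never be darker than the shorter one in this sketch, and it can never have more collisions, which mirrors the effect the workflow shows when comparing the 1024-bit and 2048-bit RDKit fingerprints.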
