TeachOpenCADD_Workflow4_Similarity_search

TeachOpenCADD Workflow 4: Ligand-based screening: Compound similarity

In virtual screening (VS), compounds similar to known ligands of the target under investigation often serve as the starting point for drug development. This approach follows the similar property principle, which states that structurally similar compounds are more likely to exhibit similar biological activities (exceptions are so-called activity cliffs). For computational representation and processing, compound properties can be encoded as bit arrays, so-called molecular fingerprints, e.g. MACCS and Morgan fingerprints. Compound similarity can then be quantified with measures such as the Tanimoto and Dice similarity.
This workflow shows how to use these encodings and comparison methods. Here, VS is carried out as a similarity search.
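
To make the two similarity measures concrete, here is a minimal Python sketch outside of KNIME. It assumes a fingerprint is represented simply as the set of its on-bit positions; the bit positions in `query` and `candidate` are made up for illustration and do not correspond to real MACCS or Morgan bits.

```python
from typing import AbstractSet


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Shared on-bits divided by the on-bits present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Twice the shared on-bits divided by the summed on-bit counts."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical fingerprints, given as sets of on-bit positions.
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}

print(tanimoto(query, candidate))  # 3 shared bits / 6 bits in the union = 0.5
print(dice(query, candidate))      # 2 * 3 shared bits / (4 + 5) bits = 0.666...
```

For the same pair of fingerprints, the Dice value is never lower than the Tanimoto value, so the two measures rank compounds similarly but on different scales.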

Step 1
Calculate fingerprints for the dataset and the query compound

Step 2
Similarity search of the query compound (example: gefitinib) against the full dataset using Tanimoto/Dice similarity

Step 3
Evaluate performance with enrichment plots (split the dataset into active and inactive compounds at pIC50 = 6.3)
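
The search and evaluation steps can be sketched in plain Python as follows. This is an illustration, not the workflow's implementation: the compound names, on-bit sets, and pIC50 values are hypothetical toy data, and treating compounds with pIC50 at or above 6.3 as active is an assumption about how the split is applied.

```python
from typing import AbstractSet, Dict, List, Tuple

ACTIVITY_CUTOFF = 6.3  # pIC50 threshold named in the workflow


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Shared on-bits divided by the on-bits present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def enrichment_curve(
    ranked_pic50: List[float], cutoff: float = ACTIVITY_CUTOFF
) -> List[Tuple[float, float]]:
    """Return (fraction of dataset screened, fraction of actives found) points."""
    n_total = len(ranked_pic50)
    n_actives = sum(p >= cutoff for p in ranked_pic50)
    points = [(0.0, 0.0)]
    found = 0
    for rank, pic50 in enumerate(ranked_pic50, start=1):
        if pic50 >= cutoff:  # assumption: at or above the cutoff counts as active
            found += 1
        points.append((rank / n_total, found / n_actives if n_actives else 0.0))
    return points


# Hypothetical toy dataset: name -> (on-bit positions, pIC50).
dataset: Dict[str, Tuple[AbstractSet[int], float]] = {
    "cpd_a": ({1, 4, 7, 9}, 8.1),
    "cpd_b": ({1, 4, 8, 9, 12}, 6.9),
    "cpd_c": ({2, 3, 5}, 5.0),
    "cpd_d": ({1, 7, 10, 11}, 5.8),
}
query_bits = {1, 4, 7, 9}  # stands in for the query compound's fingerprint

# Step 2: rank the dataset by decreasing similarity to the query.
ranked = sorted(
    dataset, key=lambda name: tanimoto(query_bits, dataset[name][0]), reverse=True
)

# Step 3: follow the ranking and record how quickly the actives are recovered.
curve = enrichment_curve([dataset[name][1] for name in ranked])

print(ranked)
print(curve)
```

Plotting the second coordinate of each point against the first gives the enrichment plot. A curve that rises steeply at the start means the actives are concentrated among the top-ranked compounds.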

MACCS fingerprints

Morgan fingerprints

This workflow is part of the TeachOpenCADD pipeline:
https://hub.knime.com/volkamerlab/space/TeachOpenCADD

Read more on the theoretical background of this workflow on our TeachOpenCADD platform:
https://projects.volkamerlab.org/teachopencadd/talktorials/T004_compound_similarity.html

[Workflow diagram. Nodes: CSV Reader (list of compounds), Smiles Reader, Molecule Type Cast, RDKit From Molecule, RDKit Fingerprint (dataset and query), Similarity Search (Tanimoto and Dice similarity), Similarity Matrix (from Molecules), Joiner, Column Renamer, Scatter plot (similarity to query), Enrichment Plotter (legacy) (Tanimoto and Dice similarity). Query compound: gefitinib.]
