
TeachOpenCADD

TeachOpenCADD - a teaching platform for computer-aided drug design using KNIME

The TeachOpenCADD KNIME (v1.0.5) pipeline consists of eight interconnected workflows (W1-W8), each covering one topic in computer-aided drug design.
The pipeline is illustrated using the epidermal growth factor receptor (EGFR), but can easily be applied to other targets of interest. Topics include how to fetch, filter and analyze compound data associated with a query target.





Workflow overview (canvas annotations)

TeachOpenCADD: A teaching platform for computer-aided drug design using KNIME
Dominique Sydow, Michele Wichmann, Jaime Rodríguez-Guerra, Daria Goldmann, Gregory Landrum, and Andrea Volkamer

1. Data acquisition from ChEMBL: dataset filtered and formatted by bioactivity and SMILES.
2. Molecular filtering: ADME and lead-likeness criteria: dataset filtered by Lipinski's rule of five, with physicochemical properties.
3. Molecular filtering: unwanted substructures: dataset split into compounds with and without PAINS/Brenk substructures.
4. Ligand-based screening: compound similarity: dataset ranked by Tanimoto similarity to a query compound.
5. Compound clustering: dataset clustered by fingerprint similarity, with a diverse subset picked from the clusters.
6. Maximum common substructures: MCS computed and highlighted for the largest cluster in the dataset.
7. Ligand-based screening: machine learning: ML classifier (applicable to new compounds), evaluated with a ROC curve.
8. Protein data acquisition: Protein Data Bank (PDB): dataset filtered by X-ray, resolution and presence of ligand.

Notes from the canvas:
- Input: target ChEMBL ID. The target ChEMBL ID can only be changed inside the metanode.
- Query compound: Gefitinib.
- The database query can be slow.
- Clustering takes time.
- A Row Filter keeps only the top 2000 molecules. Remove this node when running your own dataset.
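For readers who want to see the logic behind two of the steps outside KNIME, the following is a minimal Python sketch of a Lipinski rule-of-five filter (W2) and a Tanimoto similarity ranking (W4). It is not the workflow's implementation. The `Compound` record, the function names, the property values and the fingerprints are all hypothetical, and tolerating one rule-of-five violation (`max_violations=1`) is an assumption rather than a setting taken from the workflow. Fingerprints are represented as sets of on-bit indices so that the example needs no cheminformatics library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record; in the workflow these values come from KNIME nodes."""
    name: str
    mol_weight: float       # molecular weight in Da
    h_donors: int           # number of hydrogen-bond donors
    h_acceptors: int        # number of hydrogen-bond acceptors
    logp: float             # octanol-water partition coefficient
    fingerprint: frozenset  # indices of the on bits of a binary fingerprint


def ro5_violations(c):
    """Count how many of Lipinski's four criteria the compound violates."""
    return sum([
        c.mol_weight > 500,
        c.h_donors > 5,
        c.h_acceptors > 10,
        c.logp > 5,
    ])


def passes_ro5(c, max_violations=1):
    """Assumption: a compound passes if it violates at most `max_violations` criteria."""
    return ro5_violations(c) <= max_violations


def tanimoto(a, b):
    """Tanimoto coefficient of two on-bit sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query_fp, compounds):
    """Return (similarity, name) pairs, most similar compound first."""
    scored = [(tanimoto(query_fp, c.fingerprint), c.name) for c in compounds]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative, made-up data (not taken from ChEMBL or from the workflow).
compounds = [
    Compound("cpd_a", 350.0, 2, 5, 3.0, frozenset({1, 2, 3, 4, 5})),
    Compound("cpd_b", 620.0, 6, 12, 5.5, frozenset({1, 2, 8, 9})),
    Compound("cpd_c", 510.0, 1, 4, 2.0, frozenset({3, 4})),
]
query_fp = frozenset({1, 2, 3, 4})  # hypothetical query fingerprint

kept = [c for c in compounds if passes_ro5(c)]
ranked = rank_by_similarity(query_fp, kept)
print(ranked)
```

In this sketch the filter runs before the ranking, mirroring the order of the workflows; either function can also be used on its own.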
