
TeachOpenCADD_Workflow1_Data_acquisition_ChEMBL

TeachOpenCADD Workflow 1: Data acquisition from ChEMBL

Information on compound structure, bioactivity, and associated targets is organized in databases such as ChEMBL, PubChem, or DrugBank.
This workflow shows how to obtain and preprocess data for a query target (default target: EGFR) from the ChEMBL web services.
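Outside KNIME, the same query can be sketched against the ChEMBL REST interface. The snippet below is a minimal sketch, not the workflow's own implementation. It assumes the public activity endpoint at https://www.ebi.ac.uk/chembl/api/data/activity.json, the filter names target_chembl_id, standard_type, and assay_type, and CHEMBL203 as the ChEMBL ID for EGFR. The helper names build_activity_query and fetch_activities are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the ChEMBL activity resource (JSON output).
CHEMBL_ACTIVITY_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def build_activity_query(target_chembl_id="CHEMBL203", limit=1000, offset=0):
    """Build a query URL for IC50 values from binding assays of one target.

    CHEMBL203 is assumed to be the ChEMBL ID of the default target EGFR.
    """
    params = {
        "target_chembl_id": target_chembl_id,
        "standard_type": "IC50",  # only IC50 measurements
        "assay_type": "B",        # only binding assay data
        "limit": limit,           # page size
        "offset": offset,         # start of the requested page
    }
    return f"{CHEMBL_ACTIVITY_URL}?{urlencode(params)}"


def fetch_activities(url):
    """Download one page of results and return the list of activity records.

    Assumes the response is a JSON object with an "activities" list.
    """
    with urlopen(url) as response:
        payload = json.load(response)
    return payload["activities"]


# Example (requires network access, may be slow):
# records = fetch_activities(build_activity_query("CHEMBL203"))
```

Large targets return more records than fit in one page, so a complete download would have to loop over increasing offset values.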

Workflow annotations

Step 1: Download ChEMBL bioactivity data
Step 2: Get bioactivity (IC50) & compound (SMILES) data from ChEMBL ID
Step 3: Filter out entries with missing values & duplicates
Step 4: Filter for entries with IC50 in molar units with exact measurements
Step 5: Convert all molar units to nM
Step 6: Convert bioactivity to pIC50

Note that the database query can be slow.
*Alternatively, the query results can be loaded from file with the "Table Reader" node, which needs to be connected to the "Column Resorter" node instead of the "Get activities" - "Column Resorter" connection.

This workflow adapts the KNIME workflow example 50_Applications/30_RESTful_ChEMBL/03_ChEMBL_Bioactivity_Search (KNIME EXAMPLES Server, accessed: 2019-05-18).

This workflow is part of the TeachOpenCADD pipeline: https://hub.knime.com/volkamerlab/space/TeachOpenCADD
Read more on the theoretical background of this workflow on our TeachOpenCADD platform: https://projects.volkamerlab.org/teachopencadd/talktorials/T001_query_chembl.html

Node labels: Input target ChEMBL ID; Ca. 10s per 1000 mols; Only IC50; Only binding assay data; No duplicates; Only SMILES present; Only molar units; Only exact measurements; Convert all molar units to nM; Only nM; Add pIC50; Save compound list; Load results*

Nodes used: Table View, Get activities count, Get activities, Row Filter, Row Filter, GroupBy, Row Filter, Row Filter, Column Resorter, Column Filter, Column Rename, Column Resorter, Math Formula, Row Filter, Java Snippet, Row Filter, CSV Writer, Table Reader, String To Number
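Steps 3 to 6 above can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not a translation of the KNIME nodes. It assumes that each record is a dictionary with the ChEMBL activity fields molecule_chembl_id, canonical_smiles, standard_value, standard_units, and standard_relation, that an exact measurement is marked by the relation "=", and that pIC50 is defined as -log10 of the IC50 in mol/L, which equals 9 - log10(IC50 in nM). The function name preprocess, the unit table TO_NM, and the output keys are hypothetical.

```python
import math

# Conversion factors from molar units to nM (unit spellings are assumptions).
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "µM": 1e3, "nM": 1.0, "pM": 1e-3}


def preprocess(records):
    """Filter activity records and add IC50 in nM and pIC50.

    Keeps the first record per compound that passes all filters. The KNIME
    workflow uses a GroupBy node for duplicates, which may behave differently.
    """
    seen = set()
    cleaned = []
    for rec in records:
        compound_id = rec.get("molecule_chembl_id")
        smiles = rec.get("canonical_smiles")
        value = rec.get("standard_value")
        units = rec.get("standard_units")
        relation = rec.get("standard_relation")

        # Step 3: skip entries with missing values and duplicate compounds.
        if not compound_id or not smiles or value in (None, ""):
            continue
        if compound_id in seen:
            continue

        # Step 4: keep only molar units and exact measurements.
        if units not in TO_NM or relation != "=":
            continue

        # Step 5: convert all molar units to nM.
        ic50_nm = float(value) * TO_NM[units]
        if ic50_nm <= 0:
            continue  # the logarithm is undefined for non-positive values

        # Step 6: pIC50 = -log10(IC50 [M]) = 9 - log10(IC50 [nM]).
        pic50 = 9 - math.log10(ic50_nm)

        seen.add(compound_id)
        cleaned.append(
            {
                "molecule_chembl_id": compound_id,
                "smiles": smiles,
                "IC50_nM": ic50_nm,
                "pIC50": pic50,
            }
        )
    return cleaned
```

Under this definition, an IC50 of 1 uM is converted to 1000 nM and a pIC50 of 6, so more potent compounds receive higher pIC50 values.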
