mTOR Project

TeachOpenCADD Workflow 1: Data acquisition from ChEMBL

Information on compound structure, bioactivity, and associated targets is organized in databases such as ChEMBL, PubChem, or DrugBank.
This workflow shows how to obtain and preprocess data for a query target (default target: EGFR) from the ChEMBL web services.
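The preprocessing half of that description can be sketched in a few lines of Python. This is a minimal illustration, not a transcription of the KNIME workflow: it assumes activity records have already been downloaded as dictionaries with the fields `molecule_chembl_id`, `standard_value`, and `standard_units`, that IC50 values are reported in nM, and that the first measurement per molecule is kept. The function names and the sample records (`MOL_A` and so on) are hypothetical. The conversion used is pIC50 = 9 - log10(IC50 in nM).

```python
import math


def to_pic50(ic50_nm):
    """Convert an IC50 given in nM to pIC50 = 9 - log10(IC50 [nM])."""
    return 9.0 - math.log10(ic50_nm)


def preprocess(records):
    """Drop unusable rows, keep the first row per molecule, and add pIC50.

    Assumption: each record is a dict with the keys 'molecule_chembl_id',
    'standard_value' and 'standard_units'.
    """
    seen = set()
    cleaned = []
    for rec in records:
        value = rec.get("standard_value")
        # Skip rows with a missing value or a unit other than nM.
        if value in (None, "") or rec.get("standard_units") != "nM":
            continue
        value = float(value)
        if value <= 0:
            continue  # log10 is undefined for non-positive values
        mol_id = rec["molecule_chembl_id"]
        if mol_id in seen:
            continue  # duplicate molecule: keep the first measurement only
        seen.add(mol_id)
        cleaned.append({**rec, "standard_value": value, "pIC50": to_pic50(value)})
    return cleaned


# Hypothetical sample records, made up for illustration only.
records = [
    {"molecule_chembl_id": "MOL_A", "standard_value": "10", "standard_units": "nM"},
    {"molecule_chembl_id": "MOL_A", "standard_value": "12", "standard_units": "nM"},
    {"molecule_chembl_id": "MOL_B", "standard_value": None, "standard_units": "nM"},
    {"molecule_chembl_id": "MOL_C", "standard_value": "1000", "standard_units": "nM"},
    {"molecule_chembl_id": "MOL_D", "standard_value": "5", "standard_units": "ug.mL-1"},
]

cleaned = preprocess(records)
for row in cleaned:
    print(row["molecule_chembl_id"], round(row["pIC50"], 2))
```

Keeping the first measurement per molecule is only one possible policy; averaging replicate values is an equally reasonable choice.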

Scaffold Base

Data Acquisition

Scaffold Base Training Model

Data Preprocessing & Cleaning

Data Transformation

Activity Base

Scaffold + Activity Base

Scaffold Base Test Model

Activity Base Training Model

Scaffold + Activity Base Training Model

Activity Base Test Model

Scaffold + Activity Base Test Model

[Workflow diagram. Node types shown on the canvas, with duplicates removed:]

Input and output: Excel Reader, Excel Writer, Model Reader, Model Writer
Table handling: Column Filter, Row Filter, Row Splitter, Reference Row Splitter, Duplicate Row Filter, Missing Value, Column Renamer, Column Resorter, Constant Value Column Appender, GroupBy, Sorter, Joiner, Concatenate, Table Partitioner, Table Row to Variable
String and numeric handling: String Manipulation, String Splitter (Regex), String to Number, Rule Engine, Math Formula
Cheminformatics: Molecule Type Cast, RDKit From Molecule, RDKit Find Murcko Scaffolds, RDKit Substructure Filter, RDKit Fingerprint, Expand Bit Vector
Modeling: Random Forest Learner and Predictor (RF), Gradient Boosted Trees Learner and Predictor (GBT), SVM Learner and Predictor (SVM), RProp MLP Learner and MultiLayerPerceptron Predictor (MLP), Parameter Optimization Loop Start and End
Evaluation and views: Scorer, ROC Curve (legacy), Table View, Table View (JavaScript) (legacy)
Custom node labels: Input target ChEMBL ID, Get activities count, Get activities, Ca. 10s per 1000 mols, Add pIC50, Validate model, Training Set, Validation Set, Test Set
