
TOC_W7_plus_AutoML
7. Ligand-based screening: machine learning

With the continuously increasing amount of available data, machine learning (ML) has gained momentum in drug discovery, especially in ligand-based virtual screening (VS), where it is used to predict the activity of novel compounds against a target of interest. In the following, different ML models are trained on the filtered ChEMBL dataset to discriminate between active and inactive compounds with respect to a protein target.

This workflow is part of the TeachOpenCADD pipeline: https://hub.knime.com/volkamerlab/space/TeachOpenCADD
Read more on the theoretical background of this workflow on our TeachOpenCADD platform: https://projects.volkamerlab.org/teachopencadd/talktorials/T007_compound_activity_machine_learning.html

TeachOpenCADD_plus_AutoML

Acknowledgments: This workflow reuses the SVM part of TeachOpenCADD (TOC) W7. I learned how to use AutoML from Just KNIME It!: https://www.knime.com/just-knime-it
With gratitude for the teachings of those who came before.

The output data of TeachOpenCADD W2 is stored inside the workflow as a .table file.

Step 1: Split dataset into active and inactive compounds (pIC50 cut-off = 6.3)
Step 2: Generate fingerprints and prepare data for ML
Step 3: Automate machine learning with AutoML. Note: as in TeachOpenCADD, no validation data is set aside.
Step 4: Evaluate models with ROC curves

Node annotations:
Generate fingerprint (default MACCS)
Add boolean activity column
Extract columns needed for ML nodes
Split fingerprint to one bit per column
Convert activity to string
TOC_EGFR_4511compd.table
Score view + ROC Curve
Predict
Top: Train + Validation sets; Bottom: Test set
Execute upstream before configuration

Nodes used in the workflow:
RDKit Fingerprint
Math Formula
Column Filter
Expand Bit Vector
Number To String
Scorer (JavaScript)
Table Reader
Evaluate model
Workflow Executor
Partitioning
AutoML
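
For readers who want to see the data preparation and evaluation logic outside KNIME, Steps 1, 2 and 4 can be sketched in plain Python. This is a minimal illustration, not the workflow's actual implementation: the function names are hypothetical, the fingerprint is assumed to arrive as a string of 0/1 characters, and treating a pIC50 exactly at the cut-off as active (>=) is an assumption, since the description only states the cut-off value of 6.3. The ROC AUC is computed with the pairwise-comparison (Mann-Whitney) formulation rather than by drawing the curve.

```python
from typing import List, Sequence

PIC50_CUTOFF = 6.3  # cut-off stated in Step 1


def label_active(pic50: float, cutoff: float = PIC50_CUTOFF) -> str:
    """Return the activity class as a string ("1" active, "0" inactive).

    Assumption: compounds exactly at the cut-off count as active.
    The string type mirrors the "Convert activity to string" annotation.
    """
    return "1" if pic50 >= cutoff else "0"


def expand_bits(bitstring: str) -> List[int]:
    """Split a fingerprint bit string into one integer per bit (one column per bit)."""
    if set(bitstring) - {"0", "1"}:
        raise ValueError("fingerprint must contain only '0' and '1'")
    return [int(ch) for ch in bitstring]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of actives and inactives.

    Each (active, inactive) pair counts 1 if the active scores higher,
    0.5 on a tie, and 0 otherwise; the AUC is the mean over all pairs.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the EGFR dataset.
    print(label_active(7.1), label_active(5.0))
    print(expand_bits("0101"))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise AUC is quadratic in the number of compounds, which is fine for a small sketch; for a dataset of several thousand compounds a rank-based implementation or an established library routine would be the more practical choice.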
