TOC_W7_plus_Bayesian_Parameter_Opt
Bayesian hyperparameter optimization: the TeachOpenCADD support vector machine (SVM) classifier model
Note: depending on the computing environment, the run may not finish overnight.

Step 1: Split dataset into active and inactive compounds (pIC50 cut-off = 6.3)
Step 2: Generate fingerprints and prepare data for ML
Step 3: Train ML model using k-fold cross-validation (default 10-fold): support vector machine classifier

7. Ligand-based screening: machine learning
With the continuously increasing amount of available data, machine learning (ML) has gained momentum in drug discovery, especially in ligand-based virtual screening (VS), where it is used to predict the activity of novel compounds against a target of interest. In the following, different ML models are trained on the filtered ChEMBL dataset to discriminate between active and inactive compounds with respect to a protein target.

This workflow is part of the TeachOpenCADD pipeline: https://hub.knime.com/volkamerlab/space/TeachOpenCADD
Read more on the theoretical background of this workflow on our TeachOpenCADD platform: https://projects.volkamerlab.org/teachopencadd/talktorials/T007_compound_activity_machine_learning.html

TeachOpenCADD_plus_Bayesian_Parameter_Optimization
Acknowledgements: The SVM part of workflow W7 from TeachOpenCADD (TOC) is reused here. The hyperparameter optimization part is taken from the following blog post by ha-te-knime: https://hateknime.hatenablog.com/entry/2019/08/03/140509
With gratitude for the teachings of those who came before.

Input: the output data of TeachOpenCADD W2 is stored inside the workflow as a .table file (TOC_EGFR_4511compd.table).
Output: prediction results of the Bayes-optimized SVM model (Score view + ROC Curve).
Note: as in TeachOpenCADD, no separate validation data is held out.

Node annotations, in workflow order:
Generate fingerprint (default MACCS)
Add boolean activity column
Extract columns needed for ML nodes
Split fingerprint to one bit per column
Convert activity to string
Bayesian optimization loop
Compute C (overlapping penalty)
Compute σ (sigma)
Aggregate the optimization results and select the best parameters
Score the prediction accuracy (overall accuracy is the optimization metric chosen here)
10-fold cross-validation loop
Aggregate the validation results
Pass the best parameters to flow variables
Compute C
Compute σ (sigma)
Evaluate model

Nodes, as listed on the page:
X-Aggregator
X-Partitioner
RDKit Fingerprint
Math Formula
SVM Learner
SVM Predictor
Column Filter
Expand Bit Vector
Number To String
Parameter Optimization Loop Start
Math Formula (Variable)
Math Formula (Variable)
Parameter Optimization Loop End
Scorer (JavaScript)
Table Column to Variable
X-Partitioner
SVM Learner
SVM Predictor
X-Aggregator
Table Row to Variable
Math Formula (Variable)
Math Formula (Variable)
Scorer (JavaScript)
Table Reader
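For readers who want to see the logic of Step 1 and Step 3 outside KNIME, the following is a minimal Python sketch, not the workflow's actual implementation. It labels compounds with the pIC50 cut-off of 6.3 and computes overall accuracy over pooled out-of-fold predictions, which is roughly what the X-Partitioner / X-Aggregator pair followed by a Scorer provides. Several things here are assumptions: a value exactly equal to the cut-off is treated as active, the function names are hypothetical, and the majority-class model is only a stand-in so that the sketch runs with the standard library alone. In the workflow that slot is filled by the SVM Learner and SVM Predictor.

```python
import random

PIC50_CUTOFF = 6.3  # cut-off stated in Step 1


def label_active(pic50, cutoff=PIC50_CUTOFF):
    """Return "1" (active) or "0" (inactive) as a string label.

    Assumption: a value equal to the cut-off counts as active.
    """
    return "1" if pic50 >= cutoff else "0"


def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle the row indices and deal them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(rows, labels, fit, predict, k=10, seed=0):
    """Overall accuracy over the pooled out-of-fold predictions."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = fit([rows[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        for i in test_idx:
            correct += predict(model, rows[i]) == labels[i]
    return correct / len(rows)


# Hypothetical stand-in model: predicts the majority class of the training fold.
def fit_majority(train_rows, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, row):
    return model


# Toy data: made-up pIC50 values and placeholder fingerprint bits.
pic50_values = [5.0, 7.1, 6.3, 8.2, 4.9, 6.0, 7.5, 5.5, 6.9, 5.8]
labels = [label_active(v) for v in pic50_values]
rows = [[0, 0, 0, 0] for _ in pic50_values]
accuracy = cross_validated_accuracy(rows, labels, fit_majority,
                                    predict_majority, k=5)
```

In an optimization loop like the one in this workflow, a function such as cross_validated_accuracy would be the objective: each candidate pair of C and σ is scored by its cross-validated overall accuracy, and the best-scoring pair is kept for the final model.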
