03_Bioactivity_Prediction_Learn_All_Methods

This workflow trains the model. It uses a complex parameter optimization protocol for model building, in which four machine learning methods are tested with five sets of features during each parameter optimization cycle. The parameter optimization cycle is repeated 10 times.

Input: The input file as prepared by the Transform workflow.

Output: The learned model with its statistics in //Metainfo/Bioactivity/model_output_ASSAY_ID_timestamp.table, and the statistics for the best model per method and iteration in //Metainfo/Bioactivity/best_models_stats_ASSAY_ID_timestamp.table.

Parameter optimization for each method is performed on 80% of the original dataset. The optimization workflows are hidden in grey wrapped metanodes that carry the names of the machine learning methods. This cycle repeats 10 times. The parameters leading to the highest enrichment factor on 5% of the data set are picked to build the best model. Finally, this model is scored using the 20% of the dataset that was not part of the optimization cycle, and the best model is selected.

Workflow annotations: Framework (Connection to Model Factory); Custom Workflow for Model Building Step; Output to Model Factory; RESULT: Selected model; 10 times; load_output_assayid.table; 80/20 random stratified; Collect Stats of best model; punctuation in time; seed with iterations; ADDITIONAL: all_models_stats.

Node labels: JSON to Table, Table Row to Variable, Table Writer, Counting Loop Start, Table Reader, Partitioning, Logistic Regression, Naive Bayes, H2O Gradient Boosting, Random Forest, String Manipulation (Variable), Create Date&Time Range, Container Input (JSON), Loop End, String Replacer, Math Formula (Variable), Concatenate (Optional in), Sort and Group, Build the Best and Score, Select the most Common model.
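The selection logic described here can be sketched outside KNIME in plain Python. This is a minimal illustration, not the workflow's actual implementation: the helper names (stratified_split, enrichment_factor, most_common_model) and the feature-set names in the usage note are hypothetical, and the enrichment factor uses the common definition (hit rate in the top-scored fraction divided by the overall hit rate), which the workflow description does not spell out. Seeding the split per iteration is assumed from the "80/20 random stratified" and "seed with iterations" annotations.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)  # assumption: one seed per iteration
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def enrichment_factor(scores, labels, fraction=0.05):
    """Hit rate among the top-scored fraction divided by the overall hit rate.

    labels are 1 for active and 0 for inactive; higher scores rank first.
    """
    n_total = len(labels)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        return 0.0
    n_top = max(1, int(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


def most_common_model(winners):
    """Pick the (method, feature_set) pair that won the most iterations.

    Ties resolve to the pair that appeared first in the list.
    """
    return Counter(winners).most_common(1)[0][0]
```

In use, each of the 10 iterations would call stratified_split(labels, seed=iteration), tune every method on the 80% part by maximizing enrichment_factor(..., fraction=0.05), and record the winning pair; most_common_model([("Random Forest", "set_1"), ("Naive Bayes", "set_2"), ("Random Forest", "set_1")]) then returns ("Random Forest", "set_1").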
