
Model_Factory

Monster Model Factory

This workflow is based on the abstract KNIME Model Factory and is adapted to a Life Sciences use case. Please refer to the blog post for further details. The training data for the models comes from ChEMBLdb (https://www.ebi.ac.uk/chembl/beta/) and is publicly accessible. The workflow uses a sample of the data for the test run; to run the workflow on the full set, remove the Row Sampling node. To run this workflow on a system with distributed executors, define the size of the chunks in the Parallel Chunk Loop Start node according to the resources of your cluster, and don't forget to leave a few executors free for resource management. For example, for a cluster with 10 executors, each with 2 cores x 2 processors, a good number of chunks would be 36.
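As a rough illustration of that arithmetic (a minimal sketch only; the executor and core counts, and the choice to reserve exactly one executor, are assumptions taken from the example above and should be adjusted to your own cluster):

```python
# Rough estimate of a chunk count for the Parallel Chunk Loop Start node.
# Assumption: one executor's worth of cores is reserved for resource
# management, as recommended above.
executors = 10
cores_per_executor = 2 * 2   # 2 cores x 2 processors
reserved_executors = 1       # left free to manage cluster resources

chunks = (executors - reserved_executors) * cores_per_executor
print(chunks)  # -> 36
```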

We provide the data in the Metainfo/Bioactivity/Training_data folder. The workflow repository ships with a sample of the training set; the full training set can be downloaded as described in step 8 below.

Important disclaimer: this workflow needs configuration and won't execute until the following pieces are configured:

1. Connection to a KNIME Server.
2. Copy the workflow to your local machine and execute the Model_Factory workflow locally, node by node.
3. Configure the Select model process definition metanode.
4. Configure the Call Remote Workflow nodes in the Load, Transform, and Train and Score metanodes: select the Server, the authentication, the workflow to execute, and json_column as the input.
5. Reset and save the workflow locally.
6. Copy the whole workflow repository to your KNIME Server.
7. Right-click the Model_Factory workflow on the Server and select Execute.
8. Access the full training set by copying the link below into your browser (use Google Chrome): https://workflows.knime.com/knime/rest/v4/repository/50_Applications/37_Monster_Model_Factory:data. Log in with your KNIME forum account; this downloads the Monster_Model_Factory workflows with the full training set to your local machine (a scripted alternative is sketched below the node list).

Workflow overview (node labels from the diagram): Initialize: Define the data to work with; Get model configuration table; Table Row To Variable Loop Start; Loop End; Transform; Load; Select model process definition; IF Switch; End IF; Train and Score; Cleanup and collect; Extract Assay_IDs from Database; Deploy; Delete Old Models; Define Training Type; Parallel Chunk Start; Parallel Chunk End; Row Sampling.
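For reference, a scripted alternative to the browser download in step 8 might look like the sketch below. This is only an assumption-laden illustration: the URL is the one given above, but the use of HTTP basic authentication with your KNIME forum credentials and the output file name are assumptions, not documented behaviour of workflows.knime.com.

```python
# Sketch: download the Monster Model Factory repository item (including the
# full training set) via the KNIME Server REST endpoint from step 8.
# Assumptions: basic auth with forum credentials is accepted; the response is
# a single archive that we write to an arbitrarily named local file.
import requests

url = ("https://workflows.knime.com/knime/rest/v4/repository/"
       "50_Applications/37_Monster_Model_Factory:data")

resp = requests.get(url, auth=("your-forum-user", "your-password"), stream=True)
resp.raise_for_status()

with open("Monster_Model_Factory.knar", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)
```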

Nodes

Extensions

Links