ML Example on Bioassay Data

This workflow demonstrates how to train several ML classifiers on bioassay data and compare their performance on a corresponding validation set.

In this example, the data is taken from REDIAL-2020 (see the URLs below) and was generated with the angiotensin-converting enzyme 2 (ACE2) enzymatic activity assay. The training set contains features and labels for 228 compounds; the validation set contains features and labels for 49 compounds. Each label gives the activity of the compound (1 = active, 0 = inactive), and the features are functional-class fingerprints (specifically FCFP6).
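Because the labels are binary, comparing classifiers on the validation set comes down to counting agreements and disagreements between each model's predictions and the true labels. The sketch below is one way to do that in plain Python. The choice of metrics (accuracy, sensitivity, specificity, F1) and the names `score_binary` and `rank_classifiers` are assumptions made for illustration; the workflow description does not state which metrics or classifiers it uses.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # recall on the active class (label 1)
    specificity: float  # recall on the inactive class (label 0)
    f1: float


def score_binary(y_true, y_pred):
    """Compute metrics from 0/1 labels, treating 1 (active) as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one compound to score")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        f1=ratio(2 * tp, 2 * tp + fp + fn),
    )


def rank_classifiers(y_true, predictions):
    """Return (name, metrics) pairs sorted by F1, best first.

    `predictions` maps a classifier name to its list of predicted labels.
    """
    scored = [(name, score_binary(y_true, y_pred))
              for name, y_pred in predictions.items()]
    return sorted(scored, key=lambda item: item[1].f1, reverse=True)
```

With only 49 validation compounds, a single misclassified compound moves accuracy by about two percentage points, so small differences between classifiers should be read with caution.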

Python Script nodes are needed here to load the data (originally in NumPy .npy format) and convert it into a table that KNIME can use.
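The conversion step might look roughly like the following sketch, which reads a feature matrix and a label vector from two .npy files and combines them into one pandas DataFrame. The function name `npy_to_table`, the `bit_<i>` column names, and the assumption that features and labels live in separate files are all illustrative; they are not taken from the workflow or from the REDIAL-2020 repository.

```python
import numpy as np
import pandas as pd


def npy_to_table(features_path, labels_path):
    """Load features and labels from .npy files into a single DataFrame.

    Assumes a 2-D feature array (compounds x fingerprint bits) and a
    label array with one 0/1 entry per compound.
    """
    features = np.load(features_path)
    labels = np.load(labels_path).ravel()

    if features.ndim != 2:
        raise ValueError("expected a 2-D feature array")
    if features.shape[0] != labels.shape[0]:
        raise ValueError("features and labels disagree on the number of compounds")

    # String column names, one per fingerprint bit (hypothetical naming).
    columns = [f"bit_{i}" for i in range(features.shape[1])]
    table = pd.DataFrame(features, columns=columns)
    table["label"] = labels.astype(int)
    return table


# Inside a KNIME Python Script node, the resulting DataFrame would then be
# passed to an output port, for example:
#   import knime.scripting.io as knio
#   knio.output_tables[0] = knio.Table.from_pandas(table)
```

Keeping the label in the same table as the fingerprint bits lets downstream learner nodes select it as the target column.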

NOTE: This workflow was designed on KNIME version 5.2.0, and running it on an earlier version will cause issues. Update KNIME to 5.2.0 (or newer) before using this workflow.

URL: REDIAL-2020 (GitHub) https://github.com/sirimullalab/redial-2020
URL: REDIAL-2020 (Paper) https://www.nature.com/articles/s42256-021-00335-w
URL: Description of FCFP6 https://cheminf20.org/2014/02/21/open-source-ecfpfcfp-circular-fingerprints-in-cdk/
URL: Description of ACE2 https://www.ncbi.nlm.nih.gov/gene/59272
