Compound Library Screening (ADME)

<p><strong>Compound Library Screening (ADME)</strong></p><p>When selecting compounds for a specific purpose, such as running an assay or developing a new drug, the starting point is usually a large compound library that is either downloaded from a database or curated in-house by the company or researcher. This library has to be narrowed down to avoid spending time and money on testing every compound, many of which would fail in later laboratory experiments or trials. <br>Ideally, pre-screening leaves only substances with a high probability of being suitable for the intended purpose, e.g., use in a specific assay or as an active pharmaceutical ingredient against a disease target. In pharmacological research and practice, several sets of rules or guidelines have been developed to filter out compounds that are unlikely to be suitable. Common exclusion criteria in drug development are pharmacokinetic properties that typically make compounds unsuitable for safe administration to humans, e.g., properties that negatively affect a drug's absorption, distribution, metabolism, and excretion (ADME).</p><p>This workflow automatically <strong>tags and filters a provided compound list</strong>, supplied as a CSV file in which each row contains the <strong>SMILES string</strong> of one compound. The complete tagged list and the filtered compound lists are written as <strong>CSV file outputs</strong> and can be visualized in an <strong>interactive dashboard</strong>. The filtering rules applied are:</p><ul><li><p>Lipinski's rule of five (<strong>Rule of Five</strong>, RO5), and/or</p></li><li><p>the associated <strong>Rule of Three</strong> (RO3).</p></li></ul><p>Automating this process with KNIME provides researchers with an <strong>all-in-one platform solution</strong>, from data import through export and visualization. 
The filtering process is repeatable, reproducible, and documented, and the required calculations are automated, which avoids manual calculation errors and inconsistencies. Result files are written automatically, and the data is displayed immediately in an easy-to-understand visual form. No additional analysis in another program and no prior coding knowledge are required, which makes the results accessible to all stakeholders involved in selecting compounds for the next testing steps.</p><p><strong>Note</strong>: This workflow is based on the TeachOpenCADD workflow, more specifically Workflow 2 (ADME filter), available on the KNIME Community Hub and Zenodo. It uses the example data provided there, which is a list of compounds active against the epidermal growth factor receptor (<strong>EGFR</strong>). As configured, the workflow uses a reduced subset of the original dataset so that it executes faster for demonstration purposes. To use the full dataset, reconfigure the <strong>CSV Reader</strong> node to read W1_EGFR_compounds.csv instead of the reduced file.</p>

URL: Teach Open CADD - Workflow 2 (ADME Filter) https://hub.knime.com/knime/spaces/Life%20Sciences/Cheminformatics/Teaching/TeachOpenCADD_Workflow2_ADME_filter/TeachOpenCADD_Workflow2_ADME_filter~O-kjTyBZ321f_DSE/current-state
URL: Teach Open CADD - Master Workflow https://hub.knime.com/knime/spaces/Life%20Sciences/Cheminformatics/Teaching/TeachOpenCADD/TeachOpenCADD~xYhrR1mfFcGNxz7I/current-state
URL: Teach Open CADD (Zenodo Entry) https://zenodo.org/records/6636125

Filtering

Data Access

Data Cleaning and Preparation

Data Transformation / Enrichment / Calculations

Flag according to Lipinski's rule of five (RO5)

  • Hydrogen bond donors (HBD) <= 5

  • Hydrogen bond acceptors (HBA) <= 10

  • Molecular Weight (MW) <= 500 Da

  • Logarithm of partition coefficient (SlogP) <= 5
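
The four RO5 conditions above can be sketched in Python as follows. This is an illustration only, not the logic of the workflow's Expression node: the `Descriptors` class, the function names, and the example values are hypothetical, and the descriptor values are assumed to be precomputed (in the workflow they come from the RDKit Descriptor Calculation node). The `min_rules` threshold is also an assumption. Lipinski's rule is commonly read as tolerating at most one violation, i.e. at least three of the four conditions met; use `min_rules=4` for the strict reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors for one compound (hypothetical container)."""
    hbd: int      # hydrogen bond donors
    hba: int      # hydrogen bond acceptors
    mw: float     # molecular weight in Da
    slogp: float  # logarithm of the partition coefficient


def ro5_flags(d: Descriptors) -> dict[str, bool]:
    """Return one pass/fail flag per RO5 condition."""
    return {
        "HBD": d.hbd <= 5,
        "HBA": d.hba <= 10,
        "MW": d.mw <= 500,
        "SlogP": d.slogp <= 5,
    }


def ro5_passed(d: Descriptors, min_rules: int = 3) -> bool:
    """Assumption: a compound passes if at least `min_rules` conditions hold."""
    return sum(ro5_flags(d).values()) >= min_rules


# Made-up descriptor values, for illustration only.
small = Descriptors(hbd=2, hba=6, mw=350.4, slogp=3.1)
large = Descriptors(hbd=7, hba=12, mw=720.9, slogp=6.4)

print(ro5_flags(small), ro5_passed(small))
print(ro5_flags(large), ro5_passed(large))
```

Keeping the per-condition flags separate from the overall verdict mirrors the workflow's approach of tagging each subcondition before deciding pass or fail.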

Flag according to rule of three (RO3)

  • Hydrogen bond donors (HBD) <= 3

  • Hydrogen bond acceptors (HBA) <= 3

  • Molecular Weight (MW) <= 300 Da

  • Logarithm of partition coefficient (SlogP) <= 3

  • Rotatable bonds (RB) <= 3
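
The RO3 conditions follow the same pattern with tighter limits and one extra descriptor, so a small table of upper limits is enough to sketch them. The names `RO3_LIMITS`, `ro3_flags`, and `ro3_passed` and the example values are hypothetical. Requiring all five conditions by default is an assumption; the workflow's own threshold is not stated here.

```python
# Upper limits for the RO3 conditions listed above.
RO3_LIMITS = {"HBD": 3, "HBA": 3, "MW": 300, "SlogP": 3, "RB": 3}


def ro3_flags(descriptors: dict[str, float]) -> dict[str, bool]:
    """Return one pass/fail flag per RO3 condition (value <= limit)."""
    return {name: descriptors[name] <= limit for name, limit in RO3_LIMITS.items()}


def ro3_passed(descriptors: dict[str, float], min_rules: int = len(RO3_LIMITS)) -> bool:
    """Assumption: all five conditions must hold unless `min_rules` is lowered."""
    return sum(ro3_flags(descriptors).values()) >= min_rules


# Made-up descriptor values, for illustration only.
fragment = {"HBD": 1, "HBA": 2, "MW": 180.2, "SlogP": 1.4, "RB": 2}
drug_like = {"HBD": 2, "HBA": 6, "MW": 350.4, "SlogP": 3.1, "RB": 5}

print(ro3_passed(fragment))   # all five conditions hold
print(ro3_passed(drug_like))  # HBA, MW, SlogP and RB exceed their limits
```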

Visualization

File Writing

reads in a drug candidate list
CSV Reader
removes unnecessary columns
Column Filter
Interactive Dashboard
Calculates chosen properties from RDKit compatible/specific molecule entries
RDKit Descriptor Calculation
RO5-filtered compound list
CSV Writer
converts rendered molecules to image files in SVG format
RDKit Molecule to SVG
RO5 passed
Row Filter
writes total compound list with flags
CSV Writer
Checks each rule for every molecule (row)
Expression
RO3 passed
Row Filter
Converts SMILES data type entries into RDKit-compatible format
RDKit From Molecule
sums up number of passed rules
Column Aggregator
Sort from most to least rules fulfilled
Sorter
tag RO5 pass/fail + subconditions, converting integer to String
Expression
excluded compounds
CSV Writer
Converts column "SmilesValue" from String to actual SMILES data type
Molecule Type Cast
Checks each rule for every molecule (row)
Expression
failed both rules
Row Filter
Sort from most to least rules fulfilled
Sorter
sums up number of passed rules
Column Aggregator
tag RO3 passed
Expression
RO3-filtered compound list
CSV Writer
Joiner
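
The aggregate, sort, filter, and write steps named in the node labels above can be sketched with the Python standard library. The rows, column names, and the `MIN_RULES` threshold are made up for illustration, and the step order is an assumption, since the labels above are not listed in execution order. The CSV text is kept in memory rather than written to disk.

```python
import csv
import io

# Made-up rows: compound identifier plus one 0/1 flag per RO5 condition.
rows = [
    {"id": "cpd-1", "HBD": 1, "HBA": 1, "MW": 0, "SlogP": 1},
    {"id": "cpd-2", "HBD": 1, "HBA": 1, "MW": 1, "SlogP": 1},
    {"id": "cpd-3", "HBD": 0, "HBA": 1, "MW": 0, "SlogP": 0},
]
RULES = ("HBD", "HBA", "MW", "SlogP")
MIN_RULES = 3  # assumed pass threshold, not taken from the workflow

# Column Aggregator step: sum the per-rule flags into one count.
for row in rows:
    row["rules_passed"] = sum(row[r] for r in RULES)

# Sorter step: most to least rules fulfilled.
rows.sort(key=lambda row: row["rules_passed"], reverse=True)

# Row Filter step: split into passed and excluded compounds.
passed = [row for row in rows if row["rules_passed"] >= MIN_RULES]
excluded = [row for row in rows if row["rules_passed"] < MIN_RULES]


def to_csv(records):
    """CSV Writer step: serialize records to CSV text (in memory here)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", *RULES, "rules_passed"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(passed))
print(to_csv(excluded))
```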
