Compound Library Screening (ADME)
When selecting compounds for a specific purpose, such as running an assay or developing a new drug, the starting point is usually a large compound library, either downloaded from a database or curated by the company or researcher. This library has to be narrowed down: testing every compound wastes time and money, because many candidates will fail in later laboratory experiments or trials.
Ideally, pre-screening a compound list leaves only substances that are likely to be suitable for the intended purpose, e.g., use in a specific assay or as an active pharmaceutical ingredient against a disease target. In pharmacological research and practice, several sets of rules or guidelines have been developed to filter out compounds that are unlikely to be suitable. Common exclusion criteria in drug development are pharmacokinetic properties that make a compound unsuitable for safe administration to humans, e.g., properties that negatively affect a drug's absorption, distribution, metabolism, and excretion (ADME).
This workflow automatically tags and filters a compound list provided as a CSV file in which each row holds one compound as a SMILES string. The full tagged list or the filtered compound lists are written to CSV output files and can be visualized in an interactive dashboard. Filtering rules applied are:
Automating this process with KNIME gives researchers a single platform for the whole pipeline, from reading the data to writing the results and visualizing them. The filtering process is repeatable, reproducible, and well documented, and the required calculations run without manual steps that invite errors or inconsistencies. Result files are written automatically, and the data is displayed immediately in an easy-to-understand dashboard. None of this requires additional analysis in another program or prior coding knowledge, which makes the results accessible to all stakeholders involved in selecting compounds for the next testing steps.
Note: This workflow is based on the TeachOpenCADD workflows, specifically Workflow 2 (ADME filter), available from the KNIME Community Hub and Zenodo. It uses the example data provided there, a list of compounds active against the Epidermal Growth Factor Receptor (EGFR). The workflow is currently configured to use a reduced subset of the original dataset so that it executes faster for demonstration purposes. If desired, reconfigure the CSV Reader node to read the full dataset (W1_EGFR_compounds.csv) instead of the reduced one.
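For readers who want to see the tag-and-filter logic outside KNIME, the following is a minimal Python sketch, not the workflow's actual implementation. The rule list is not reproduced in this description, so the sketch assumes Lipinski's rule of five (the filter covered in the TeachOpenCADD ADME material) with at most one violation allowed. It also assumes the descriptors have already been calculated from the SMILES strings, e.g., with a cheminformatics toolkit; the column names (mw, hba, hbd, logp), the function names, and the example values are hypothetical and chosen for illustration only.

```python
import csv
import io

# Lipinski's rule of five upper limits (assumed here as the example rule):
# molecular weight (Da), H-bond acceptors, H-bond donors, logP.
RO5_LIMITS = {"mw": 500.0, "hba": 10, "hbd": 5, "logp": 5.0}


def tag_ro5(row, max_violations=1):
    """Return a copy of `row` with a violation count and a pass/fail tag."""
    violations = sum(float(row[key]) > limit for key, limit in RO5_LIMITS.items())
    tagged = dict(row)
    tagged["ro5_violations"] = violations
    tagged["ro5_pass"] = violations <= max_violations
    return tagged


def screen(csv_text):
    """Tag every compound in the CSV text; return (tagged rows, passing rows)."""
    tagged = [tag_ro5(row) for row in csv.DictReader(io.StringIO(csv_text))]
    passing = [row for row in tagged if row["ro5_pass"]]
    return tagged, passing


# Hypothetical input: descriptor values are rough, illustrative numbers,
# not data from the EGFR example set.
EXAMPLE_CSV = """smiles,mw,hba,hbd,logp
CCO,46.07,1,1,-0.1
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC,563.1,0,0,15.0
"""

tagged_rows, passing_rows = screen(EXAMPLE_CSV)
for row in tagged_rows:
    print(row["smiles"][:12], row["ro5_violations"], row["ro5_pass"])
```

Keeping the tagged list and the filtered list as separate results mirrors the workflow's two output options: the tagged list retains every compound with its rule outcome, while the filtered list contains only the compounds that pass.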