This workflow snippet shows how to standardize chemical structures in SMILES format using the open-source RDKit nodes.
The standardization and data cleaning steps are:
1. the removal of hydrogens
2. the removal of solvents
3. the stripping of salts
4. structure normalization
5. canonicalization
Please note that while we read in the molecules as a KNIME-native table, the workflow is also applicable to data in other formats read in with other readers, e.g. SMILES, SDF or Mol. We remove explicit hydrogens in the first step for the sake of demonstration, but any RDKit node already does this under the hood. The Salt Stripper node is used twice: once to remove user-defined solvents, and once to remove pre-defined salts. Note that salts could also be removed with the Structure Normalizer node. Canonicalization is the last step in this workflow.
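For readers who want to try the same sequence outside KNIME, the five steps can be sketched with the RDKit Python API. This is only a rough approximation and not the workflow itself: it assumes that `SaltRemover` and `rdMolStandardize.Normalize` are acceptable stand-ins for the Salt Stripper and Structure Normalizer nodes. The function name `standardize` and the water-only solvent pattern `[OH2]` are hypothetical choices made for illustration.

```python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover
from rdkit.Chem.MolStandardize import rdMolStandardize


def standardize(smiles, solvent_smarts="[OH2]"):
    """Return a cleaned, canonical SMILES string, or None if parsing fails.

    solvent_smarts is a hypothetical user-given solvent definition
    (here: water only), passed to SaltRemover as SMARTS.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None

    # 1. Remove explicit hydrogens. The SMILES parser already does this by
    #    default, so the call is kept only to mirror the workflow step.
    mol = Chem.RemoveHs(mol)

    # 2. Remove user-given solvents: fragments fully matched by the SMARTS.
    solvent_remover = SaltRemover(defnData=solvent_smarts)
    mol = solvent_remover.StripMol(mol, dontRemoveEverything=True)

    # 3. Strip salts using RDKit's built-in default salt definitions.
    salt_remover = SaltRemover()
    mol = salt_remover.StripMol(mol, dontRemoveEverything=True)

    # 4. Normalize functional-group representations.
    mol = rdMolStandardize.Normalize(mol)

    # 5. Canonicalize: MolToSmiles writes canonical SMILES by default.
    return Chem.MolToSmiles(mol)
```

Calling `standardize("CC(=O)Oc1ccccc1C(=O)[O-].[Na+].O")` drops the water and sodium fragments and returns the canonical SMILES of the remaining anion. Two different SMILES spellings of the same molecule give the same output string.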
The dataset represents a subset of 844 compounds evaluated for activity against CDPK1. More information is available at https://chembl.gitbook.io/chembl-ntd/#deposited-set-19-5th-march-2016-uw-kinase-screening-hits (see Set 19).