RDKit Molecule Extractor

Splits the disconnected fragments contained in a single RDKit molecule cell into separate cells, one molecule per fragment, and optionally sanitizes the extracted molecules. If the input cell is empty (missing), the empty cell is passed through as the result, together with the corresponding reference value. If the input molecule contains only one fragment, the output is a single row. The node can be used either with an input table or with flow variables that supply the molecules and their format. Supported molecule formats are RDKit Mol cells (when connecting an input table), SMILES, MOL and SDF.
Please be aware that auto-conversion (e.g. for SMILES input) may fail when connecting an input table.
The Advanced tab offers different options for handling conversion failures, empty input cells and zero-atom molecules (empty molecules). You may configure the node to fail, to generate empty cells with or without a warning, or to skip the input with or without a warning.
The node can be used, for instance, after a Quickform Molecule Input node, which brings up a sketcher in the KNIME Web Portal. When the user draws multiple molecules at once, this node splits the user's input into multiple molecules.
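Conceptually, the disconnected fragments of a molecule are the connected components of its atom/bond graph. The following is a minimal sketch of that idea in plain Python, not the node's actual implementation (which relies on RDKit). The function name split_fragments and the input form (an atom count plus a list of bonded atom-index pairs) are assumptions made for illustration only.

```python
from collections import defaultdict


def split_fragments(num_atoms, bonds):
    """Return the connected components of a molecular graph.

    num_atoms: number of atoms, indexed 0..num_atoms-1 (hypothetical input form)
    bonds:     iterable of (i, j) atom-index pairs, one per bond
    """
    neighbours = defaultdict(list)
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    seen = set()
    fragments = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first walk collecting every atom reachable from `start`.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nxt in neighbours[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        fragments.append(sorted(component))
    return fragments


# Five atoms: atoms 0-3 are bonded together, atom 4 is an isolated counter-ion.
fragments = split_fragments(5, [(0, 1), (1, 2), (1, 3)])
print(fragments)
```

A molecule with a single fragment yields a one-element list, and a zero-atom molecule yields an empty list, which mirrors the single-row and empty-molecule cases described above.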

Options

Table Input

RDKit Mol column
The input column with RDKit Molecules, which may contain multiple disconnected fragments to be extracted.
Reference column (e.g. an ID)
The column to be used as reference column. Its values are assigned to the cells with the extracted molecules. You may use the Row ID here or set it to None, in which case the reference column will not be added.

Variable/Data Input

Molecules
A textual representation of the molecules to be split. Usually, you will attach a flow variable to control this setting, e.g. one coming from a Molecule Sketcher. This setting is only used if no table is connected.
Format
The format of the molecules: MOL, SDF and SMILES are supported. This setting is only used if no table is connected.

Output

Column name for extracted molecules
The name to be used for the new column used to store extracted molecules.
Column name for copied reference data
The name to be used for the new reference column used to store the reference values.
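As an illustration of the resulting table layout, the sketch below emits one output row per extracted fragment and copies the reference value of the input row into each of them. It is a simplified model under stated assumptions: the function extract_rows, the default column names, and the representation of fragments as already-split SMILES strings are all hypothetical and are not taken from the node itself.

```python
def extract_rows(inputs, mol_column="Extracted Molecule",
                 ref_column="Reference", use_reference=True):
    """Build output rows from (reference_value, fragments) pairs.

    inputs:        list of (reference_value, list_of_fragments) tuples
    mol_column:    name of the column holding the extracted molecules
    ref_column:    name of the column holding the copied reference values
    use_reference: False models a reference column set to None
    """
    rows = []
    for reference, fragments in inputs:
        for fragment in fragments:
            row = {mol_column: fragment}
            if use_reference:
                # Every fragment keeps the reference of the row it came from.
                row[ref_column] = reference
            rows.append(row)
    return rows


# Two input rows: the first holds two fragments, the second only one.
table = extract_rows([("ID-1", ["CC(=O)[O-]", "[Na+]"]), ("ID-2", ["CCO"])])
for row in table:
    print(row)
```

With use_reference=False the rows contain only the molecule column, corresponding to the case where no reference column is added.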

Advanced

Sanitize fragments
Determines whether fragments are sanitized when they are extracted. Selecting this option may lead to additional fragmentation errors. Default is false.
How to react on conversion errors
Defines how the node behaves if an input molecule cannot be converted into an RDKit molecule. Usually, this is the case if the molecule is invalid. By default, the node generates a missing cell without a warning. If conversion requires special treatment, you may use the RDKit From Molecule node to perform the conversion before executing this node. It offers different options for conversion.
How to react on empty (missing) cells
Defines how the node behaves if an empty (missing) cell is used as input. By default, the node generates a missing cell without a warning.
How to react on empty (zero atom) molecules
Defines how the node behaves if an empty molecule, i.e. a molecule with zero atoms, is used as input. By default, the node generates a missing cell without a warning.
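The five reactions available for each of these three cases (fail, missing cell with or without warning, skip with or without warning) can be modelled as a small policy function. This is only a sketch of the described behaviour, assuming hypothetical names (Reaction, react, MISSING_CELL) and using Python's None as a stand-in for a missing cell; it does not reflect the node's internal code.

```python
import warnings
from enum import Enum


class Reaction(Enum):
    FAIL = "fail"
    MISSING = "missing cell, no warning"        # the documented default
    MISSING_WARN = "missing cell, with warning"
    SKIP = "skip input, no warning"
    SKIP_WARN = "skip input, with warning"


MISSING_CELL = None  # stand-in for a missing cell (assumption)


def react(policy, problem):
    """Return the output cells produced for one problematic input."""
    if policy is Reaction.FAIL:
        # Failing stops processing altogether.
        raise ValueError(problem)
    if policy in (Reaction.MISSING_WARN, Reaction.SKIP_WARN):
        warnings.warn(problem)
    if policy in (Reaction.MISSING, Reaction.MISSING_WARN):
        return [MISSING_CELL]
    # SKIP / SKIP_WARN: the input contributes no output row at all.
    return []


cells = react(Reaction.MISSING, "zero-atom molecule")
print(cells)
```

The same function can serve all three settings, since conversion errors, missing cells and zero-atom molecules differ only in how the problem is detected, not in the set of possible reactions.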

Input Ports

Input table with RDKit Molecules, which may contain disconnected fragment molecules.

Output Ports

Output table with extracted fragment molecules.

Views

This node has no views.
