Apply Transforms (RDKit) (Experimental)

This node transforms input structures according to the Matched Molecular Pair transforms supplied in the incoming transform table. If a transform matches more than one position in the molecule, it is applied to each position singly; combinations of multiple simultaneous transformations are not generated. The implementation is still experimental, and new features may be added in future versions.
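As a rough illustration of the "singly" behaviour described above, the following Python sketch uses a toy model in which a molecule is just a list of site labels and a transform swaps one label for another. The function name `apply_singly` and the list representation are assumptions made purely for illustration; the node itself works on real structures and rSMARTS transforms.

```python
def apply_singly(sites, old, new):
    """Return one variant per matching site, each with exactly one replacement.

    Toy model only: 'sites' is a list of labels, not a real molecule.
    """
    products = []
    for index, label in enumerate(sites):
        if label == old:
            variant = list(sites)   # copy so the input is left unchanged
            variant[index] = new    # change this one position only
            products.append(variant)
    return products


# Hypothetical input with three matching positions ("H") and one non-matching
sites = ["H", "F", "H", "H"]
products = apply_singly(sites, "H", "Cl")

for product in products:
    print(product)
```

Three matching positions give three products, each carrying a single replacement. Products with two or three replacements at once are never produced in this model, mirroring the statement that transformation combinations are not applied.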

The transforms can be filtered according to environment similarity. In that case, a transform is applied whenever any of its sets of fingerprints (there is more than one set if the transform occurs more than once) matches the criteria. Where the transform matches multiple positions in the molecule, only those positions which match the criteria are transformed.

The output table will always contain at least three columns: the transformed molecule, the incoming molecule and the rSMARTS transform applied to effect the transformation. Columns from the transform table and the molecule table can also be passed through. Transform table columns are grouped by transform into collection cells.
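The grouping of transform pass-through columns can be pictured with a short Python sketch. This is only a model of the idea, assuming rows are plain dictionaries; the helper `group_by_transform`, the column name `pair_id` and its values are hypothetical placeholders, and Python lists stand in for collection cells.

```python
from collections import defaultdict


def group_by_transform(rows, transform_key="transform"):
    """Collect every pass-through value into one list per transform and column."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        transform = row[transform_key]
        for column, value in row.items():
            if column != transform_key:
                grouped[transform][column].append(value)
    # Convert nested defaultdicts to plain dicts for easier inspection
    return {t: dict(cols) for t, cols in grouped.items()}


# Placeholder transform table: the first transform occurs twice
transform_rows = [
    {"transform": "[*:1]-!@OC>>[*:1]OC(F)(F)F", "pair_id": "A"},
    {"transform": "[*:1]-!@OC>>[*:1]OC(F)(F)F", "pair_id": "B"},
    {"transform": "[*:1]-!@C>>[*:1]C(F)(F)F", "pair_id": "C"},
]

collections = group_by_transform(transform_rows)
for transform, columns in collections.items():
    print(transform, columns)
```

In this model, a transform that occurs twice in the transform table ends up with a two-element list in each pass-through column, while a transform that occurs once gets a single-element list.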

NB - Different but overlapping transforms may transform a molecule into the same structure, and the node does not check for this scenario. For example, [*:1]-!@OC>>[*:1]OC(F)(F)F and [*:1]-!@C>>[*:1]C(F)(F)F will both transform PhOMe to PhOCF3.
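Because the node does not check for this, duplicates can be detected downstream. The sketch below is one possible approach in plain Python, assuming the output rows have been exported as (transformed molecule, incoming molecule, transform) tuples and that the SMILES strings are already canonical, so identical structures compare equal as strings. The function name `find_duplicate_products` is hypothetical.

```python
from collections import defaultdict


def find_duplicate_products(rows):
    """Map (incoming, product) pairs to the transforms that produced them,
    keeping only pairs reached by more than one transform."""
    by_pair = defaultdict(set)
    for product, incoming, transform in rows:
        by_pair[(incoming, product)].add(transform)
    return {pair: sorted(ts) for pair, ts in by_pair.items() if len(ts) > 1}


# The PhOMe -> PhOCF3 example from the text, written as SMILES
rows = [
    ("FC(F)(F)Oc1ccccc1", "COc1ccccc1", "[*:1]-!@OC>>[*:1]OC(F)(F)F"),
    ("FC(F)(F)Oc1ccccc1", "COc1ccccc1", "[*:1]-!@C>>[*:1]C(F)(F)F"),
]

duplicates = find_duplicate_products(rows)
for (incoming, product), transforms in duplicates.items():
    print(incoming, "->", product, "via", len(transforms), "transforms")
```

If the SMILES are not canonical, they would need to be canonicalised with a cheminformatics toolkit first; plain string comparison is only reliable under the assumption stated above.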

This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com

Options

Molecule Options

Select Molecule column
Select the column containing the incoming molecules to transform
Molecule pass-through columns
Any columns associated with the molecule to be kept in the output table. The Molecule column is always kept, regardless of its position here

Transform Options

Select transform column
The column containing the rSMARTS Transforms for each Matched Pair
Transforms are sorted
Checking this indicates that the transform table is pre-sorted by the rSMARTS column. Checking this option when the table is not sorted may result in a transform being applied multiple times.
Attempt to create enantiomeric products
Where a transform could generate a pair of enantiomeric products, should this be attempted? There is a time penalty, as each transform has to be applied 2^n times (where n is the number of attachment points in the transform), and there is no way of knowing in advance which positions need to be duplicated. For example, the transform '[*:1]-!@[H]>>[*:1]-[Cl]' when applied to the SMILES string 'N1CCCC1' can generate, amongst others, the enantiomeric pair 'N1[C@H]([Cl])CCC1' and 'N1[C@@H]([Cl])CCC1'. Without selecting this option, only 'N1C(Cl)CCC1' will be generated. NB - At present, double bond geometry is not created, and only the atom which replaces the attachment point is enantiomerically enumerated, due to toolkit limitations
Transform pass-through columns
Any columns associated with the transform to be kept in the output table. The transform column is always kept, regardless of the setting here
Filter by transform environment
Should the transform only be applied to environments matching the attachment point key fingerprints according to the criteria specified?
First Key Attachment point Fingerprint Column
The first attachment point column index for the transform reactant ('(L)')
Similarity metric settings
The similarity comparison type. For the asymmetric Tversky similarity, the similarity comparison is from the transform fingerprint to the molecule fingerprint
Threshold
The minimum similarity to allow
Alpha
The Tversky similarity α coefficient
Beta
The Tversky similarity β coefficient
AP Fingerprint Comparison Type
When there is more than one cut, this setting determines how the comparison will be performed: requiring all attachment point environments to match, any attachment point environment to match, or the overall concatenated environment to match
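The similarity and comparison-type settings above can be sketched in plain Python. This is a minimal model, not the node's code: fingerprints are represented as sets of on-bit indices, the Tversky index is taken as |A∩B| / (|A∩B| + α|A\B| + β|B\A|) with A as the transform fingerprint and B as the molecule fingerprint (one reading of "from the transform fingerprint to the molecule fingerprint"), and the threshold is treated as inclusive. The function names, the mode strings and the bit values are all hypothetical.

```python
def tversky(a, b, alpha, beta):
    """Tversky index of two bit sets; alpha = beta = 1 gives Tanimoto."""
    a, b = set(a), set(b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0


def environment_matches(transform_fps, molecule_fps, threshold, mode,
                        alpha=1.0, beta=1.0):
    """Compare per-attachment-point fingerprints using 'all', 'any' or
    'concatenated' mode (assumed names for the three comparison types)."""
    if mode == "concatenated":
        # Tag each bit with its attachment point index, then compare once
        a = {(i, bit) for i, fp in enumerate(transform_fps) for bit in fp}
        b = {(i, bit) for i, fp in enumerate(molecule_fps) for bit in fp}
        return tversky(a, b, alpha, beta) >= threshold
    passed = [tversky(t, m, alpha, beta) >= threshold
              for t, m in zip(transform_fps, molecule_fps)]
    if mode == "all":
        return all(passed)
    if mode == "any":
        return any(passed)
    raise ValueError(f"unknown mode: {mode}")


# Two cuts: the first environment is identical, the second shares no bits
transform_fps = [{1, 2, 3}, {4, 5}]
molecule_fps = [{1, 2, 3}, {9}]

results = {mode: environment_matches(transform_fps, molecule_fps, 0.5, mode)
           for mode in ("all", "any", "concatenated")}
print(results)  # {'all': False, 'any': True, 'concatenated': True}
```

With these placeholder fingerprints the per-cut similarities are 1.0 and 0.0, so "all" fails and "any" passes, while the concatenated comparison gives 3 / (3 + 2 + 1) = 0.5 and just meets the threshold. Setting α and β to different values makes the measure asymmetric, which is why the direction of comparison matters for Tversky.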

Input Ports

The table containing molecules to transform
The table containing transforms

Output Ports

Table containing transformed molecules. The table is ordered by transform then by input molecule

Views

Progress view
A view showing the overall node execution progress, along with the progress of individual current transforms. The tool tip for the transform ID shows the transform SMARTS
Enhanced progress view
A view showing the overall node execution progress, along with the progress of individual current transforms. The transforms are rendered using the default SMARTS cell renderer. Holding down the 'Shift' key whilst resizing the column preserves/restores the 2:1 aspect ratio
