
04_​Databased_​MMP_​Example

Workflow

MMP Databased Example with Fingerprint Databasing
This is a more complex Matched Molecular Pairs (MMP) example, in which we demonstrate storing fragmentations and transforms in a database, and adding new fragmentations and pairs from new molecules, as might be performed with routine updates to a registration or compound database. The Vernalis fingerprint nodes are used to regenerate fingerprint columns from their databased (string) representations.
Matched Molecular Pairs MMP MMPA Fingerprint Database
Example of Databasing Fragmentations and Matched Pairs

Firstly we set up two datasets based on the same data as the 'Simple MMP Example'. The first represents an existing compound set, and the second a set of new compounds (seeded with a few duplicates).

We use the multi-cut version of the node to perform 1-10 cuts on the 'existing' set first:

INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 Fragmentation SMIRKS: [#6+0;!$(*=,#[!#6]):1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*] (Upto 10 cuts)
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 Using 10 threads and 1000 queue items to parallel process...
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 Starting fragmentation at Tue Jul 11 17:20:55 BST 2017
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 Fragmentation completed at Tue Jul 11 17:22:01 BST 2017
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 2378 rows fragmented in 1mins 6.02s
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 97245 fragments produced
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 384 rows rejected
INFO MMP Molecule Multi-cut Fragment (RDKit) 0:5 MMP Molecule Multi-cut Fragment (RDKit) 0:5 End execute (1 min, 6 secs)

And we generate pair transforms. Fragments, transforms and failed rows are all databased for later recall (here we use an SQLite db in the same folder that contains the workflow - run this locally to create the DB file - it is quite big!).

In the top half of the workflow (all the nodes above the level of the 'SQLite Connector' node), we initialised a database containing fragmentations and transforms.

In the lower half, we simulate fragmenting some new molecules which have been 'salted' with some duplicate structures - we remove those first!

We read in the existing fragments table and restore the fingerprint columns from their string types using some new Vernalis fingerprint nodes (as of v1.12.0), and the SMILES cells back to their types.

We fragment as above, and then use two pair generation nodes:
• The 'Fragments to MMPs' node, as per the initialisation, to generate new pairs from within the table of new fragments
• The 'Reference Fragments to MMPs' node, configured with the same transform filters as the 'Fragments to MMPs' nodes, with the databased fragments table as the first 'Reference' table, and the new fragments as the second 'Query' table

We see that we generate no pairs from within the new compounds set, but an additional 38 transforms from the reference pair node.

REQUIRES v1.12.0 or higher of the Vernalis KNIME Extension

For demo purposes, creates an SQLite db in the /data folder in the workflow in the workspace

TO USE
1. Configure the SQLite Connector if required to a different location
2. Write the DB tables for the top rows by running the 3 Database Writer nodes annotated 'Write ...'
3. Now run the lower half of the workflow

Node annotations on the workflow canvas:
• Input tables: top - 'Original' Dataset, bottom - 'New Compounds' (some duplicates between tables)
• Only fragment each structure once
• Write fragmentations, Write Pairs, Write Failures
• Remove 'Knowns'
• Read previous fragments BEFORE writing new ones!
• Restore column types from db
• Pair generation: between new and old, within new fragments
• Append fragmentations, Append Pairs, Append Failures
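The update pattern described in this workflow (remove already-known molecules, read the stored fragments before appending, then pair new fragments both among themselves and against the stored reference set) can be sketched outside KNIME with Python's standard `sqlite3` module. This is only an illustrative sketch, not how the Vernalis nodes are implemented: the table layout (`mol_id`, `frag_key`, `frag_value`), the function names, and the toy fragment strings are all assumptions made for the example, and the real nodes additionally apply transform filters and store fingerprint columns.

```python
import sqlite3
from itertools import combinations

# Hypothetical schema: one row per fragmentation, with the unchanging part
# ("key") and the changing part ("value") of the molecule.
SCHEMA = "CREATE TABLE IF NOT EXISTS fragments (mol_id TEXT, frag_key TEXT, frag_value TEXT)"


def init_db(conn, fragments):
    """Create the fragments table and store the 'existing' fragmentations."""
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO fragments VALUES (?, ?, ?)", fragments)
    conn.commit()


def remove_knowns(conn, new_fragments):
    """Drop fragments of molecules that are already in the database."""
    known = {row[0] for row in conn.execute("SELECT DISTINCT mol_id FROM fragments")}
    return [f for f in new_fragments if f[0] not in known]


def pairs_within(fragments):
    """Pairs inside one table: same key, different molecule, different value."""
    return [
        (a[0], b[0], f"{a[2]}>>{b[2]}")
        for a, b in combinations(fragments, 2)
        if a[1] == b[1] and a[0] != b[0] and a[2] != b[2]
    ]


def pairs_between(reference, query):
    """Pairs between a reference table and a query table sharing a key."""
    by_key = {}
    for mol_id, key, value in reference:
        by_key.setdefault(key, []).append((mol_id, value))
    return [
        (r_id, q_id, f"{r_val}>>{q_val}")
        for q_id, key, q_val in query
        for r_id, r_val in by_key.get(key, [])
        if r_val != q_val
    ]


def update(conn, new_fragments):
    """Add new fragments and return only the pairs they newly create."""
    fresh = remove_knowns(conn, new_fragments)
    # Read the previous fragments BEFORE writing the new ones, so the
    # reference table does not already contain the query rows.
    reference = conn.execute(
        "SELECT mol_id, frag_key, frag_value FROM fragments"
    ).fetchall()
    new_pairs = pairs_within(fresh) + pairs_between(reference, fresh)
    conn.executemany("INSERT INTO fragments VALUES (?, ?, ?)", fresh)
    conn.commit()
    return new_pairs


# Toy, hand-written fragmentations (not produced by RDKit).
existing = [("M1", "[*]c1ccccc1", "[*]C"), ("M2", "[*]c1ccccc1", "[*]Cl")]
new = [
    ("M2", "[*]c1ccccc1", "[*]Cl"),  # duplicate of a known molecule
    ("M3", "[*]c1ccccc1", "[*]F"),
    ("M4", "[*]C1CCCCC1", "[*]O"),
]

conn = sqlite3.connect(":memory:")  # the workflow uses an SQLite file instead
init_db(conn, existing)
pairs = update(conn, new)
for left_id, right_id, transform in pairs:
    print(left_id, right_id, transform)
```

In this toy run the new molecules form no pairs among themselves, and every new pair comes from matching against the stored reference fragments, which mirrors the behaviour reported for the workflow. Reading the reference table before the append is what keeps the two pair sources disjoint.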

Download

Get this workflow from the following link: Download

Nodes

04_​Databased_​MMP_​Example consists of the following 44 node(s):

Plugins

04_​Databased_​MMP_​Example contains nodes provided by the following 6 plugin(s):