04_Databased_MMP_Example

MMP Databased Example with Fingerprint Databasing

This is a more complex Matched Molecular Pairs (MMP) example, in which we demonstrate storing fragmentations and transforms in a database, and adding new fragmentations and pairs from new molecules, as might be performed with routine updates to a registration or compound database.
The datasets are based on the same data as the 'Simple MMP Example'. The first represents an existing compound set, and the second a set of new compounds (seeded with a few duplicates).
We first use the multi-cut version of the fragment node to perform 1-10 cuts on the 'existing' set, and generate pair transforms. Fragments, transforms and failed rows are all databased for later recall. Here we use an SQLite database in the same folder that contains the workflow; run the workflow locally to create the DB file, which is quite big!
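As a rough illustration of the databasing step outside KNIME, the following Python sketch writes fragments, pairs and failed rows to three SQLite tables. The table names, column names and helper functions are assumptions made for this example; they are not the schema the workflow's DB Writer nodes actually produce.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the tables written by the workflow's DB Writer nodes.
SCHEMA = {
    "fragments": "mol_id TEXT, key_smiles TEXT, value_smiles TEXT, num_cuts INTEGER",
    "pairs": "left_id TEXT, right_id TEXT, transform TEXT",
    "failures": "mol_id TEXT, smiles TEXT",
}


def init_mmp_db(conn):
    """Create the three tables if they do not exist yet."""
    for table, columns in SCHEMA.items():
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    conn.commit()


def append_rows(conn, table, rows):
    """Append rows to one of the known tables; returns the number of rows written."""
    if table not in SCHEMA:
        raise ValueError(f"unknown table: {table}")
    rows = list(rows)
    if not rows:
        return 0
    placeholders = ", ".join("?" for _ in rows[0])
    with conn:  # commits on success, rolls back on error
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return len(rows)


def count_rows(conn, table):
    """Return the current row count of one of the known tables."""
    if table not in SCHEMA:
        raise ValueError(f"unknown table: {table}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Passing `sqlite3.connect(":memory:")` gives a throwaway database for experimenting; a file path gives a persistent DB file, as in the workflow.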

The Vernalis fingerprint nodes are used to regenerate fingerprint columns from their databased string representations.
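The idea of storing a fingerprint as a string and restoring it later can be sketched in a few lines of Python. This assumes, purely for illustration, that the stored form is a plain '0'/'1' bit string; the format the Vernalis nodes actually read and write may differ, and both function names are hypothetical.

```python
def fp_to_string(on_bits, n_bits):
    """Serialise a set of on-bit indices as a '0'/'1' string of length n_bits.

    Assumed storage format for this sketch only.
    """
    chars = ["0"] * n_bits
    for i in on_bits:
        if not 0 <= i < n_bits:
            raise ValueError(f"bit index {i} outside 0..{n_bits - 1}")
        chars[i] = "1"
    return "".join(chars)


def fp_from_string(bitstring):
    """Restore the set of on-bit indices from a '0'/'1' string."""
    return frozenset(i for i, c in enumerate(bitstring) if c == "1")
```

Round-tripping a fingerprint through `fp_to_string` and `fp_from_string` returns the original set of on bits, which is the property the restore step relies on.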

The first section of the workflow initialises a database containing fragmentations and transforms. The second section simulates fragmenting some new molecules which have been 'salted' with some duplicate structures; we remove those duplicates first!

In the second section, we read in the existing fragments table and restore the fingerprint columns from their string types using some new Vernalis fingerprint nodes (as of v1.12.0), and the SMILES cells back to their types. We fragment as above, and then use two pair generation nodes:

• The 'Fragments to MMPs' node, as per the initialisation, to generate new pairs from within the table of new fragments
• The 'Reference Fragments to MMPs' node, configured with the same transform filters as the 'Fragments to MMPs' nodes, with the databased fragments table as the first 'Reference' table, and the new fragments as the second 'Query' table

We see that we generate no pairs from within the new compounds set, but an additional 38 transforms from the reference pair node.

For demo purposes, the workflow creates an SQLite DB in the /data folder in the workflow in the workspace. Configure a different location if required.

TO USE
1. Configure the SQLite Connector if required to a different location
2. Write the DB tables for the top rows by running the 3 DB Writer nodes annotated 'Write ...'
3. Now run the lower half of the workflow

Database with fragmentations and transformations initialised.
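The difference between the two pair generation modes can be sketched in Python as follows. This is a simplified illustration and not the nodes' implementation: fragments are assumed to be (mol_id, key, value) tuples, a pair is any two fragments sharing a key but differing in value, transform filters are ignored, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairs_within(fragments):
    """Pair fragments with each other inside a single table."""
    by_key = defaultdict(list)
    for mol_id, key, value in fragments:
        by_key[key].append((mol_id, value))
    pairs = []
    for members in by_key.values():
        for (id_a, val_a), (id_b, val_b) in combinations(members, 2):
            if val_a != val_b:
                pairs.append((id_a, id_b, f"{val_a}>>{val_b}"))
    return pairs


def pairs_against_reference(reference, query):
    """Pair each query fragment only with reference fragments sharing its key."""
    ref_by_key = defaultdict(list)
    for mol_id, key, value in reference:
        ref_by_key[key].append((mol_id, value))
    pairs = []
    for q_id, key, q_val in query:
        # .get avoids creating empty entries for keys seen only in the query
        for r_id, r_val in ref_by_key.get(key, []):
            if r_val != q_val:
                pairs.append((r_id, q_id, f"{r_val}>>{q_val}"))
    return pairs
```

With a reference table holding one chlorobenzene fragment and a query table holding a fluorobenzene fragment plus an unrelated one, `pairs_within(query)` is empty while `pairs_against_reference(reference, query)` finds one pair, which mirrors the situation reported above where new pairs arise only against the databased fragments. The sketch emits each transform in the reference-to-query direction only.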
Workflow annotations:
• Not tested: I can't manage to write a SMILES object into the database (first DB Writer node)
• Only fragment each structure once
• Top: 'Original' dataset; bottom: 'New Compounds' (some duplicates between tables)
• Between new and old
• Within new fragments
• Write fragmentations, Write Pairs, Write Failures
• Read previous fragments BEFORE writing new ones!
• Append fragmentations, Append Pairs, Append Failures

Node labels: GroupBy, Speedy SMILES De-salt, MMP Molecule Multi-cut Fragment (RDKit), Column Filter, Fragments to MMPs, Reference Fragments to MMPs, Read sample data, Remove 'Knowns', Restore column types from db, Concatenate, SQLite Connector, DB Writer, DB Query Reader
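The ordering in the annotations (remove 'knowns', read previous fragments before writing new ones, then append) can be sketched as below. This is a simplification under stated assumptions: the `fragments` table layout and the `incremental_update` function are hypothetical, duplicates are detected by exact match on an already canonical parent SMILES string, and the filter is applied to fragment rows, whereas the workflow removes known structures before fragmenting.

```python
import sqlite3


def incremental_update(conn, new_fragments):
    """Append fragments of genuinely new molecules to a 'fragments' table.

    new_fragments: (mol_smiles, key_smiles, value_smiles) tuples.
    Returns (previous, fresh): the rows read before writing, and the rows appended.
    """
    # Read the previous fragments BEFORE writing new ones, so the reference
    # table does not already contain the rows about to be appended.
    previous = conn.execute(
        "SELECT mol_smiles, key_smiles, value_smiles FROM fragments"
    ).fetchall()

    # Remove 'knowns': drop rows whose parent structure is already databased.
    known = {row[0] for row in previous}
    fresh = [row for row in new_fragments if row[0] not in known]

    # Append only the fresh rows.
    with conn:
        conn.executemany("INSERT INTO fragments VALUES (?, ?, ?)", fresh)
    return previous, fresh
```

Returning `previous` separately keeps the reference set fixed for pair generation, even though the table itself has grown by the time the function returns.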
