This node implements the Hussain and Rea algorithm for finding
Matched Molecular Pairs in a dataset (See Ref. 1). The user can
of cuts to be made (1 - 10), and whether Hydrogens
A variety of fragmentation options are included:
- "All acyclic single bonds" - Any acyclic single bond between
any two atoms will be broken. This is the most exhaustive
approach, but can generate a large number of pairs (rSMARTS:
- "Only acyclic single bonds to rings" - Single acyclic bonds
between any atoms will be broken, as long as at least one atom is
in a ring (rSMARTS: [*;R:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]).
- "Only single bonds to a heteroatom" - Single acyclic bonds
between any two atoms, at least one of which is not Carbon, will be
broken. Included to mirror the C-X bond breaking chemistry prevalent
in modern drug discovery (e.g. SNAr, Reductive Aminations, Amide
formations etc. See Ref. 2) (rSMARTS:
- "Non-functional group single bonds" - This is the fragmentation
pattern used in the original Hussain/Rea paper (see footnote 24,
Ref. 1), and also used in the implementation (Ref. 3) (rSMARTS:
- "User defined" - The user needs to provide their own rSMARTS
fragmentation definition, following the guidelines below.
Guidelines for Custom rSMARTS Definition
- '>>' is required to separate reactants and products
- Products require '[*]' to occur twice, for the attachment
points (the node will handle the tagging of these)
- Reactants and products each require exactly two atom mappings,
e.g. ':1]' and ':2]' (other values can be used)
- The atom mappings must be two different values
- The same atom mappings must be used for reactants and products
rSMARTS not conforming to these guidelines will be rejected during
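
The guidelines above can be sketched as a simple text check. This is
only an illustration in plain Python, not the node's own validation
code: the function name check_rsmarts and the regular expression are
hypothetical, and the check looks only at the surface text of the
string rather than parsing it as SMARTS.

```python
import re

# Hypothetical helper: finds atom-map labels such as ":1]" in "[*;R:1]".
_MAP_RE = re.compile(r":(\d+)\]")


def check_rsmarts(rsmarts):
    """Return a list of guideline violations (empty if none were found)."""
    parts = rsmarts.split(">>")
    if len(parts) != 2:
        # Without a single '>>' the other checks cannot be applied.
        return ["'>>' must separate reactants and products exactly once"]
    reactants, products = parts
    problems = []

    # Products need two unlabelled '[*]' attachment points.
    if products.count("[*]") != 2:
        problems.append("products must contain '[*]' exactly twice")

    r_maps = _MAP_RE.findall(reactants)
    p_maps = _MAP_RE.findall(products)
    for side, maps in (("reactants", r_maps), ("products", p_maps)):
        if len(maps) != 2:
            problems.append(side + " need exactly two atom mappings")
        elif maps[0] == maps[1]:
            problems.append(side + " atom mappings must be two different values")

    # The same two labels must appear on both sides of '>>'.
    if set(r_maps) != set(p_maps):
        problems.append("reactants and products must use the same atom mappings")
    return problems


if __name__ == "__main__":
    # The "Only acyclic single bonds to rings" pattern quoted above.
    print(check_rsmarts("[*;R:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]"))
```

An empty list means no violation was found by this sketch; it does
not guarantee that the string is valid SMARTS or that the node will
accept it.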
Optionally, when only a single cut is made, or connectivity
tracking is enabled, context-fingerprints can be generated (one for
each attachment point). The fingerprints generated are RDKit Morgan
fingerprints, rooted at the attachment point(s) of the unchanging
The algorithm is implemented using the RDKit toolkit.
This node was developed by
For feedback and more information, please contact
1. J. Hussain and C. Rea, "Computationally efficient algorithm to
identify matched molecular pairs (MMPs) in large datasets",
J. Chem. Inf. Model., 339-348 (DOI:
2. S. D. Roughley and A. M. Jordan, "The Medicinal Chemist’s
Toolbox: An Analysis of Reactions Used in the Pursuit of Drug
Candidates", J. Med. Chem., 3451-3479 (DOI:
3. G. Landrum, "An Overview of RDKit" (section entitled 'mmpa')