MoSS

This node searches for frequent molecular fragments in a set of molecules. The algorithm used is Christian Borgelt's MoSS implementation.

Options

Input options

Column with molecule structure
Select the column in the input table that contains the molecule strings (SMILES, SLN or SDF).
Column with class labels
Select the column in the input table that contains the class labels (must be a string column). In the list below you can choose which values of the class column form the two groups (active or inactive) used during the search.
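As a rough illustration of the grouping step, the following Python sketch splits rows into the two groups by class label. The sample rows, the label values, and the rule that every label not chosen as active counts as inactive are assumptions made for this example; they are not taken from the node's implementation.

```python
# Hypothetical input rows: (molecule string, class label).
rows = [
    ("CCO", "high"),
    ("CCN", "high"),
    ("CCC", "low"),
    ("c1ccccc1O", "medium"),
]

# Assumption: the user marks these label values as the active group.
active_labels = {"high"}


def split_by_class(rows, active_labels):
    """Partition molecule strings into (active, inactive) lists by label."""
    active, inactive = [], []
    for molecule, label in rows:
        # Assumption: any label not selected as active is treated as inactive.
        (active if label in active_labels else inactive).append(molecule)
    return active, inactive


active, inactive = split_by_class(rows, active_labels)
print(len(active), "active,", len(inactive), "inactive")
```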

Mining options

Ignore pure carbon fragments
Prevents the search for fragments that consist only of carbon atoms.
Use ring mining
If enabled, rings of the specified sizes are treated as single entities, which makes the search much faster and also avoids finding fragments with partial rings.
Minimum focus support
Sets the minimum number of molecules, given as a fraction of the input molecules in the active class, in which a fragment must occur in order to be considered frequent and thus reported.
Maximum complement support
Sets the maximum number of molecules, given as a fraction of the input molecules in the inactive class, in which a fragment may occur and still be reported.
Minimum fragment size
The minimum size (number of bonds) a fragment must have in order to be reported.
Maximum fragment size
The maximum size (number of bonds) a fragment may have in order to be reported.
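The reporting criteria described by the mining options can be sketched as a simple filter. This Python example is only an illustration under stated assumptions: the `Fragment` record, its field names, the function `is_reported`, and the threshold values are hypothetical, and the supports are assumed to be precomputed counts of molecules that contain the fragment. It does not reproduce how MoSS performs the search itself.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    # Hypothetical record; field names are assumptions for this sketch.
    smiles: str
    bonds: int             # fragment size measured in bonds
    only_carbon: bool      # True if every atom is a carbon atom
    focus_count: int       # active molecules containing the fragment
    complement_count: int  # inactive molecules containing the fragment


def is_reported(frag, n_active, n_inactive, min_focus=0.1,
                max_complement=0.02, min_size=1, max_size=10,
                ignore_pure_carbon=True):
    """Return True if the fragment passes all reporting criteria."""
    # Supports as fractions of the respective class sizes.
    focus = frag.focus_count / n_active if n_active else 0.0
    complement = frag.complement_count / n_inactive if n_inactive else 0.0
    if ignore_pure_carbon and frag.only_carbon:
        return False
    return (focus >= min_focus
            and complement <= max_complement
            and min_size <= frag.bonds <= max_size)


# Made-up candidate fragments for 100 active and 100 inactive molecules.
candidates = [
    Fragment("C-N", 1, False, 30, 1),   # frequent in active, rare in inactive
    Fragment("C-C", 1, True, 90, 80),   # pure carbon, ignored
    Fragment("C-O", 1, False, 5, 0),    # too rare in the active class
]
reported = [f.smiles for f in candidates if is_reported(f, 100, 100)]
print(reported)
```

With these made-up counts, only the first candidate passes every check: the second is dropped as a pure carbon fragment and the third falls below the minimum focus support.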

MoSS options

Start with core
Provide a SMILES string that is used as a seed for starting the search. All fragments found will contain this core.
Canonical Form pruning
Turns canonical form pruning on or off.
Equivalent fragment pruning
Turns equivalent sibling pruning on or off.
Perfect extension pruning
Turns perfect extension pruning on or off.
Find chains of variable lengths
Enables the search for fragments that are equal except for the length of their carbon chains.
Fragment extension mode
Choose between the two possible ways to extend a fragment during the search.
Maximum embeddings used
Sets the upper limit on the number of extensions that are stored in memory.

Input Ports

Data table with molecules

Output Ports

Data table with frequent fragments, sorted by support and size.

Views

This node has no views.
