Activity Miner

Activity Miner is a tool for generating a similarity or disparity matrix from a set of aligned molecules. The similarity matrix can be generated using either Cresset's 3D molecular fields or 2D fingerprints.

The disparity between a pair of molecules is calculated as the difference in their activity divided by the distance between them. In Activity Miner the distance between a pair of molecules is calculated from their 3D or 2D similarity.
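The calculation above can be sketched in a few lines of Python. This is a minimal illustration and not the aminer implementation: the function name, the use of the absolute activity difference, and the small floor on the distance (to avoid dividing by zero for identical molecules) are all assumptions. The distance is taken as 1 - similarity, as described under the 'Matrix Type' option.

```python
def disparity_matrix(activities, similarity, min_distance=1e-6):
    """Return a pairwise disparity matrix as a list of lists.

    activities  -- activity values in log units, one per molecule
    similarity  -- square matrix of pairwise similarities in [0, 1]
    min_distance -- assumed floor on the distance, not a documented aminer setting
    """
    n = len(activities)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a molecule has no disparity with itself
            # distance = 1 - similarity, floored to avoid division by zero
            distance = max(1.0 - similarity[i][j], min_distance)
            # assumption: the absolute difference in activity is used
            matrix[i][j] = abs(activities[i] - activities[j]) / distance
    return matrix


# Hypothetical data: three molecules with pIC50-like activities.
activities = [6.0, 7.5, 6.2]
similarity = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.6],
    [0.5, 0.6, 1.0],
]
disparity = disparity_matrix(activities, similarity)
```

In this made-up example the first two molecules are very similar (0.9) but differ by 1.5 log units, so their disparity is about 15, which is the kind of pair that would stand out as an activity cliff.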

The molecules must be pre-aligned if you are using Cresset's molecular fields - the falign program or Forge Align node are ideal for this.

This node wraps the Forge Activity Miner executable 'aminer', which must be installed with a valid license for this node to work. If this is installed in the default location on Windows, then it should be found automatically. Otherwise, you must either set the 'Cresset Home' preference setting or the CRESSET_HOME environment variable to the base Cresset software install directory. You may also set the 'aminer Path' preference setting or the CRESSET_ACTIVITYMINER_EXE environment variable to point directly at the executable itself.
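As a rough illustration of the two environment variables named above, the following Python sketch checks which of them is set. The function name is hypothetical, and the order of precedence (the direct executable path first, then the install directory) is an assumption; the node's actual lookup order is not stated here.

```python
import os


def locate_aminer(env=None):
    """Report which Cresset location setting is available, if any.

    Returns ("executable", path), ("home", path), or None.
    The precedence shown is an assumption, not documented behaviour.
    """
    env = os.environ if env is None else env
    exe = env.get("CRESSET_ACTIVITYMINER_EXE")
    if exe:
        return ("executable", exe)  # points directly at aminer
    home = env.get("CRESSET_HOME")
    if home:
        return ("home", home)  # base Cresset install directory
    return None  # fall back to the default location or the preferences


# Example with a placeholder path; real install locations will differ.
setting = locate_aminer({"CRESSET_HOME": "/opt/cresset"})
```

A check like this can be useful before launching a workflow on a new machine, to confirm that at least one of the variables is visible to the process.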

The Activity Miner node can be configured to use additional resources to perform calculations. Using the Cresset Engine Broker drastically reduces the time the node takes to run. To use this facility, set either the 'Cresset Engine Broker' preference or the CRESSET_BROKER environment variable to point to the location of your local Engine Broker. If you do not currently have the Cresset Engine Broker, contact Cresset (enquiries@cresset-group.com) for pricing on local and cloud-based brokers.

For more information visit www.cresset-group.com or contact us at support@cresset-group.com.

Options

Basic

Column containing the aligned molecules for Activity Miner
The column in the input datatable which contains the molecules to use when generating the similarity or disparity matrix.
Similarity matrix method
The method used to calculate the similarity between the molecules.
  • field - Cresset's 3D field/shape similarity, molecules must be pre-aligned
  • ECFP4 - 2D similarity based on Extended-Connectivity Fingerprints with a radius of 2
  • ECFP6 - 2D similarity based on Extended-Connectivity Fingerprints with a radius of 3
  • FCFP4 - 2D similarity based on Circular Pharmacophore Fingerprints with a radius of 2
  • FCFP6 - 2D similarity based on Circular Pharmacophore Fingerprints with a radius of 3
Matrix Type
Defines the type of matrix that should be generated.
  • Disparity Matrix - The matrix contains the disparity values for each pair of input molecules. The disparity values are calculated by dividing the difference in activity between two molecules by the distance between them. The distance is calculated as '1 – similarity', using the similarity metric set by the 'Similarity matrix method' option. Calculating the disparity matrix requires the input molecules to have activity values.
  • Distance Matrix - The matrix contains the distance between each pair of input molecules. The distance is calculated as '1 – similarity', using the similarity metric set by the 'Similarity matrix method' option.
  • Similarity Matrix - The matrix contains the similarity values for each pair of input molecules, using the method set by the 'Similarity matrix method' option.
Activity column
The name of the column which contains the activity data to use when calculating the disparity.
Units for the input activity values
Specify whether the input activity values require transforming and give their units.
Average Error of Activity
Sets the average error of activity for the dataset, in log units. For example, specifying 0.3 means that activity differences of less than 0.3 units are not significant.
Assign formal charges to input molecules
If checked, the protonation states for the input molecules are set using Cresset's charging rules. Acids will be deprotonated, primary amines protonated, etc.
Shape weight
The relative weight assigned to shape (as opposed to field) similarity. Values must be between 0.0 (all field) and 1.0 (all shape).
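To make the effect of the shape weight concrete, here is a minimal Python sketch that blends a field score and a shape score. The linear blend, the function name, and the default of 0.5 are assumptions made for illustration; the exact formula used by aminer is not given here.

```python
def combined_similarity(field_sim, shape_sim, shape_weight=0.5):
    """Blend field and shape similarity using a weight in [0.0, 1.0].

    Assumes a simple linear combination: 0.0 gives the field score only,
    1.0 gives the shape score only.
    """
    if not 0.0 <= shape_weight <= 1.0:
        raise ValueError("shape_weight must be between 0.0 and 1.0")
    return shape_weight * shape_sim + (1.0 - shape_weight) * field_sim


# Hypothetical scores for one pair of aligned molecules.
all_field = combined_similarity(0.8, 0.6, shape_weight=0.0)
all_shape = combined_similarity(0.8, 0.6, shape_weight=1.0)
balanced = combined_similarity(0.8, 0.6, shape_weight=0.5)
```

Under this assumed blend, raising the weight moves the result from the field score towards the shape score, so molecules with similar shape but different electrostatics would appear closer together.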

Input Ports

A set of molecules. The molecules must be pre-aligned to use Cresset's field similarity - the falign program or the Forge Align node are ideal for this. You do not need to pre-align molecules if you are using a 2D similarity metric. To calculate the disparity matrix, the input molecules must have activity data.

Output Ports

The input table with an additional column called "Distance Matrix" which contains the similarity or disparity matrix for the input molecules.
The Forge project containing the matrix. The 'Forge Project Viewer' node may be used to view the matrix using the Activity Miner user interface.

Views

This node has no views
