
Clustering_and_RGD

03_Clustering

This exercise shows how to perform hierarchical clustering on molecular fingerprints and how to build an interactive view for picking interesting clusters.
The chemical structures are taken from this publication: https://doi.org/10.1021/acs.jmedchem.9b01658
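
The clustering step can be illustrated outside KNIME with a small, dependency-free sketch. The workflow's node labels mention Tanimoto distances and average linkage, so the sketch uses both; everything else is an assumption made for illustration: fingerprints are modelled as plain sets of on-bit indices, the toy data is invented, and the function names `tanimoto_distance` and `average_linkage_clusters` are hypothetical. The workflow itself computes fingerprints and clusters with RDKit and KNIME nodes, not with this code.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def average_linkage_clusters(fps, threshold):
    """Agglomerative clustering with average linkage.

    Clusters are merged, closest pair first, until the closest pair is
    farther apart than `threshold`. Returns sorted lists of row indices.
    """
    # Pairwise distance matrix, stored once per unordered pair.
    dist = {
        (i, j): tanimoto_distance(fps[i], fps[j])
        for i, j in combinations(range(len(fps)), 2)
    }

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(fps))]
    while len(clusters) > 1:
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if d > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy fingerprints (invented): two similar groups and one singleton.
toy_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 5},
    {10, 11, 12},
    {10, 11, 13},
    {20, 21},
]
print(average_linkage_clusters(toy_fps, threshold=0.6))
```

Lowering the threshold yields more, tighter clusters, which is the trade-off being tuned when a cluster threshold is set interactively.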





[Workflow diagram]
Stages: Data; Cluster; Pick clusters and cores; Do RGD; Enumerate compounds based on RGD.
Node types: File Reader; RDKit Fingerprint; Bit Vector Distances; Hierarchical Clustering (DistMatrix); GroupBy; Joiner; Cross Joiner; Row Filter; Column Filter; Duplicate Row Filter; Reference Row Filter; Table Row to Variable; RDKit R-Group Decomposition; RDKit Generate Coords; RDKit Canon SMILES; Python Script (1⇒1).
Node annotations: Tanimoto; Average linkage; Set cluster threshold; Pick interesting clusters; Remove small clusters; find cluster MCS; Highlight cluster cores; select one cluster to work with; Pick final clusters; Redo core highlighting; Refine the cores using code in a Jupyter notebook; Generate aligned coordinates; Group and select; Create at most 100K possible sidechain combos; Add scaffold; attach the sidechains using code in a Jupyter notebook; Uniquify on SMILES; Deduplicate and add images; Remove rows we've already seen; FOR DEMO: limit row count.
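
The enumeration stage named in the diagram (create at most 100K sidechain combinations, uniquify, remove rows already seen) can be sketched in plain Python. This is only an outline of the combination and filtering logic under stated assumptions: the sidechain strings are invented examples, `enumerate_combos` is a hypothetical helper, and already-seen compounds are represented as tuples of sidechains instead of canonical product SMILES. Attaching sidechains to a scaffold and canonicalising the products are done with RDKit in the workflow and are not reproduced here.

```python
from itertools import islice, product


def enumerate_combos(rgroups, seen, limit=100_000):
    """Enumerate sidechain combinations that have not been seen before.

    rgroups: dict mapping an R-group label to a list of sidechain strings.
    seen:    set of tuples (one sidechain per label, labels in sorted order).
    limit:   maximum number of combinations to examine.
    """
    labels = sorted(rgroups)
    # Drop duplicate sidechains per position while keeping their order.
    unique_lists = [list(dict.fromkeys(rgroups[label])) for label in labels]
    new_rows = []
    for combo in islice(product(*unique_lists), limit):
        if combo not in seen:
            new_rows.append(dict(zip(labels, combo)))
    return new_rows


# Invented example data: two R-group positions with a duplicate in R1.
toy_rgroups = {
    "R1": ["[*:1]C", "[*:1]Cl", "[*:1]C"],
    "R2": ["[*:2]O", "[*:2]N"],
}
toy_seen = {("[*:1]C", "[*:2]O"), ("[*:1]Cl", "[*:2]N")}

for row in enumerate_combos(toy_rgroups, toy_seen):
    print(row)
```

Note that the cap applies to the number of combinations examined, so a small `limit` can return fewer new rows than exist.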
