
05_Speedy_SMILES_ChEMBL_Preprocessing_Benchmarking

Workflow

Speedy SMILES ChEMBL Preprocessing and Benchmarking
This workflow demonstrates the use of a variety of the Vernalis Speedy SMILES nodes to preprocess the ChEMBL21 database, and of the Vernalis benchmarking nodes to assess timing.
Vernalis, Speedy SMILES, Benchmarking, Streaming mode, SMILES, ChEMBL
Vernalis "Speedy SMILES" nodes for fast toolkit-independent molecule preprocessing

This workflow demonstrates the use of the Vernalis Speedy SMILES nodes to pre-process and filter a table of molecules from ChEMBL prior to subsequent processing in a chemical toolkit.

The processing is carried out within a wrapped metanode to enable streaming execution, which results in considerable time savings. A Vernalis Benchmark Start/End loop surrounds the wrapped metanode to demonstrate the benchmarking functionality and to allow the user to investigate the effect of using streaming or non-streaming execution. The metanode contains the same nodes and annotations as those within the lower Benchmark loop, which are not streamed.

The top port of the wrapped metanode contains the 'kept', desalted molecules; the bottom port contains the rejected molecules and the reason for rejection.

The example contains 5000 compounds from ChEMBL21. The full set can be downloaded from ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_21/chembl_21_chemreps.txt.gz and unzipped prior to reading in with the File Reader node.

Workflow annotations: De-salt; Filter by charge properties; Filter by atom count properties; RUNNING IN STREAMING MODE.

Node labels: Missing SMILES; Net Charge > 0; Gross Charge > 4; Ring Count > 0; 8 <= HAC <= 45; C count > 1; N+O > 1; N+O; Short local dataset (5000 compounds from ChEMBL21); Count removal reasons.

Node types: Molecule Type Cast, Row Splitter, Constant Value Column, Speedy SMILES De-salt, Speedy SMILES Remove Broken Bonds, Speedy SMILES Charge Count, Speedy SMILES Heavy Atom Count (HAC), Speedy SMILES Element Count (C, N, O), Speedy SMILES Ring Count, Math Formula, Concatenate (Optional in), Concatenate, Preprocess SMILES, Benchmark Start, Benchmark End (2 ports), File Reader, Value Counter.
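For readers who want to see the filter cascade outside KNIME, the following is a minimal Python sketch of the same idea: deriving counts directly from the SMILES string, without a chemical toolkit, and splitting rows into kept and rejected with a reason. It is not the Vernalis implementation. The function names (`describe`, `desalt`, `classify`), the reason strings, the "keep the largest component" desalting rule, and the reading of each node label as a keep or reject condition are all assumptions made for illustration. The Remove Broken Bonds step is omitted, and the tokenizer only handles common SMILES syntax.

```python
import re

# Atom or ring-closure tokens; bond and branch symbols are simply skipped.
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFIbcnops]|%\d{2}|\d")
# Bracket atom: optional isotope, element symbol, then everything else.
BRACKET = re.compile(r"\[\d*([A-Za-z][a-z]?|\*)([^\]]*)\]")
# Charge written as +, ++, -, +2, -3, ...
CHARGE = re.compile(r"(\++|-+)(\d*)")


def describe(smiles):
    """Return counts derived from the SMILES text alone (hypothetical helper)."""
    d = {"hac": 0, "C": 0, "N": 0, "O": 0, "net": 0, "gross": 0}
    closures = 0
    for tok in TOKEN.findall(smiles):
        if tok[0] == "%" or tok.isdigit():
            closures += 1  # each ring bond contributes two closure labels
            continue
        symbol = tok
        if tok[0] == "[":
            atom = BRACKET.fullmatch(tok)
            if atom is None:
                continue  # unparseable bracket atom: ignored in this sketch
            symbol = atom.group(1)
            charge = CHARGE.search(atom.group(2))
            if charge:
                signs, digits = charge.groups()
                size = int(digits) if digits else len(signs)
                d["net"] += size if signs[0] == "+" else -size
                d["gross"] += size
        element = symbol.capitalize()  # aromatic 'c' counts as 'C'
        if element != "H":
            d["hac"] += 1
        if element in ("C", "N", "O"):
            d[element] += 1
    d["rings"] = closures // 2
    return d


def desalt(smiles):
    """Keep the component with the most heavy atoms (assumed desalting rule)."""
    return max(smiles.split("."), key=lambda part: describe(part)["hac"])


def classify(smiles):
    """Return (kept_smiles, None) or (None, reason), mirroring the two ports."""
    if not smiles:
        return None, "Missing SMILES"
    parent = desalt(smiles)
    d = describe(parent)
    checks = [  # thresholds taken from the node labels; direction is assumed
        (d["net"] > 0, "Net charge > 0"),
        (d["gross"] > 4, "Gross charge > 4"),
        (d["rings"] == 0, "No rings"),
        (not 8 <= d["hac"] <= 45, "HAC outside 8-45"),
        (d["C"] <= 1, "C count <= 1"),
        (d["N"] + d["O"] <= 1, "N+O <= 1"),
    ]
    for failed, reason in checks:
        if failed:
            return None, reason
    return parent, None
```

Calling `classify` on each row and tallying the returned reasons would play roughly the role of the Value Counter node labelled "Count removal reasons".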

Download

Get this workflow from the following link: Download

Nodes

05_Speedy_SMILES_ChEMBL_Preprocessing_Benchmarking consists of the following 60 node(s):

Plugins

05_Speedy_SMILES_ChEMBL_Preprocessing_Benchmarking contains nodes provided by the following 5 plugin(s):