05_Speedy_SMILES_ChEMBL_Preprocessing_Benchmarking

Speedy SMILES ChEMBL Preprocessing and Benchmarking

This workflow demonstrates the use of the Vernalis Speedy SMILES nodes to pre-process and filter a table of molecules from ChEMBL prior to subsequent processing in a chemical toolkit. The Vernalis benchmarking nodes are used to assess timing.

The processing is carried out within a component to enable streaming execution, which results in considerable time savings. A Vernalis Benchmark Start/End loop surrounds the component to demonstrate the benchmarking functionality and to let the user compare streaming and non-streaming execution. The component contains the same nodes and annotations as the lower Benchmark loop, which is not streamed.
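Outside KNIME, the idea behind the Benchmark Start/End loop (run the same work several times and record each elapsed time) can be sketched in Python. This is only an illustration: the function name `benchmark` and its `iterations` parameter are hypothetical and do not come from the Vernalis nodes.

```python
import time


def benchmark(func, iterations=3):
    """Call func() `iterations` times; return elapsed wall-clock seconds per run."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()  # the work being timed (hypothetical stand-in for the component)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder workload; replace with the real preprocessing call.
    runs = benchmark(lambda: sum(range(100_000)), iterations=5)
    print(f"best: {min(runs):.4f}s, mean: {sum(runs) / len(runs):.4f}s")
```

Timing the same callable under two configurations (for example, a list-based and a generator-based pipeline) gives a rough analogue of comparing non-streaming and streaming execution.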

The top output port of the component contains the 'kept', desalted molecules; the bottom output port contains the rejected molecules and the reason for rejection.
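The two-port pattern (kept rows in one output, rejected rows tagged with a reason in the other) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the Speedy SMILES implementation: `desalt` simply keeps the longest dot-separated SMILES fragment, and the `checks` list with its labels is hypothetical.

```python
def desalt(smiles):
    """Crude stand-in for de-salting: keep the longest dot-separated fragment."""
    return max(smiles.split("."), key=len)


def split_molecules(rows, checks):
    """Return (kept, rejected); rejected holds (smiles, reason) pairs.

    `checks` is a list of (reason_label, predicate) pairs. A molecule is
    rejected with the label of the first predicate that returns False.
    """
    kept, rejected = [], []
    for smiles in rows:
        if not smiles:
            rejected.append((smiles, "Missing SMILES"))
            continue
        parent = desalt(smiles)
        reason = next((label for label, ok in checks if not ok(parent)), None)
        if reason is None:
            kept.append(parent)
        else:
            rejected.append((parent, reason))
    return kept, rejected


if __name__ == "__main__":
    # Hypothetical check, for illustration only (string length, not chemistry).
    checks = [("Fewer than 2 characters", lambda s: len(s) >= 2)]
    kept, rejected = split_molecules(["CCO.Cl", None, "C"], checks)
    print(kept)
    print(rejected)
```

Recording only the first failing check mirrors a cascade of splitters in which each rejected row leaves the pipeline at the step that removed it.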

The example contains 5000 compounds from ChEMBL21. The full set can be downloaded from the linked source (ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_21/chembl_21_chemreps.txt.gz). Please note that the downloaded file is gzip-compressed and must be decompressed before it is read with the File Reader node.
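If no decompression tool is at hand, the downloaded .gz file can be unpacked with the Python standard library. The sketch below assumes the file has already been downloaded to the working directory; the helper name `gunzip` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest=None):
    """Decompress a .gz file; by default write next to it without the .gz suffix."""
    src = Path(src)
    dest = Path(dest) if dest else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # copies in chunks, so large files are fine
    return dest


if __name__ == "__main__":
    # Assumes the file was downloaded to the current directory beforehand.
    print(gunzip("chembl_21_chemreps.txt.gz"))
```

The resulting chembl_21_chemreps.txt can then be selected in the File Reader node.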

Workflow annotation: Execute the workflow to perform preprocessing on the ChEMBL21 database. In the upper section, execution in streaming mode is also available.

Section labels: Running in streaming mode; De-salt; Filter by charge properties; Filter by atom count properties; Count removal reasons; Short local dataset (5000 compounds from ChEMBL21).

Filter and column labels: Missing SMILES; Net Charge > 0; Gross Charge > 4; Ring Count > 0; 8 <= HAC <= 45; C count > 1; N+O > 1; N+O.

Nodes in the workflow: File Reader; Molecule Type Cast; Row Splitter; Constant Value Column; Speedy SMILES De-salt; Speedy SMILES Remove Broken Bonds; Speedy SMILES Charge Count; Speedy SMILES Heavy Atom Count (HAC); Speedy SMILES Element Count (C, N, O); Speedy SMILES Ring Count; Math Formula; Concatenate; Value Counter; Benchmark Start; Benchmark End (2 ports); Preprocess SMILES (component).