
20220718 Pikairos How to Optimize Parallelized Substructure Filtering

20220718 Pikairos How to Optimize RDKit Parallelized Substructure Filtering
Workflow diagram (annotations and node labels recovered from the canvas):

Problem A) Very long list, one query.
Problem B) A bit smaller but still long list, now with a few hundred queries to loop through additionally.
Problem B) This time handled using the "RDKit Molecule Substructure Filter" node.

Annotations in the diagram:
- Your Molecules
- Your Queries
- Number of rows to be created from each row
- Queries Loop: Loop Start / Loop End
- Serial Chunk Loop
- Generic Loop End to close Serial Chunk Loop Start
- Parallelism is handled here and should not be handled / duplicated elsewhere, to avoid any parallelism conflict
- Just to have an idea of CPU time when running on a PC with 6 cores & 128 gigabytes

Node labels in the diagram: Table Creator, Molecule Type Cast, Constant Value Column, One Row to Many, Table Row to Variable Loop Start, Loop End, Chunk Loop Start, RDKit Substructure Filter, RDKit Molecule Substructure Filter, Timer Info
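The pattern named in the workflow annotations (a serial chunk loop wrapped around a filter step that is the only place where parallelism happens) can be sketched outside KNIME as follows. This is a minimal illustration, not the workflow's actual implementation. All function names are hypothetical, `placeholder_match` is a plain text test standing in for real substructure matching, the thread pool stands in for whatever parallelism the filter node uses internally, and keeping a row when it matches any query is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows (the serial chunk loop)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def placeholder_match(molecule, query):
    # Stand-in only: substring test on text, NOT real substructure matching.
    return query in molecule


def filter_chunk(chunk, queries, match, workers):
    """Keep rows matching at least one query; the only parallel step."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(lambda m: any(match(m, q) for q in queries), chunk))
    return [m for m, keep in zip(chunk, flags) if keep]


def filter_in_chunks(molecules, queries, chunk_size=2, workers=4,
                     match=placeholder_match):
    kept = []
    # The outer loop is deliberately serial, so parallelism is not nested.
    for chunk in chunks(molecules, chunk_size):
        kept.extend(filter_chunk(chunk, queries, match, workers))
    return kept


# Toy rows, treated as plain text by the placeholder matcher.
molecules = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "c1ccccc1O"]
queries = ["c1ccccc1", "C(=O)O"]
result = filter_in_chunks(molecules, queries)
```

Keeping the outer loop serial means the worker pool is created in exactly one place, which mirrors the annotation's warning against handling parallelism in more than one spot.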
