Data curation "NIS"

Structure standardization and normalization
Cleaned datasets
Remove duplicates
Salts, solvents, mixtures, metallo-organics
All 4 datasets
Excel Reader
convert SMILES string to molecular structure
Molecule Type Cast
Remove solvents
RDKit Salt Stripper
Normalize the molecules
RDKit Structure Normalizer
Substructure Matcher
GroupBy
RDKit Remove Hs
Table Creator
negative duplicates with no filtering
Duplicate Row Filter
All datasets cleaned
Sorter
Column Resorter
Column Filter
Define solvents
Table Creator
String Manipulation
negative only
Row Filter
convert structure to RDKit molecule
RDKit From Molecule
Table Creator
positive only
Row Filter
positive duplicates
Row Filter
all clean dataset (unique+active+inactive)
Concatenate
duplicated molecules
Row Filter
unique + clean active
Concatenate
Joiner
clean negative free from repetition
Reference Row Filter
All datasets excel
Excel Writer
unique molecules
Row Filter
clean positive free from repetition
Reference Row Filter
RDKit Canon SMILES
remove salts
RDKit Salt Stripper
Reference Row Filter
remove inorganic and metals
Substructure Matcher
positive duplicates filtered
Duplicate Row Filter
Table Creator
negative duplicates
Row Filter
Valence Checker
remove positives (+) also present in the negative (-) dataset
Reference Row Filter
RDKit From Molecule
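
The duplicate-handling branch of the workflow (Row Filter for positive only / negative only, Duplicate Row Filter, Reference Row Filter, Concatenate) can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual configuration: the function name `curate`, the `(canonical_smiles, label)` record layout, and the label strings "positive" and "negative" are hypothetical, and the input is assumed to already hold canonical SMILES (the job of RDKit Canon SMILES upstream). It also assumes that a structure labelled both positive and negative is removed from the positive set only, as the annotation suggests.

```python
def curate(records):
    """Split (canonical_smiles, label) records into clean positive and
    negative lists, with duplicates removed inside each class.

    Assumption: labels are the strings "positive" and "negative".
    """
    positives, negatives = [], []
    seen_pos, seen_neg = set(), set()

    # Row Filter + Duplicate Row Filter: keep the first occurrence per class.
    for smiles, label in records:
        if label == "positive":
            if smiles not in seen_pos:
                seen_pos.add(smiles)
                positives.append(smiles)
        elif label == "negative":
            if smiles not in seen_neg:
                seen_neg.add(smiles)
                negatives.append(smiles)

    # Reference Row Filter: drop positives that also occur as negatives.
    conflicts = seen_pos & seen_neg
    clean_pos = [s for s in positives if s not in conflicts]

    return clean_pos, negatives, sorted(conflicts)


if __name__ == "__main__":
    demo = [
        ("CCO", "positive"),
        ("CCO", "positive"),        # duplicate within the positive class
        ("c1ccccc1", "negative"),
        ("CCN", "positive"),
        ("CCN", "negative"),        # same structure with conflicting labels
    ]
    clean_pos, clean_neg, conflicts = curate(demo)
    # Concatenate: the combined clean dataset
    print(clean_pos + clean_neg, conflicts)
```

Keeping the first occurrence of each structure is one possible duplicate policy; whether the negative copy of a conflicting structure should also be discarded is a curation decision this sketch leaves open.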
