
01_Data Access and Transformation

This workflow fetches data from a database and preprocesses it so that the data can be used for machine learning in the next exercise. The preprocessing includes creating the target column ("activity"), which contains the classification, and generating 5 different types of fingerprints.

Step 1. Connect to a database
Step 2. Preprocess the data in the database
Step 3. Read the data into a KNIME table
Step 4. Transform the string into chemical structure format
Step 5. Calculate the chemical fingerprints
Step 6. Save

Workflow annotations:
assay data
compound data
filter for the target (serotonin 1a receptor)
pivot the data to have assay data in columns and compounds in rows
add the SMILES and compound ChEMBL IDs using a Joiner
filter out molregno
read the data in
remove the target from the column names resulting from the pivoting
create activity column (target column for machine learning)
fingerprint types: ECFP4, ECFP6, ECFC6, AtomPair, RDKit

Nodes used: SQLite Connector, DB Table Selector (2), DB Row Filter, DB Pivot, DB Joiner, DB Column Filter, DB Reader, Column Rename (Regex), String To Number, Rule Engine, Column Resorter, Molecule Type Cast, RDKit Canon SMILES, RDKit Salt Stripper, Duplicate Row Filter, RDKit Fingerprint (4), RDKit Count-Based Fingerprint, Renderer to Image, Table Writer
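For readers who want to see the database-side steps outside of KNIME, the following is a minimal Python sketch of Steps 1 to 3 plus the activity column, using only the standard-library sqlite3 module. It is an analogue of the DB nodes, not the workflow's actual configuration. The table names (assay_data, compound_data), the pchembl column, the demo rows, the averaging used in the pivot, the "active"/"inactive" labels, and the 6.0 cutoff are all assumptions made for illustration. Only the target name, the molregno and SMILES columns, and the ChEMBL IDs are mentioned in the workflow.

```python
import re
import sqlite3

TARGET = "Serotonin 1a receptor"  # target named in the workflow annotations
ACTIVE_THRESHOLD = 6.0            # hypothetical cutoff, not taken from the workflow


def build_demo_db():
    """Step 1: connect to a database (here an in-memory SQLite demo)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and rows, for illustration only.
    conn.executescript("""
        CREATE TABLE assay_data (molregno INTEGER, target TEXT, pchembl REAL);
        CREATE TABLE compound_data (molregno INTEGER, chembl_id TEXT, smiles TEXT);
        INSERT INTO assay_data VALUES
            (1, 'Serotonin 1a receptor', 7.2),
            (1, 'Serotonin 1a receptor', 6.8),
            (2, 'Serotonin 1a receptor', 5.1),
            (3, 'Dopamine D2 receptor', 8.0);
        INSERT INTO compound_data VALUES
            (1, 'CHEMBL_A', 'CCO'),
            (2, 'CHEMBL_B', 'c1ccccc1'),
            (3, 'CHEMBL_C', 'CCN');
    """)
    return conn


def fetch_rows(conn):
    """Steps 2 and 3: filter, pivot, join in the database, then read the result."""
    # The pivoted column name carries the target as a prefix (assumed format).
    pivot_col = f"{TARGET}+pchembl"
    sql = f'''
        SELECT c.chembl_id AS chembl_id,
               c.smiles AS smiles,
               p."{pivot_col}" AS "{pivot_col}"
        FROM (
            SELECT molregno, AVG(pchembl) AS "{pivot_col}"
            FROM assay_data
            WHERE target = ?              -- filter for the target
            GROUP BY molregno             -- one row per compound
        ) AS p
        JOIN compound_data AS c ON c.molregno = p.molregno
        ORDER BY c.chembl_id
    '''
    cur = conn.execute(sql, (TARGET,))
    # Remove the target prefix from the pivoted column names.
    names = [re.sub(r"^.*\+", "", d[0]) for d in cur.description]
    # molregno is not selected, which mirrors "filter out molregno".
    return [dict(zip(names, row)) for row in cur.fetchall()]


def add_activity(rows):
    """Create the activity column (target column for machine learning)."""
    for row in rows:
        is_active = row["pchembl"] >= ACTIVE_THRESHOLD
        row["activity"] = "active" if is_active else "inactive"
    return rows


conn = build_demo_db()
table = add_activity(fetch_rows(conn))
conn.close()

for row in table:
    print(row)
```

The filter, pivot, and join run inside the database, and only the final table is pulled into Python, which is the same division of labour the DB nodes followed by DB Reader provide. Fingerprint calculation (Steps 4 and 5) is left out because it depends on RDKit rather than the standard library.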
