02. Data Manipulation

"Data Manipulation" exercise for basic Life Science User Training
- Concatenate data from two different sources
- Modify String values
- Join data from multiple tables
- Remove duplicates in the data
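
For readers who want to see the logic behind these four operations outside KNIME, here is a minimal plain-Python sketch. It is not how the KNIME nodes are implemented. The sample rows, their values, and the helper names (`concatenate`, `uppercase_column`, `remove_duplicates`, `inner_join`) are assumptions made up for illustration; only the column names Sample, Pf3D7_pEC50 and AMW come from the exercise.

```python
# Hypothetical rows; all values are invented for illustration only.
hits = [
    {"Sample": "cmpd-001", "Pf3D7_pEC50": 6.2},
    {"Sample": "cmpd-002", "Pf3D7_pEC50": 5.8},
]
no_hits = [
    {"Sample": "cmpd-003", "Pf3D7_pEC50": None},
    {"Sample": "cmpd-002", "Pf3D7_pEC50": 5.8},  # repeats a row from hits
]
properties = [
    {"Sample": "CMPD-001", "AMW": 310.4},
    {"Sample": "CMPD-002", "AMW": 95.1},
    {"Sample": "CMPD-003", "AMW": 720.9},
]


def concatenate(*tables):
    """Stack the rows of several tables into one table."""
    return [dict(row) for table in tables for row in table]


def uppercase_column(rows, column):
    """Return copies of the rows with one string column upper-cased."""
    return [{**row, column: row[column].upper()} for row in rows]


def remove_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def inner_join(left, right, key):
    """Combine rows from both tables that share the same key value."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


combined = uppercase_column(concatenate(hits, no_hits), "Sample")
unique = remove_duplicates(combined)
joined = inner_join(unique, properties, "Sample")

for row in joined:
    print(row)
```

Upper-casing the Sample values before joining matters here: the two hypothetical tables spell the identifiers in different cases, so a join on the raw strings would match nothing.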






Activity I: Filtering
- Remove rows where column Pf3D7_pEC50 contains missing values
- Use the Row Filter node to remove rows with values higher than 150 in column Pf3D7_ps_red
- Remove column Pf3D7_ps_green from the result
- Use String Manipulation to ensure that all entries of the Sample column use UPPERCASE letters

Activity II: Data Manipulation & Aggregation
- Join all data together by Sample name
- Concatenate malariahts_experiment hits and no-hits data into one single table
- Join molecule properties (from SDF Reader) with other features (from Table Reader) by Sample name

Activity III: Data Manipulation (Optional)
- Use the Rule Engine node to add the following tags in a new column named REOS:
-- "MW" if AMW is smaller than 100 or greater than 700
-- "Complexity" if NumHeavyAtoms is smaller than 5 or greater than 50 or NumRotatableBonds is greater than or equal to 12
-- "HBond" if NumHBD is greater than 5 or NumHBA is greater than 10
-- "logP" if SlogP is smaller than -5 or greater than 7.5
-- "Pass" for all other cases
- Use the Table Manipulator node to:
-- move the molecule column to the front (as first column of the table)
-- remove the columns (Pf3D7_ps_green, Pf3D7_ps_red, ExactMW)
-- rename the column AMW to "Average Molecular Weight (AMW)"

Files: malariahts_experiment_hits.csv, malariahts_molecules.sdf, malariahts_experiment_no-hits.xlsx, malariahts_joined.table

Nodes in the workflow: CSV Reader, SDF Reader, Excel Reader, Table Reader, Row Filter, Missing Value, Column Filter, Concatenate, String Manipulation, Joiner, Rule Engine, Table Manipulator, Value Counter
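
The REOS tagging rules from Activity III can be checked by hand with a small plain-Python sketch. It assumes the rules are evaluated in the order listed and that the first matching rule decides the tag, so a molecule that violates several criteria receives only the first applicable tag. The function name `reos_tag` and the example molecules are hypothetical; the column names and thresholds are taken from the activity description.

```python
def reos_tag(row):
    """Return the REOS tag for one row, using first-match-wins ordering."""
    if row["AMW"] < 100 or row["AMW"] > 700:
        return "MW"
    if (
        row["NumHeavyAtoms"] < 5
        or row["NumHeavyAtoms"] > 50
        or row["NumRotatableBonds"] >= 12
    ):
        return "Complexity"
    if row["NumHBD"] > 5 or row["NumHBA"] > 10:
        return "HBond"
    if row["SlogP"] < -5 or row["SlogP"] > 7.5:
        return "logP"
    return "Pass"


# Hypothetical molecules; the property values are invented for illustration.
drug_like = {
    "AMW": 350.0, "NumHeavyAtoms": 25, "NumRotatableBonds": 4,
    "NumHBD": 2, "NumHBA": 5, "SlogP": 2.1,
}
too_heavy = {**drug_like, "AMW": 850.0}
too_flexible = {**drug_like, "NumRotatableBonds": 12}
many_donors = {**drug_like, "NumHBD": 6}
too_lipophilic = {**drug_like, "SlogP": 8.0}

for name, molecule in [
    ("drug_like", drug_like),
    ("too_heavy", too_heavy),
    ("too_flexible", too_flexible),
    ("many_donors", many_donors),
    ("too_lipophilic", too_lipophilic),
]:
    print(name, reos_tag(molecule))
```

Working through a few rows like this before configuring the Rule Engine node makes it easier to confirm that boundary values, such as exactly 12 rotatable bonds, fall on the intended side of each rule.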
