02. Data Manipulation

"Data Manipulation" exercise for basic Life Science User Training
- Concatenate data from two different sources
- Modify String values
- Join data from multiple tables
- Remove duplicates in the data
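
For readers who want to see the four operations above outside KNIME, here is a minimal pandas sketch. The toy tables and their values are hypothetical and stand in for the exercise files; only the column names Sample, Pf3D7_pEC50, and AMW come from the exercise. It illustrates the operations and is not the workflow's own implementation.

```python
import pandas as pd

# Hypothetical toy tables (made-up values); the exercise reads real files instead.
hits = pd.DataFrame({"Sample": ["s1", "s2"], "Pf3D7_pEC50": [6.1, 5.4]})
no_hits = pd.DataFrame({"Sample": ["s3", "s2"], "Pf3D7_pEC50": [None, 5.4]})
features = pd.DataFrame({"Sample": ["S1", "S2", "S3"], "AMW": [310.2, 95.0, 450.7]})

# Concatenate data from two different sources (rows stacked on top of each other).
experiments = pd.concat([hits, no_hits], ignore_index=True)

# Modify string values: make every Sample entry upper case.
experiments["Sample"] = experiments["Sample"].str.upper()

# Remove duplicates: the second "S2" row is identical to the first and is dropped.
experiments = experiments.drop_duplicates().reset_index(drop=True)

# Join data from multiple tables on the shared Sample column.
joined = experiments.merge(features, on="Sample", how="inner")

print(joined)
```

Upper-casing before the join matters in this sketch: the keys in the two toy tables only match once their case has been normalized.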






Activity I: Filtering
- Remove rows where column Pf3D7_pEC50 contains missing values
- Use the Row Filter node to remove rows with values higher than 150 in column Pf3D7_ps_red
- Remove column Pf3D7_ps_green from the result
- Use String Manipulation to ensure that all entries of the Sample column use UPPERCASE letters

Activity II: Data Manipulation & Aggregation
- Join all data together by Sample name
- Concatenate malariahts_experiment hits and no-hits data into one single table
- Join molecule properties (from SDF Reader) with other features (from Table Reader) by Sample name

Activity III: Data Manipulation (Optional)
- Use the Rule Engine node to add the following tags in a new column named REOS:
-- "MW" if AMW is smaller than 100 or greater than 700
-- "Complexity" if NumHeavyAtoms is smaller than 5 or greater than 50 or NumRotatableBonds is greater than or equal to 12
-- "HBond" if NumHBD is greater than 5 or NumHBA is greater than 10
-- "logP" if SlogP is smaller than -5 or greater than 7.5
-- "Pass" for all other cases
- Use the Table Manipulator node to:
-- move the molecule column to the front (as first column of the table)
-- remove the columns (Pf3D7_ps_green, Pf3D7_ps_red, ExactMW)
-- rename the column AMW to "Average Molecular Weight (AMW)"

Input files (reader node):
- malariahts_experiment_hits.csv (CSV Reader)
- malariahts_experiment_no-hits.xlsx (Excel Reader)
- malariahts_molecules.sdf (SDF Reader)
- malariahts_molecules_feature.table (Table Reader)
- malariahts_joined.table (Table Reader)
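
The REOS tagging rules from Activity III can be sketched in plain Python to make the thresholds concrete. This is an illustration, not the Rule Engine syntax itself. It assumes the rules are checked in the order listed and that the first matching rule decides the tag; the exercise text does not say what happens when a molecule violates several rules. The function name `reos_tag` and the example molecules are hypothetical.

```python
def reos_tag(row):
    """Return the REOS tag for one molecule, given a dict of descriptor values.

    Assumption: rules are evaluated top to bottom and the first match wins.
    """
    if row["AMW"] < 100 or row["AMW"] > 700:
        return "MW"
    if (row["NumHeavyAtoms"] < 5 or row["NumHeavyAtoms"] > 50
            or row["NumRotatableBonds"] >= 12):
        return "Complexity"
    if row["NumHBD"] > 5 or row["NumHBA"] > 10:
        return "HBond"
    if row["SlogP"] < -5 or row["SlogP"] > 7.5:
        return "logP"
    return "Pass"


# Hypothetical molecules with made-up descriptor values.
drug_like = {"AMW": 320.4, "NumHeavyAtoms": 23, "NumRotatableBonds": 4,
             "NumHBD": 2, "NumHBA": 5, "SlogP": 2.1}
too_small = dict(drug_like, AMW=58.1)
too_flexible = dict(drug_like, NumRotatableBonds=12)
too_greasy = dict(drug_like, SlogP=8.0)

for name, mol in [("drug_like", drug_like), ("too_small", too_small),
                  ("too_flexible", too_flexible), ("too_greasy", too_greasy)]:
    print(name, reos_tag(mol))
```

With first-match semantics, a molecule that is both too heavy and too lipophilic is tagged "MW" only. Collecting every violated rule per molecule would need a different design, such as one flag column per rule.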
