
02. Data Manipulation - solution

Data Manipulation

Solution to the "Data Manipulation" exercise for the basic Life Science User Training
- Concatenate data from two different sources
- Modify String values
- Join data from multiple tables
- Remove duplicates in the data






Activity I: Filtering
- Remove rows where column Pf3D7_pEC50 contains missing values
- Use the Row Filter node to remove rows with values higher than 150 in column Pf3D7_ps_red
- Remove column Pf3D7_ps_green from the result

Activity II: Data Manipulation & Aggregation
- Concatenate malariahts_experiment hits and no-hits data into one single table
- Use String Manipulation to ensure that all entries of the Sample column are using UPPERCASE letters
- Join molecule properties (from SDF Reader) with other features (from Table Reader) by Sample name
- Join all data together by Sample name

Activity III: Data Manipulation (Optional)
- Use the Rule Engine node to add the following tags in a new column named REOS:
-- "MW" if AMW is smaller than 100 or greater than 700
-- "Complexity" if NumHeavyAtoms is smaller than 5 or greater than 50 or NumRotatableBonds is greater than or equal to 12
-- "HBond" if NumHBD is greater than 5 or NumHBA is greater than 10
-- "logP" if SlogP is smaller than -5 or greater than 7.5
-- "Pass" for all other cases
- Use the Table Manipulator node to:
-- move the molecule column to the front (as first column of the table)
-- remove the columns (Pf3D7_ps_green, Pf3D7_ps_red, ExactMW)
-- rename the column AMW to "Average Molecular Weight (AWM)"

Files referenced in the workflow: malariahts_experiment_hits.csv, malariahts_experiment_no-hits.xlsx, malariahts_molecules.sdf, malariahts_molecules_feature.table, malariahts_joined.table

Node labels: Remove Pf3D7_ps_green, Pf3D7_ps_red < 150, REOS rules, experiment data, UPPERCASE, Molecule Data

Nodes used: CSV Reader (x2), Excel Reader, SDF Reader, Table Reader (x2), Missing Value, Row Filter, Column Filter, Concatenate, String Manipulation, Joiner (x2), Rule Engine, Table Manipulator
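For readers who want to check the logic of Activities I and II outside KNIME, the steps can be sketched in plain Python. This is only an illustration under stated assumptions: tables are lists of dictionaries, `None` stands for a missing value, the join is an inner join, and the sample rows are made-up toy data, not values from the training files. The helper names are hypothetical and do not correspond to any KNIME API.

```python
# Toy stand-ins for the experiment and molecule tables (values are invented).
hits = [
    {"Sample": "abc-1", "Pf3D7_pEC50": 6.2, "Pf3D7_ps_red": 120.0, "Pf3D7_ps_green": 10.0},
    {"Sample": "abc-2", "Pf3D7_pEC50": None, "Pf3D7_ps_red": 80.0, "Pf3D7_ps_green": 12.0},
    {"Sample": "abc-3", "Pf3D7_pEC50": 5.0, "Pf3D7_ps_red": 200.0, "Pf3D7_ps_green": 9.0},
]
no_hits = [
    {"Sample": "Abc-4", "Pf3D7_pEC50": 4.1, "Pf3D7_ps_red": 90.0, "Pf3D7_ps_green": 5.0},
]
molecules = [
    {"Sample": "ABC-1", "SlogP": 2.5},
    {"Sample": "ABC-4", "SlogP": 3.1},
]


def filter_experiment(rows: list[dict]) -> list[dict]:
    """Activity I: drop rows with missing Pf3D7_pEC50, drop rows with
    Pf3D7_ps_red higher than 150, then drop the Pf3D7_ps_green column."""
    kept = []
    for row in rows:
        if row.get("Pf3D7_pEC50") is None:
            continue  # missing value in Pf3D7_pEC50
        if row["Pf3D7_ps_red"] > 150:
            continue  # value higher than 150
        kept.append({k: v for k, v in row.items() if k != "Pf3D7_ps_green"})
    return kept


def concatenate(*tables: list[dict]) -> list[dict]:
    """Stack the rows of several tables into one table."""
    return [dict(row) for table in tables for row in table]


def uppercase_sample(rows: list[dict]) -> list[dict]:
    """Return copies of the rows with the Sample value in upper case."""
    return [{**row, "Sample": row["Sample"].upper()} for row in rows]


def inner_join(left: list[dict], right: list[dict], key: str = "Sample") -> list[dict]:
    """Inner join on `key` (assumed join mode); right-hand values win on name clashes."""
    index: dict = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


filtered = filter_experiment(hits)
experiment = uppercase_sample(concatenate(hits, no_hits))
joined = inner_join(experiment, molecules)
```

Converting Sample to upper case before joining matters because the join compares the key values exactly: "Abc-4" and "ABC-4" would not match otherwise.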
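The Activity III rules can also be written out as a small Python sketch, which may help when checking the Rule Engine configuration. It assumes the rules are evaluated top to bottom and the first matching rule determines the tag, so a row that breaks several limits receives only the first tag. The column name "Molecule" and the function names are hypothetical placeholders, and the example row is invented.

```python
def reos_tag(row: dict) -> str:
    """Return the REOS tag for one row; rules are checked in the listed order."""
    if row["AMW"] < 100 or row["AMW"] > 700:
        return "MW"
    if (row["NumHeavyAtoms"] < 5 or row["NumHeavyAtoms"] > 50
            or row["NumRotatableBonds"] >= 12):
        return "Complexity"
    if row["NumHBD"] > 5 or row["NumHBA"] > 10:
        return "HBond"
    if row["SlogP"] < -5 or row["SlogP"] > 7.5:
        return "logP"
    return "Pass"


def manipulate_columns(row: dict, molecule_col: str = "Molecule") -> dict:
    """Mimic the Table Manipulator steps: molecule column first, three
    columns removed, AMW renamed. `molecule_col` is an assumed column name."""
    dropped = {"Pf3D7_ps_green", "Pf3D7_ps_red", "ExactMW"}
    renamed = {"AMW": "Average Molecular Weight (AWM)"}
    out = {molecule_col: row[molecule_col]}  # dicts keep insertion order
    for name, value in row.items():
        if name == molecule_col or name in dropped:
            continue
        out[renamed.get(name, name)] = value
    return out


# Invented example row, only to exercise the two functions.
example = {
    "Sample": "ABC-1", "AMW": 350.4, "ExactMW": 350.1, "NumHeavyAtoms": 25,
    "NumRotatableBonds": 4, "NumHBD": 2, "NumHBA": 5, "SlogP": 2.5,
    "Pf3D7_ps_red": 120.0, "Pf3D7_ps_green": 10.0, "Molecule": "c1ccccc1",
}
example["REOS"] = reos_tag(example)      # tag first, while AMW still has its old name
tidy = manipulate_columns(example)
```

The tag is computed before the column manipulation because the rules refer to AMW by its original name.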
