02. Data Manipulation - solution

Data Manipulation

Solution to the "Data Manipulation" exercise for the basic Life Science User Training:
- Concatenate data from two different sources
- Modify String values
- Join data from multiple tables
- Remove duplicates in the data
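
The four operations above can be sketched outside KNIME in plain Python. This is only an illustrative sketch, not the workflow itself: tables are modeled as lists of dicts, the helper names (`concatenate`, `uppercase_column`, `inner_join`, `drop_duplicates`) are hypothetical, and the sample IDs and values are made up for the example.

```python
def concatenate(*tables):
    """Stack the rows of several tables into one (like the Concatenate node)."""
    return [dict(row) for table in tables for row in table]


def uppercase_column(table, column):
    """Return a copy of the table with one string column upper-cased."""
    return [{**row, column: row[column].upper()} for row in table]


def inner_join(left, right, key):
    """Inner join of two tables on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def drop_duplicates(table):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in table:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


# Made-up example rows; real data comes from the malariahts_* files.
hits = [
    {"Sample": "tcmdc-1", "Pf3D7_ps_hit": "yes"},
    {"Sample": "tcmdc-1", "Pf3D7_ps_hit": "yes"},  # duplicate row
]
no_hits = [{"Sample": "TCMDC-2", "Pf3D7_ps_hit": "no"}]
molecules = [
    {"Sample": "TCMDC-1", "AMW": 310.4},
    {"Sample": "TCMDC-2", "AMW": 452.9},
]

experiments = drop_duplicates(
    uppercase_column(concatenate(hits, no_hits), "Sample")
)
joined = inner_join(experiments, molecules, "Sample")
```

Upper-casing the Sample column before joining matters here: without it, "tcmdc-1" and "TCMDC-1" would not match as join keys.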


Activity I: Filtering
- Remove rows where column Pf3D7_pEC50 contains missing values
- Use the Row Filter node to keep rows with values higher than 150 in column Pf3D7_ps_red
- Remove column Pf3D7_ps_green from the result

- Use String Manipulation to ensure that all entries of the Sample column are using UPPERCASE letters.

Activity II: Data Manipulation & Aggregation
- Join all data together by Sample name
- Concatenate the malariahts_experiment hits and no-hits data into one single table
- Join molecule properties (from SDF Reader) with other features (from Table Reader) by Sample name
- Using the Table Writer node, write the joined data to a KNIME table file named malariahts_joined.table
Hint: use a relative file path: knime://knime.workflow/../../data/malariahts_joined.table for output location

Activity III: Data Manipulation (Optional)
- Use the Rule Engine node to add the following tags in a new column named REOS:
-- "MW" if AMW is smaller than 100 or greater than 700
-- "Complexity" if NumHeavyAtoms is smaller than 5 or greater than 50 or NumRotatableBonds is greater than or equal to 12
-- "HBond" if NumHBD is greater than 5 or NumHBA is greater than 10
-- "logP" if SlogP is smaller than -5 or greater than 7.5
-- "Pass" for all other cases
- Keep only the following columns: Sample, Pf3D7_ps_hit, AMW, NumRotatableBonds, NumHBD, NumHBA, NumHeavyAtoms, FractionCSP3, MFP2, REOS
- Write the results to a file using the Table Writer node
Hint: use a relative file path: knime://knime.workflow/../../data/malariahts_joined_REOS.table for output location

Node labels: Remove Pf3D7_ps_green, Pf3D7_ps_red < 150, REOS rules, experiment data, UPPERCASE, Molecule Data

Files: malariahts_experiment_no-hits.xlsx, malariahts_experiment_hits.csv, malariahts_molecules_feature.sdf, malariahts_experiment_hits.csv, malariahts_molecules.sdf, malariahts_joined.table

Nodes in the workflow: Missing Value, Column Filter, Row Filter, Rule Engine, Concatenate, Joiner, String Manipulation, Joiner, Excel Reader (XLS), File Reader, Table Reader, File Reader, SDF Reader, Table Reader, Table Writer, Table Writer, Column Filter
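
The REOS tagging rules from Activity III can be sketched as a small Python function. This is a hedged illustration rather than the Rule Engine configuration itself: the function name `reos_tag` and the example descriptor values are hypothetical, and the sketch assumes the rules are checked top to bottom with the first match deciding the tag, which is how the Rule Engine node evaluates its rule list. Missing values are not handled.

```python
def reos_tag(row):
    """Return the REOS tag for one molecule given as a dict of descriptors.

    Rules are checked in the order listed in Activity III; the first
    rule that matches decides the tag.
    """
    if row["AMW"] < 100 or row["AMW"] > 700:
        return "MW"
    if (
        row["NumHeavyAtoms"] < 5
        or row["NumHeavyAtoms"] > 50
        or row["NumRotatableBonds"] >= 12
    ):
        return "Complexity"
    if row["NumHBD"] > 5 or row["NumHBA"] > 10:
        return "HBond"
    if row["SlogP"] < -5 or row["SlogP"] > 7.5:
        return "logP"
    return "Pass"


# Made-up descriptor values, for illustration only.
example = {
    "AMW": 350.0,
    "NumHeavyAtoms": 25,
    "NumRotatableBonds": 4,
    "NumHBD": 2,
    "NumHBA": 5,
    "SlogP": 3.1,
}
tag = reos_tag(example)
```

Because the first matching rule wins, a molecule that violates both the weight and the hydrogen-bond limits is tagged "MW" only; reordering the checks would change which tag such molecules receive.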
