
02. Workflow Control - solution

Workflow Control Solution

Solution for the "Workflow Control" exercise of the advanced Life Science User Training:
- Handle groups of data in separate iterations, where the groups are defined by values in one or more columns
- Read and concatenate many files
- Activate a workflow branch based on a user selection

Activity I: Group Looping
- Read the file CHEMBLID228_SERT_ligands.table.
- Group over all the AssayIDs with a group loop.
- For each group, write a new KNIME table to your KNIME workspace. To do so, create an appropriate file name in the String Manipulation (Variable) node and use it as input for the Create File Name node. (Hint: the Group Loop Start node creates a flow variable for each group.) Use the created name in the Table Writer node.

Solution steps:
Step 1: Use the Table Reader node to load CHEMBLID228_SERT_ligands.table.
Step 2: Use the Group Loop Start node to group by assay_chembl_id and run one iteration per AssayID.
Step 3: Use the String Manipulation (Variable) node to create a file name consisting of "Assay_" and the assay_chembl_id. The Create File Name node then creates an individual file path for each AssayID table.
Step 4: Use the Table Writer node to write each AssayID data table to a separate file. Connect the flow variable port of the Create File Name node and select the flow variable "filePath" as the output location. Use the Variable Loop End node to collect the variables and end the loop.

Activity II: Reading Many Files
- Use the List Files node to find the names of the files created in Activity I.
- Iterate over that list of files with a Table Row to Variable Loop and aggregate them into a single KNIME table.

Solution steps:
Step 1: Use the List Files node to get a list of all files in the folder "data/temp/".
Step 2: Use the Table Row to Variable Loop Start node to iterate over all files in the file list, one file name per iteration.
Step 3: Use the Table Reader node and connect the flow variable port of the Table Row to Variable Loop Start node. Select the flow variable "URL" as the input location.
Step 4: Use the Loop End node to end the loop. Note that the combined data from all iterations is available at the output port of this node.
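The logic of Activities I and II can be sketched outside KNIME as follows. This is only an illustration of the pattern (one iteration per group, one file per iteration, then list, read, and concatenate), not the workflow itself. It assumes CSV files as a stand-in for KNIME .table files; the function names, the sample rows, and the "compound" column are hypothetical.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_groups(rows, key, out_dir):
    """Write one CSV per distinct value of `key` (one iteration per group)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    paths = []
    for value, members in groups.items():
        # Same naming idea as the String Manipulation (Variable) step: "Assay_" + ID.
        path = Path(out_dir) / f"Assay_{value}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(members[0].keys()))
            writer.writeheader()
            writer.writerows(members)
        paths.append(path)
    return paths


def read_all(folder):
    """List every CSV in `folder`, read each file, and concatenate the rows."""
    combined = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            combined.extend(csv.DictReader(handle))
    return combined


# Hypothetical sample rows; the real table has many more columns.
SAMPLE_ROWS = [
    {"assay_chembl_id": "CHEMBL_A", "compound": "c1"},
    {"assay_chembl_id": "CHEMBL_B", "compound": "c2"},
    {"assay_chembl_id": "CHEMBL_A", "compound": "c3"},
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_groups(SAMPLE_ROWS, "assay_chembl_id", tmp)
    print([p.name for p in written])
    print(len(read_all(tmp)))
```

The per-group file name plays the role of the "filePath" flow variable, and the list that `read_all` builds up corresponds to the table collected at the Loop End node.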
Activity III: If Switch
Extend the workflow with a switch to select the kind of visualisation.
- Use a Single Selection Configuration node to let a user choose between the values "Parallel Coordinates Plot" and "Scatter Plot"; use the flow variable "port index" as the output port.
- Use a CASE Switch Data (Start) node to create either a parallel coordinates plot or a scatter plot, depending on the input.
- Use a Column Rename node in both branches to rename the plot selection column to "Selection".
- Combine the two paths with a CASE Switch Data (End) node and use a Row Filter node to keep only the selected compounds.

Solution steps:
Step 1: Use the CASE Switch Data (Start) node and control the PortIndex with the flow variable "plot-type (index)" from the Single Selection Configuration node.
Step 2: Use the CASE Switch Data (End) node to end the switch case.

Nodes used in the workflow: Table Reader, Group Loop Start, String Manipulation (Variable), Create File Name, Table Writer, Variable Loop End, List Files, Table Row To Variable Loop Start, Loop End, RDKit Descriptor Calculation, Single Selection Configuration, CASE Switch Data (Start), Parallel Coordinates Plot, Scatter Plot, Column Rename, CASE Switch Data (End), Row Filter.
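The switch in Activity III can be pictured with a small sketch: the user's choice is turned into a port index, only the branch at that index runs, both branches end with the same "Selection" column, and a final filter keeps the selected rows. This is an assumption-laden illustration rather than KNIME's implementation. The branch functions, the column names "selected_pcp" and "selected_scatter", the "weight" column, and the selection rules are all hypothetical stand-ins for the interactive views.

```python
PLOT_TYPES = ["Parallel Coordinates Plot", "Scatter Plot"]


def parallel_coordinates_branch(rows):
    # Hypothetical stand-in for the view: marks rows with weight above 300.
    return [{**row, "selected_pcp": row["weight"] > 300} for row in rows]


def scatter_branch(rows):
    # Hypothetical stand-in for the view: marks rows with weight up to 300.
    return [{**row, "selected_scatter": row["weight"] <= 300} for row in rows]


def rename(rows, old, new):
    """Rename one column, as the Column Rename node does in each branch."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in rows]


# One entry per output port: (branch to run, name of its selection column).
BRANCHES = [
    (parallel_coordinates_branch, "selected_pcp"),
    (scatter_branch, "selected_scatter"),
]


def run_switch(rows, choice):
    port_index = PLOT_TYPES.index(choice)      # plays the role of the flow variable
    branch, column = BRANCHES[port_index]      # only the active branch is executed
    unified = rename(branch(rows), column, "Selection")
    return [row for row in unified if row["Selection"]]  # Row Filter step


rows = [
    {"compound": "c1", "weight": 250},
    {"compound": "c2", "weight": 350},
]
print(run_switch(rows, "Scatter Plot"))
```

Renaming the column in both branches is what lets a single filter work after the branches are merged, regardless of which one was active.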
