
05. Flow Variable - solution

Flow Variables Solution

Solution for "Flow Variable" exercise for advanced Life Science User Training
- Filtering rows by an attribute value
- Filtering rows by an attribute value that fulfils a condition and that is updated automatically
- Substructure search based on a drawn reference molecule


Activity I: Flow Variables

Filter the compound data to
1. contain the compounds that were tested in the highest number of assays
2. contain only compounds tested for AssayID "CHEMBL853187"

Filtering by Selection
In this exercise, we will use Flow Variables to overwrite node configurations.

Step 1
Use the Table Reader node to load the ChEMBLID228_SERT_ligands.table

Step 1.2
Use the GroupBy node to group all the assays for each compound. Group by "molecule_chembl_id" and count the "assay_chembl_id". Use the Sorter node to sort by the count of assays in descending order.

Step 1.3
Use the Table Row to Variable node to convert the first row of the table to Flow Variables.

Step 1.4
Use the Row Filter node with Column to Test set to "molecule_chembl_id" and select the Flow Variable "molecule_chembl_id" as the pattern to match.

Step 2.2
Use the Value Selection Configuration node and set the Default Column to "assay_chembl_id".

Step 2.3
Use the Row Filter node and connect the Flow Variable port of the Value Selection Configuration node with the Flow Variable port of the Row Filter node.
- Column to Test: "assay_chembl_id"
- use pattern matching: "value-selection"

Step 2.4
Create a Component containing the Value Selection Configuration node and the Row Filter node.

Activity II: Using Flow Variable for Substructure Search

The goal of this exercise is to find all compounds in the dataset that match a certain substructure and to present the dataset in an interactive view that includes molecule highlighting. Write the found molecules into an Excel file whose file name contains the execution time.

The RDKit Substructure Filter node expects a SMARTS to be entered for the search. Use the Molecular Sketcher component and the Table Row to Variable node to pass the chemical structure drawn in the Molecular Sketcher component to the RDKit Substructure Filter node as the variable "Output". To do so, go to the Flow Variable Tab in the RDKit Substructure Filter node and select your variable "Output" as the "smarts_value".
Hint: KNIME-verified components can be found on the Hub or on the Example Server.

Step 1
Use the Molecular Sketcher component to draw a reference molecule for the substructure search in the interactive view. You can also paste the following SMILES string: CC(=O)N1CCNCC1

Step 2
Use the Table Row to Variable node to convert the drawn molecule to a Flow Variable.

Step 3
Pass the Flow Variable "Output" to the "smarts_value" in the Flow Variable Tab.

Step 4
Create a flow variable with the execution date (Date&Time Configuration node).
Tip: Select "Date" for the setting option type and activate the checkbox "Use execution time".

Step 5
Create a Path flow variable to save the filtered product file in a directory of your choice (Create File/Folder Variables node).
Tip: Select the base location, enter the file extension and provide the filename as a flow variable.

Step 6
Write the filtered table into an XLSX file (Excel Writer node).
Tip: Use the Path flow variable that you created in Step 5.

Node annotations in the workflow: "Action needed: Select some compounds!", "Add SMARTS to create a substructure search", "CHEMBLID228_SERT_ligands.table" (both Table Reader nodes), "add execution time to file name"

Nodes in the workflow: Molecular Sketcher, GroupBy, Sorter, Row Filter, RDKit Canon SMILES, RDKit Descriptor Calculation, Tile View, Renderer to Image, RDKit Substructure Filter, RDKit Molecule Highlighting, Assay Selection, Table Row to Variable (2x), Table Reader (2x), Date&Time Configuration, String Manipulation (Variable), Create File/Folder Variables, Excel Writer, Column Filter, Component
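For readers who want to check the logic of Activity I outside KNIME, the GroupBy, Sorter, Table Row to Variable and Row Filter chain can be sketched in plain Python. This is only an illustration, not part of the workflow: the rows, the IDs such as "CHEMBL_A" and "ASSAY_1", and the helper names `most_tested_compound` and `filter_rows` are made up; only the two column names come from the exercise.

```python
from collections import Counter

# Hypothetical rows standing in for the ligand table. Only the two column
# names come from the exercise; all values are invented for illustration.
rows = [
    {"molecule_chembl_id": "CHEMBL_A", "assay_chembl_id": "ASSAY_1"},
    {"molecule_chembl_id": "CHEMBL_A", "assay_chembl_id": "ASSAY_2"},
    {"molecule_chembl_id": "CHEMBL_B", "assay_chembl_id": "ASSAY_1"},
    {"molecule_chembl_id": "CHEMBL_A", "assay_chembl_id": "ASSAY_3"},
    {"molecule_chembl_id": "CHEMBL_C", "assay_chembl_id": "ASSAY_2"},
]


def most_tested_compound(table):
    """Return the compound ID that occurs in the most rows."""
    # GroupBy: count the assay rows per compound.
    counts = Counter(row["molecule_chembl_id"] for row in table)
    # Sorter: order by count, descending.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Table Row to Variable: take the value from the first row.
    return ranked[0][0]


def filter_rows(table, column, value):
    """Row Filter: keep rows whose `column` equals `value`."""
    return [row for row in table if row[column] == value]


# Part 1: the "flow variable" is computed from the data, so the filter
# updates automatically when the table changes.
top_compound = most_tested_compound(rows)
top_rows = filter_rows(rows, "molecule_chembl_id", top_compound)

# Part 2: the value is chosen by the user (here a placeholder assay ID).
assay_rows = filter_rows(rows, "assay_chembl_id", "ASSAY_1")
```

The point of the flow variable in part 1 is that the filter value is never typed in by hand; it is derived from the sorted table each time the workflow runs.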

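Steps 4 to 6 of Activity II combine an execution date with a base location, a file name and an extension to form the output path. A rough Python equivalent of that idea is shown below. The function name `build_output_path`, the directory "output" and the stem "filtered_compounds" are assumptions chosen for the sketch; the workflow itself does this with the Date&Time Configuration, String Manipulation (Variable) and Create File/Folder Variables nodes.

```python
from datetime import date
from pathlib import Path


def build_output_path(base_dir, stem, execution_date, extension=".xlsx"):
    """Combine base location, file name, execution date and extension."""
    # The date is appended in ISO format (YYYY-MM-DD) so that file names
    # sort chronologically; this format is a choice made for the sketch.
    filename = f"{stem}_{execution_date.isoformat()}{extension}"
    return Path(base_dir) / filename


# "output" and "filtered_compounds" are placeholder names.
output_path = build_output_path("output", "filtered_compounds", date.today())
print(output_path)
```

Because the date is read at run time, each execution on a new day produces a new file instead of overwriting the previous one.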