Icon

Sessions1-3

Session I: Apply Flow Variables and Path VariablesSession II: Control the Workflow with a Loop1. Find the store with the most sale:- Work with an output table of the Processed Customer Data metanode - a fully joined customer transaction data.- Use Row Aggregator node to find the total sale for each store.- Sort the total price in descending order with Sorter node.- Convert the first row of the table to a variable with Table Row to Variable node and use it to filter the most successful store in Row Filter.- Create a metanode Store with Highest Total Sale out of all nodes from above.2. Build a loop that finds the store with the highest total sale in each city and writes the records from that store to a separate file:- Before Store with Highest Total Sale metanode, insert Group Loop Start node that passes on the records from each city in each iteration.- Extract the current execution time with Date&Time Configuration node*. Pass it on as a flow variable Execution_time.- With String Manipulation (Variable) node, combine the city name (flow variable City) with the execution date (flow variable Execution_time). Replaceany colon (":") with a hyphen ("-").** The resulting string is stored as a flow variable File_name.- For each city, create a path variable (Create File/Folder Variables). The location is the folder temp located under the data folder. For the filename,under the Flow Variables tab, set path_value to File_name, the flow variable from the previous step. The File extension is xlsx. Pass on the resultingpath variable as output-location.- Write the output from Store with Highest Total Sale metanode to an Excel file with Excel Writer node. Make a flow variable connection betweenCreate File/Folder Variables node to Excel Writer node. The output location is specified by the path variable output-location.- Close and execute the loop by outputting flow variables from Excel Writer node to Variable Loop End node.*Connect the output from the Group Loop Start node (contains flow variable City) to the Date&Time Configuration node (contains current executiontime information)**Use the expression join($${SCity}$$,replaceChars($${SExecution_date}$$,":" ,"-" )) inside String Manipulation (Variable) node. L2 Demo Workflow for Sessions 1-3Customer Transaction Analysis Learning objectives: - Control node settings with flow variables- Repeat the same task with a loop- Execute different branches of a workflow according to a condition- Process Data&Time data using Date&Time nodes- Process data within a database with database nodesWorkflow description: A data set with customer transaction information is used in these examples.In the first example, the store with the highest sales is automatically identified using a flow variable. This process is repeated for each city in the data setusing a loop, and the resulting table is written to a separate file for each city.In the second example, the total sale amount is aggregated for individual customers according to the store types (online vs. onsite store), and the resultsare visualized by various visualization nodes. The visualization nodes are combined with configuration and widget nodes, allowing users to customizevisualization during configuration and interactively within the composite view. A switch node is used to support conditional execution of workflow branches.In the third example, processing of Date&Time data is showcased, including duration-based filtering and extraction of Date&Time field.In the fourth example, in-database processing is outlined. The data set, stored in a database, is aggregated, sorted, and filtered to select a subset ofrows from high-sale stores. This process is implemented within a database by a series of database nodes Session I: Enhance the Component with Configuration nodes and WidgetsSession II: Control the workflow with Case Switch node1. Work with an output table of the Processed Customer Data metanode:Obtain stats per customer using GroupBy node: Select customerID and Store Type as groups. Choose the following aggregation methods. First forAgeGroup& CustomerGroup, Sum for Price, Unique Count for ProductNr.2. Build a component for visualization Re-execution Component that can be configured by configuration and widget nodes. Details are described insidethe component.Step 1: Selecting store type. A user selects a store type (online vs. onsite) through a configuration node, and the selected store type is filtered.Step 2: Selecting customer group. A user selects a customer group (CC1, CC2, or CC3) through a widget node in a composite view. The selectedcustomer group is filtered. Alteration in the user-selection triggers re-execution of the component.Step 3: Selecting age group(s). A user selects an age group (or groups) though a widget node. The selected age group(s) is then filtered. The widget islinked to Refresh Button Widget node, enabling re-execution when triggered.Step 4: Selecting plot type. A user selects one of four plot types through a widget node. The selection is used to switch a workflow branch to producethe desired visualization. Alteration in the user-selection triggers re-execution of the visualization. Session III: Date&Time Data Type1. Work with Date column in the output table of the Processed Customer Data metanode:- Convert string Date to Date&Time data type (String to Date&Time node)- Filter the data by date, starting on March 1, 2020 until the current date, by Date&Time-based Row Filter node.- Extract Month(number), Quarter and Year into new columns with Extract Date&Time Fields node.- Aggregate monthly total sale via Pivot node with Month(number) as the group and Year as the pivot- Plot monthly sales numbers in a line plot with Line Plot node, each line representing different years Session III: In-Database Processing1. Connect to an H2 database and select a table CustomerData:- In H2 Connector node, specify the H2 database file database.mv.db located in the data folder*.- With DB Table Selector node, select a table CustomerData.2. Identify stores with total sale > 9000- Aggregate total sale by store using DB GroupBy node. The group is StoreID, and the aggregation method is SUM on Price.- Sort the stores according to their total sale in the descending order using DB Sorter node.- Select stores with total sale > 9000 using DB Row Filter node**.3. Keep only the records from stores with total sale > 9000- Use DB Joiner node to combine the CustomerData table (top input port, left table) and the table of stores with total sale >9000 from step 2 (bottominput port, right table). Use StoreID as the joining column from both tables.- Load the resulting table to the memory on KNIME Analytics Platform with DB Reader node.- Visualize the composition of the age groups in this subset of the data by Pie/Donut Chart node. Use AgeGroup as the Category dimension, and theOccurrence count as the Aggregation method.*Use the path described by the KNIME protocol starting from the current workflow, as knime://knime.workflow/../../data/database.mv.db**The condition is SUM(Price)>=9000 Select column DateConnect to thedatabaseSelectCostomerDataSet execution timeJoin city name with execution dateGroup: CityGroups: CustomerID & StoreTypeSelecting datastarting on March 1, 2020Write file with execution datePath flow variableMonth, Quarter & YearGroup: MonthPivot: YearAggregation: sum of PriceTotal sale duringthe course of ayearAggregated by StoreIDIn the order ofsales amountSelect the store withsales > 9000Selecting recordsfrom stores withsales > 9000 onlyTo a table inAnalytics PlatformVisualizingage groupcomposition String to Date&Time Store with HighestTotal Sale H2 Connector DB Table Selector Date&TimeConfiguration String Manipulation(Variable) Group Loop Start GroupBy Date&Time-basedRow Filter ProcessedCustomer Data Excel Writer Create File/FolderVariables Variable Loop End Re-executionComponent Extract Date&TimeFields Pivoting Line Plot DB GroupBy DB Sorter DB Row Filter DB Joiner DB Reader Pie/Donut Chart Session I: Apply Flow Variables and Path VariablesSession II: Control the Workflow with a Loop1. Find the store with the most sale:- Work with an output table of the Processed Customer Data metanode - a fully joined customer transaction data.- Use Row Aggregator node to find the total sale for each store.- Sort the total price in descending order with Sorter node.- Convert the first row of the table to a variable with Table Row to Variable node and use it to filter the most successful store in Row Filter.- Create a metanode Store with Highest Total Sale out of all nodes from above.2. Build a loop that finds the store with the highest total sale in each city and writes the records from that store to a separate file:- Before Store with Highest Total Sale metanode, insert Group Loop Start node that passes on the records from each city in each iteration.- Extract the current execution time with Date&Time Configuration node*. Pass it on as a flow variable Execution_time.- With String Manipulation (Variable) node, combine the city name (flow variable City) with the execution date (flow variable Execution_time). Replaceany colon (":") with a hyphen ("-").** The resulting string is stored as a flow variable File_name.- For each city, create a path variable (Create File/Folder Variables). The location is the folder temp located under the data folder. For the filename,under the Flow Variables tab, set path_value to File_name, the flow variable from the previous step. The File extension is xlsx. Pass on the resultingpath variable as output-location.- Write the output from Store with Highest Total Sale metanode to an Excel file with Excel Writer node. Make a flow variable connection betweenCreate File/Folder Variables node to Excel Writer node. The output location is specified by the path variable output-location.- Close and execute the loop by outputting flow variables from Excel Writer node to Variable Loop End node.*Connect the output from the Group Loop Start node (contains flow variable City) to the Date&Time Configuration node (contains current executiontime information)**Use the expression join($${SCity}$$,replaceChars($${SExecution_date}$$,":" ,"-" )) inside String Manipulation (Variable) node. L2 Demo Workflow for Sessions 1-3Customer Transaction Analysis Learning objectives: - Control node settings with flow variables- Repeat the same task with a loop- Execute different branches of a workflow according to a condition- Process Data&Time data using Date&Time nodes- Process data within a database with database nodesWorkflow description: A data set with customer transaction information is used in these examples.In the first example, the store with the highest sales is automatically identified using a flow variable. This process is repeated for each city in the data setusing a loop, and the resulting table is written to a separate file for each city.In the second example, the total sale amount is aggregated for individual customers according to the store types (online vs. onsite store), and the resultsare visualized by various visualization nodes. The visualization nodes are combined with configuration and widget nodes, allowing users to customizevisualization during configuration and interactively within the composite view. A switch node is used to support conditional execution of workflow branches.In the third example, processing of Date&Time data is showcased, including duration-based filtering and extraction of Date&Time field.In the fourth example, in-database processing is outlined. The data set, stored in a database, is aggregated, sorted, and filtered to select a subset ofrows from high-sale stores. This process is implemented within a database by a series of database nodes Session I: Enhance the Component with Configuration nodes and WidgetsSession II: Control the workflow with Case Switch node1. Work with an output table of the Processed Customer Data metanode:Obtain stats per customer using GroupBy node: Select customerID and Store Type as groups. Choose the following aggregation methods. First forAgeGroup& CustomerGroup, Sum for Price, Unique Count for ProductNr.2. Build a component for visualization Re-execution Component that can be configured by configuration and widget nodes. Details are described insidethe component.Step 1: Selecting store type. A user selects a store type (online vs. onsite) through a configuration node, and the selected store type is filtered.Step 2: Selecting customer group. A user selects a customer group (CC1, CC2, or CC3) through a widget node in a composite view. The selectedcustomer group is filtered. Alteration in the user-selection triggers re-execution of the component.Step 3: Selecting age group(s). A user selects an age group (or groups) though a widget node. The selected age group(s) is then filtered. The widget islinked to Refresh Button Widget node, enabling re-execution when triggered.Step 4: Selecting plot type. A user selects one of four plot types through a widget node. The selection is used to switch a workflow branch to producethe desired visualization. Alteration in the user-selection triggers re-execution of the visualization. Session III: Date&Time Data Type1. Work with Date column in the output table of the Processed Customer Data metanode:- Convert string Date to Date&Time data type (String to Date&Time node)- Filter the data by date, starting on March 1, 2020 until the current date, by Date&Time-based Row Filter node.- Extract Month(number), Quarter and Year into new columns with Extract Date&Time Fields node.- Aggregate monthly total sale via Pivot node with Month(number) as the group and Year as the pivot- Plot monthly sales numbers in a line plot with Line Plot node, each line representing different years Session III: In-Database Processing1. Connect to an H2 database and select a table CustomerData:- In H2 Connector node, specify the H2 database file database.mv.db located in the data folder*.- With DB Table Selector node, select a table CustomerData.2. Identify stores with total sale > 9000- Aggregate total sale by store using DB GroupBy node. The group is StoreID, and the aggregation method is SUM on Price.- Sort the stores according to their total sale in the descending order using DB Sorter node.- Select stores with total sale > 9000 using DB Row Filter node**.3. Keep only the records from stores with total sale > 9000- Use DB Joiner node to combine the CustomerData table (top input port, left table) and the table of stores with total sale >9000 from step 2 (bottominput port, right table). Use StoreID as the joining column from both tables.- Load the resulting table to the memory on KNIME Analytics Platform with DB Reader node.- Visualize the composition of the age groups in this subset of the data by Pie/Donut Chart node. Use AgeGroup as the Category dimension, and theOccurrence count as the Aggregation method.*Use the path described by the KNIME protocol starting from the current workflow, as knime://knime.workflow/../../data/database.mv.db**The condition is SUM(Price)>=9000 Select column DateConnect to thedatabaseSelectCostomerDataSet execution timeJoin city name with execution dateGroup: CityGroups: CustomerID & StoreTypeSelecting datastarting on March 1, 2020Write file with execution datePath flow variableMonth, Quarter & YearGroup: MonthPivot: YearAggregation: sum of PriceTotal sale duringthe course of ayearAggregated by StoreIDIn the order ofsales amountSelect the store withsales > 9000Selecting recordsfrom stores withsales > 9000 onlyTo a table inAnalytics PlatformVisualizingage groupcomposition String to Date&Time Store with HighestTotal Sale H2 Connector DB Table Selector Date&TimeConfiguration String Manipulation(Variable) Group Loop Start GroupBy Date&Time-basedRow Filter ProcessedCustomer Data Excel Writer Create File/FolderVariables Variable Loop End Re-executionComponent Extract Date&TimeFields Pivoting Line Plot DB GroupBy DB Sorter DB Row Filter DB Joiner DB Reader Pie/Donut Chart

Nodes

Extensions

Links