Icon

Sessions1-3

Session I: Apply Flow Variables and Path Variables
Session II: Control the Workflow with a Loop

  1. Find the store with the most sale:

    • Work with a table Processed Customer Data.table, read by Table Reader node

    • Use Row Aggregator nodeto find the total sale for each store

    • Sort the total price in descending order with Sorter node

    • Convert the first row of the table to a variable with Table Row to Variable node and use it to filter the most successful store in Row Filter

    • Create a metanode Store with Highest Total Sale out of all nodes from above

  2. Build a loop that finds the store with the highest total sale in each city and writes the records from that store to a separate file:

    • Before Store with Highest Total Sale metanode, insert Group Loop Start node that passes on the records from each city in each iteration

    • Extract the current execution time with Date&Time Configuration node*. Pass it on as a flow variable Execution_time.

    • With Variable Expression node, combine the city name (flow variable City) with the execution date (flow variable Execution_time). Replace any colon (":") with a hyphen ("-").** The resulting string is stored as a flow variable File_name.

    • For each city, create a path variable (Create File/Folder Variables). The location is the folder temp located under the data folder. For the filename, under the Flow Variables tab, set path_value to File_name, the flow variable from the previous step. The File extension is xlsx. Pass on the resulting path variable as output-location.

    • Write the output from Store with Highest Total Sale metanode to an Excel file with Excel Writer node. Make a flow variable connection between Create File/Folder Variables node to Excel Writer node. The output location is specified by the path variable output-location.

    • Close and execute the loop by outputting flow variables from Excel Writer node to Variable Loop End node.


*Connect the output from the Group Loop Start node (contains flow variable City) to the Date&Time Configuration node (contains current execution time information)

**You can instruct K-AI to create an expression to do this. Or use the expression join($${SCity}$$,replaceChars($${SExecution_date}$$,":" ,"-" ))



L2 Demo Workflow for Sessions 1-3

Customer Transaction Analysis

Learning objectives:

  • Control node settings with flow variables

  • Repeat the same task with a loop

  • Execute different branches of a workflow according to a condition

  • Process Data&Time data using Date&Time nodes

  • Process data within a database with database nodes


Workflow description: A data set with customer transaction information is used in these examples.

In the first example, the store with the highest sales is automatically identified using a flow variable. This process is repeated for each city in the data set using a loop, and the resulting table is written to a separate file for each city.

In the second example, the total sale amount is aggregated for individual customers according to the store types (online vs. onsite store), and the results are visualized by various visualization nodes. The visualization nodes are combined with configuration and widget nodes, allowing users to customize visualization during configuration and interactively within the composite view. A switch node is used to support conditional execution of workflow branches.

In the third example, processing of Date&Time data is showcased, including duration-based filtering and extraction of Date&Time field.

In the fourth example, in-database processing is outlined. The data set, stored in a database, is aggregated, sorted, and filtered to select a subset of rows from high-sale stores. This process is implemented within a database by a series of database nodes.

Session I: Enhance the Component with Configuration nodes and Widgets
Session II: Control the workflow with Case Switch node

  1. Work with a table Processed Customer Data.table, read by Table Reader node:

    • Obtain stats per customer using GroupBy node:

      • Select customerID and Store Type as groups

      • Choose the following aggregation methods

        • First for AgeGroup, CustomerGroup

        • Sum for Price

        • Unique Count for ProductNr

  2. Build a component for visualization Re-execution Component that can be configured by configuration and widget nodes. Details are described inside the component.

    • Step 1: Selecting store type. A user selects a store type (online vs. onsite) through a configuration node, and the selected store type is filtered.

    • Step 2: Selecting customer group. A user selects a customer group (CC1, CC2, or CC3) through a widget node in a composite view. The selected customer group is filtered. Alteration in the user-selection triggers re-execution of the component

    • Step 3: Selecting age group(s). A user selects an age group (or groups) though a widget node. The selected age group(s) is then filtered. The widget is linked to Refresh Button Widget node, enabling re-execution when triggered.

    • Step 4: Selecting plot type. A user selects one of four plot types through a widget node. The selection is used to switch a workflow branch to produce the desired visualization. Alteration in the user-selection triggers re-execution of the visualization


Session III: Date&Time Data Type

  1. Work with Date column in a table Processed Customer Data.table, read by Table Reader node:

    • Convert string Date to Date&Time data type (String to Date&Time node)

    • Filter the data by date, starting on March 1, 2020 until the current date, by Date&Time-based Row Filter node.

    • Extract Month(number), Quarter and Year into new columns with Extract Date&Time Fields node.

    • Aggregate monthly total sale via Pivot node with Month(number) as the group and Year as the pivot

    • Plot monthly sales numbers in a line plot with Line Plot node, each line representing different years


Session III: In-Database Processing

  1. Connect to an H2 database and select a table CustomerData:

    • In H2 Connector node, specify the H2 database file database.mv.db located in the data folder*.

    • With DB Table Selector node, select a schema PUBLIC, then select a table CustomerData

  2. Identify stores with total sale > 9000

    • Aggregate total sale by store using DB GroupBy node. The group is StoreID, and the aggregation method is SUM on Price.

    • Sort the stores according to their total sale in the descending order using DB Sorter node.

    • Select stores with total sale > 9000 using DB Row Filter node**.

  3. Keep only the records from stores with total sale > 9000

    • Use DB Joiner node to combine the CustomerData table (top input port, left table) and the table of stores with total sale >9000 from step 2 (bottom input port, right table). Use StoreID as the joining column from both tables

    • Load the resulting table to the memory on KNIME Analytics Platform with DB Reader node.

    • Visualize the composition of the age groups in this subset of the data by Pie Chart node. Use AgeGroup as the Category dimension, and the Occurrence count as the Aggregation method


*Use the path described by the KNIME protocol starting from the current workflow, as knime://knime.workflow/../../data/database.mv.db

**The condition is SUM(Price)>=9000


Month, Quarter & Year
Date&Time Part Extractor
Re-execution Component
Group: MonthPivot: YearAggregation: sum of Price
Pivot
Total sale duringthe course of ayear
Line Plot
Path flow variable
Create File/Folder Variables
To a table inAnalytics Platform
DB Reader
Store with Highest Total Sale
Visualizingage groupcomposition
Pie Chart
Write file with execution date
Excel Writer
Selecting datastarting on March 1, 2020
Date&Time-based Row Filter
Set execution time
Date&Time Configuration
Connect to thedatabase
H2 Connector
Join city name with execution date
Variable Expression
In the order ofsales amount
DB Sorter
Joined data
Table Reader
Aggregate by StoreID
DB GroupBy
Selecting recordsfrom stores withsales > 9000 only
DB Joiner
Variable Loop End
Under PUBLICselect CostomerData
DB Table Selector
Select the store withsales > 9000
DB Row Filter
Group: City
Group Loop Start
Groups: CustomerID & StoreType
GroupBy
Select column Date
String to Date&Time

Nodes

Extensions

Links