
2 Flow Variables and Components

Flow Variables and Components - Exercise (Solution)

This workflow shows a solution to a hands-on exercise in the L2-DS Introduction to KNIME Analytics Platform for Data Scientists - Advanced course.

Task 1: Flow Variables
1. Read the New York City Airbnb data by executing the CSV Reader node
2. Count the number of rows by host id
3. Sort the aggregated data descending by the count of rows
4. Transform the top row into a flow variable
5. Use a Row Filter node to filter the original dataset. Overwrite the filtering pattern with the host_id flow variable.

Task 2: Creating a shared component
1. Continue working with the New York City Airbnb data and create a Value Selection Configuration node to select one neighborhood group in the data
2. Filter the data based on the selected neighborhood group
3. Create an Integer Configuration node to select the maximum price per night
4. Filter the data in the selected neighborhood group to the rooms and apartments below or equal to this price per night. Notice that you need to type an arbitrary number in the upper bound field in order to make the configuration option appear in the Flow Variables tab.
5. Encapsulate the configuration nodes and the Row Filter nodes into a component. Make sure that your component has a data output port!

Task 3: Using a shared component
1. Drag and drop the Interactive Data Cleaning component from the KNIME Hub or from the EXAMPLES Server
2. Connect it to the CSV Reader node, execute, and open its interactive view
3. Apply automatic type guessing, remove rows with missing values, and remove duplicate rows. Click Apply and Close.

Nodes in the workflow: CSV Reader, GroupBy, Sorter, Table Row to Variable, Row Filter, Interactive Data Cleaning
Node labels: Read AB_NYC_2019 data; Count by host id; Id with most rooms; Filter to the rooms of the id with most rooms; Filter by neighborhood group and maximum price per night
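
For readers who want to check the logic of Tasks 1 and 2 outside KNIME, the same steps can be sketched in Python with pandas. This is only an illustration, not part of the workflow: the sample rows are made up, the column names host_id, neighbourhood_group, and price are assumptions about the Airbnb file, and the function names are hypothetical. The function arguments play the role of the flow variable and the configuration values.

```python
import pandas as pd

# Made-up sample rows standing in for the Airbnb table.
# The column names are assumptions, not taken from the workflow.
listings = pd.DataFrame({
    "host_id": [1, 2, 2, 3, 2, 1],
    "neighbourhood_group": ["Brooklyn", "Manhattan", "Manhattan",
                            "Queens", "Brooklyn", "Brooklyn"],
    "price": [80, 150, 200, 60, 95, 120],
})


def top_host_id(df):
    """Task 1, steps 2-4: count rows per host, sort descending, take the top row."""
    counts = df.groupby("host_id").size().reset_index(name="count")
    counts = counts.sort_values("count", ascending=False, kind="stable")
    # The returned value plays the role of the host_id flow variable.
    return counts.iloc[0]["host_id"]


def filter_by_host(df, host_id):
    """Task 1, step 5: keep only the rows of the given host."""
    return df[df["host_id"] == host_id]


def filter_group_and_price(df, group, max_price):
    """Task 2: keep rows in one neighborhood group at or below a maximum price.

    The two arguments correspond to the values a user would pick in the
    Value Selection and Integer Configuration nodes.
    """
    in_group = df["neighbourhood_group"] == group
    affordable = df["price"] <= max_price
    return df[in_group & affordable]


if __name__ == "__main__":
    host = top_host_id(listings)
    print(filter_by_host(listings, host))
    print(filter_group_and_price(listings, "Brooklyn", 100))
```

Passing the selections in as arguments mirrors the component design: the filtering logic stays fixed, and only the configured values change between runs.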
