
02 Flow Variables and Components

Flow Variables and Components - Exercise

This workflow is a hands-on exercise from the course L2-DS Introduction to KNIME Analytics Platform for Data Scientists - Advanced.

Task 1: Flow Variables
1. Read the New York City Airbnb data by executing the CSV Reader node.
2. Count the number of rows by host id.
3. Sort the aggregated data in descending order by the row count.
4. Transform the top row into a flow variable.
5. Use a Row Filter node to filter the original dataset. Overwrite the filtering pattern with the host_id flow variable.

Task 2: Creating a shared component
1. Continue working with the New York City Airbnb data and create a Value Selection Configuration node to select one neighborhood group in the data.
2. Filter the data based on the selected neighborhood group.
3. Create an Integer Configuration node to select the maximum price per night.
4. Filter the data in the selected neighborhood group to the rooms and apartments at or below this price per night. Note that you need to type an arbitrary number in the upper bound field to make the configuration option appear in the Flow Variables tab.
5. Encapsulate the configuration nodes and the Row Filter nodes into a component. Make sure that your component has a data output port!

Task 3: Using a shared component
1. Drag and drop the Interactive Data Cleaning component from the KNIME Hub or from the EXAMPLES Server.
2. Connect it to the CSV Reader node, execute, and open its interactive view.
3. Apply automatic type guessing, remove rows with missing values, and remove duplicate rows. Click Apply and Close.

Nodes shown in the workflow: CSV Reader, GroupBy, Sorter, Table Row to Variable, Row Filter, Interactive Data Cleaning.
Node annotations: Read AB_NYC_2019 data; Count by host id; Id with most rooms; Filter to the rooms of the id with most rooms; Filter by neighborhood group and maximum price per night.
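For readers who want to check their results outside KNIME, the logic of Task 1 and Task 2 can be approximated in Python with pandas. This is only a rough sketch, not part of the exercise: the column names `host_id`, `neighbourhood_group`, and `price` are assumed to match the AB_NYC_2019 file, the function names are hypothetical, and a tiny made-up table stands in for the real data.

```python
import pandas as pd


def top_host_id(listings: pd.DataFrame) -> int:
    """Task 1, steps 2-4: count rows per host, sort descending, keep the top id."""
    counts = (
        listings.groupby("host_id")
        .size()
        .reset_index(name="row_count")
        .sort_values("row_count", ascending=False)
    )
    # The first row after sorting plays the role of the flow variable.
    return int(counts.iloc[0]["host_id"])


def filter_listings(
    listings: pd.DataFrame, neighbourhood_group: str, max_price: int
) -> pd.DataFrame:
    """Task 2: the two arguments stand in for the two configuration nodes."""
    in_group = listings["neighbourhood_group"] == neighbourhood_group
    affordable = listings["price"] <= max_price
    return listings[in_group & affordable]


# Made-up sample rows; the column names are assumptions about the real file.
listings = pd.DataFrame(
    {
        "host_id": [1, 1, 2, 3, 1, 2],
        "neighbourhood_group": [
            "Brooklyn", "Manhattan", "Brooklyn", "Queens", "Brooklyn", "Manhattan",
        ],
        "price": [100, 250, 80, 60, 300, 150],
    }
)

# Task 1, step 5: filter the original table with the stored host id.
busiest_host = top_host_id(listings)
host_rows = listings[listings["host_id"] == busiest_host]

# Task 2: one neighborhood group and an upper bound on the price per night.
cheap_brooklyn = filter_listings(listings, "Brooklyn", 100)
```

The function parameters mirror what the component exposes in its configuration dialog: changing the neighborhood group or the maximum price changes the output table without touching the filtering logic inside.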
