02 Flow Variables and Components LAB

Flow Variables and Components - Exercise (Solution)

This workflow shows a solution to a hands-on exercise in the L2-DS Introduction to KNIME Analytics Platform for Data Scientists - Advanced course.

Task 1: Flow Variables
1. Read the New York City Airbnb data by executing the CSV Reader node
2. Count the number of rows by host id
3. Sort the aggregated data descending by the count of rows
4. Transform the top row into a flow variable
5. Use a Row Filter node to filter the original dataset. Overwrite the filtering pattern with the host_id flow variable.

Task 2: Creating a shared component
1. Continue working with the New York City Airbnb data and create a Value Selection Configuration node to select one neighborhood group in the data
2. Filter the data based on the selected neighborhood group
3. Create an Integer Configuration node to select the maximum price per night
4. Filter the data in the selected neighborhood group to the rooms and apartments below or equal to this price per night. Notice that you need to type an arbitrary number in the upper bound field in order to make the configuration option appear in the Flow Variables tab.
5. Encapsulate the configuration nodes and the Row Filter nodes into a component. Make sure that your component has a data output port!

Task 3: Using a shared component
1. Drag and drop the Interactive Data Cleaning component from the KNIME Hub or from the EXAMPLES Server
2. Connect it to the CSV Reader node, execute, and open its interactive view
3. Apply automatic type guessing, remove rows with missing values, and remove duplicate rows. Click Apply and Close.

Workflow nodes: CSV Reader, GroupBy, Sorter, Table Row to Variable, Row Filter, Interactive Data Cleaning
Node annotations: "Count by order id"; "Count of highest order in descending order"; "Custom Data"; "Node 56"; "North Region only, displays profit between $500-$1000 for technology sales"
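The workflow itself is built graphically in KNIME, so the following is only a rough Python analogy of the logic in Tasks 1 and 2, not part of the solution. It assumes a handful of made-up sample rows in place of the real CSV file; the column names neighbourhood_group and price, and all function names, are hypothetical choices for this sketch. Only host_id is named in the exercise text.

```python
from collections import Counter

# Hypothetical sample rows standing in for the New York City Airbnb data.
# Column names other than host_id are assumptions made for this sketch.
ROWS = [
    {"host_id": 101, "neighbourhood_group": "Brooklyn", "price": 80},
    {"host_id": 101, "neighbourhood_group": "Manhattan", "price": 150},
    {"host_id": 202, "neighbourhood_group": "Brooklyn", "price": 120},
    {"host_id": 101, "neighbourhood_group": "Brooklyn", "price": 95},
    {"host_id": 303, "neighbourhood_group": "Queens", "price": 60},
]


def top_host_id(rows):
    """Task 1, steps 2-4: count rows per host and keep the most frequent host_id."""
    # Roughly what GroupBy (count by host id) produces.
    counts = Counter(row["host_id"] for row in rows)
    # Roughly Sorter (descending) followed by Table Row to Variable (top row).
    host_id, _count = counts.most_common(1)[0]
    return host_id


def filter_by_host(rows, host_id):
    """Task 1, step 5: a row filter whose pattern comes from the variable."""
    return [row for row in rows if row["host_id"] == host_id]


def filter_listings(rows, neighbourhood_group, max_price):
    """Task 2: two chained filters driven by two configuration values."""
    # First filter: keep only the selected neighborhood group.
    in_group = [r for r in rows if r["neighbourhood_group"] == neighbourhood_group]
    # Second filter: keep listings at or below the maximum price per night.
    return [r for r in in_group if r["price"] <= max_price]


if __name__ == "__main__":
    host = top_host_id(ROWS)
    print(host, len(filter_by_host(ROWS, host)))
    print(filter_listings(ROWS, "Brooklyn", 100))
```

The two parameters of filter_listings play the role of the Value Selection Configuration and Integer Configuration nodes: whoever uses the component sets them from outside without opening it, and the filtered table is what leaves through the data output port.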
