How to Load, Combine, and Explore Data


We have access to historical Olympics data; however, the information is stored in two separate tables:

  • First table: Athlete event results of the Summer Olympic Games 1972-2020 for the sport "Athletics".

  • Second table: A dictionary mapping each NOC country's 3-letter ISO code to its full name.

We want to merge the two datasets and explore insights from the data.

Which country has had the most athletes participate? Visualize the result.


In this workflow, you will learn how to:

  • Load data into a KNIME workflow

  • Merge datasets

  • Perform basic data aggregation

  • Inspect the results

Step 1: Access data

We load two datasets related to the Olympic Games into KNIME:

  1. The athlete results of the Summer Olympics 1972-2020 (sport="Athletics").

  2. A list of all NOCs and their ISO codes.

Step 2: Merge data

We perform a value lookup operation to add the full country names from the second data table to the event results data (first data table).

  • Lookup column: country_noc (first table)

  • Key column: noc (second table)

Step 3: Aggregate data

We count the number of participating athletes per country:

  • Category column: country

  • Aggregation: Occurrence count

Step 4: Sort data

We sort the aggregated data table from highest to lowest:

  • Column to sort: OCCURRENCE_COUNT

  • Order: Descending

Step 5: Filter data

We keep only the top 5 countries with the highest number of participating athletes.

  • Column to filter: Row number

  • Criterion: Less than or equal to 5

💡 Click on a node to observe the respective dataset in the node monitor at the bottom of the screen.

Step 6: Visualize data

We visualize the result in a bar chart.

  • Category dimension: country

  • Aggregation: None

  • Frequency dimensions: OCCURRENCE_COUNT

Optionally, give the bar chart a title and custom labels for the x- and y-axes.

💡 Click on the node to explore the interactive view in the node monitor at the bottom of the screen.
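
For readers who think in code, the merge, aggregate, sort, and filter steps can be sketched in plain Python. This is only an illustration of the logic, not how KNIME implements its nodes. The rows below are made-up placeholders rather than real Olympic data, the "athlete" column and the helper name top_countries are hypothetical, and the dictionary table is assumed to hold the full name in a column called country.

```python
from collections import Counter

# Hypothetical toy rows standing in for the two tables (not real Olympic data).
# country_noc, noc, and country follow the text; "athlete" is an assumed column.
results = [
    {"athlete": "a1", "country_noc": "AAA"},
    {"athlete": "a2", "country_noc": "AAA"},
    {"athlete": "a3", "country_noc": "BBB"},
    {"athlete": "a4", "country_noc": "CCC"},
    {"athlete": "a5", "country_noc": "AAA"},
    {"athlete": "a6", "country_noc": "BBB"},
]
noc_dictionary = [
    {"noc": "AAA", "country": "Country A"},
    {"noc": "BBB", "country": "Country B"},
    {"noc": "CCC", "country": "Country C"},
]


def top_countries(results, noc_dictionary, n=5):
    """Return the n countries with the most result rows, highest first."""
    # Merge: look up each country_noc in the key column noc.
    # Codes without a match get None as their country.
    names = {row["noc"]: row["country"] for row in noc_dictionary}
    merged = [{**row, "country": names.get(row["country_noc"])} for row in results]

    # Aggregate: count how many rows occur per country.
    counts = Counter(row["country"] for row in merged)

    # Sort: order by the count, descending.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

    # Filter: keep only rows whose row number is less than or equal to n.
    return [{"country": c, "OCCURRENCE_COUNT": k} for c, k in ranked[:n]]


top = top_countries(results, noc_dictionary)

# Visualize: a plain-text stand-in for the bar chart.
for row in top:
    print(f'{row["country"]:<12}{"#" * row["OCCURRENCE_COUNT"]}')
```

Note that an occurrence count tallies rows, so an athlete who appears in several event results is counted once per row.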
💡 At a glance:

In KNIME, a node can be in four different states:

  1. Not configured 🔴. The node is waiting for configuration or incoming data.

  2. Configured 🟡. The node has been configured and can be executed.

  3. Executed 🟢. The node has been successfully executed. Results can be viewed in the node monitor and used in downstream nodes.

  4. Error ❌. The node has encountered an error during execution.

⚙️ To configure a node, click the node to open the configuration panel (right side of the workflow canvas). Change the settings and don't forget to click "Apply".

▶️ After configuration, you can execute the node. Hover over the node and click the "Execute" button in the node action bar.

🤖 If you are stuck, let K-AI, our AI assistant, help you build workflows. Access K-AI in the "K-AI" panel on the left side of the workflow canvas.


For detailed instructions and a getting started guide, we recommend the KNIME Documentation.

💡 To connect two nodes, click and drag from a node's output port and drop the connection onto another node's input port.

💡 Currently installed nodes are available in the node repository ("Nodes" panel on the left side of the workflow editor). To add a node to your workflow, drag and drop it into the workflow canvas.

Nodes and their annotations in the workflow:

  • Table Creator: "Load event results: Summer Olympics 1972-2020 (sport=Athletics)"

  • Table Creator: "Load dictionary: Country to ISO-code"

  • Value Lookup: "Add full country name"

  • Row Aggregator: "Count number of athletes per country"

  • Sorter: "Sort by OCCURRENCE_COUNT in descending order"

  • Row Filter: "Keep only the first 5 rows"

  • Bar Chart: "Visualize top 5 countries with most participating athletes"
