Icon

Concatenate Tables and Combine String Cells

<p><strong>Concatenate Tables and Combine String Cells</strong></p><p>This workflow shows how to concatenate data tables from two different sheets and how to combine string-type columns using KNIME Analytics Platform.</p><p>The aim is to access two different sheets of the same Excel file (<em>Olympic_Athlete_Event_Results.xlsx</em>), concatenate the data of the two sheets into one table, and combine the values of the two columns "sport" and "event" into one column.</p><p>💡 <em>To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.</em></p><p>Let's walk through the different nodes involved in this operation:</p><p><strong>Excel Reader nodes:</strong></p><ul><li><p>Since the folder with the data is already included when you download the workflow, in the "File and Sheet" tab, we choose to "Read from" the "Current workflow data area" and select the dataset.</p></li><li><p>In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (i.e., non-empty rows or columns, etc.) and we intend to keep its original structure.</p></li></ul><p><strong>Concatenate node:</strong></p><ul><li><p>In the configuration window, we control how the two input tables are concatenated: Columns with the same names are concatenated. If one input table contains columns that the other table does not, the columns can either be filled with missing values (union - all columns are retained in the output table) or filtered out (intersection - only columns that are contained in both tables are concatenated).</p></li><li><p>For the Row IDs, we choose to "Reuse existing" Row IDs and "Append suffix" to handle duplicate Row Ids. It means, if we have two rows with the id "Row0", one gets changed to "Row0_dup".</p></li></ul><p><strong>Column Combiner node:</strong></p><ul><li><p>We want to combine the values of the two columns "sport" and "event" into one column called "discipline".</p></li><li><p>In the include/exclude panel, we select only the columns we want to combine ("sport and "discipline"). As delimiter, we define a colon (":"), and we choose to remove the input columns from the output table.</p></li></ul><p><strong>Excel Writer node:</strong></p><ul><li><p>We append the dataset to a new sheet called "Combined_Sport_Event" in the existing Excel file located in the workflow data area.</p></li><li><p>After executing the node, the file will open automatically.</p></li></ul><p>As you can see from the output, we concatenated the two input tables and replaced the "sport" and "event" columns with the combined column "discipline".</p>

Concatenate Tables and Combine String Cells


This workflow demonstrates a use case of ETL (Extract, Load, Transform). It accesses data from multiple Excel sheets of the same file, merges the data by concatenating the two datasets and combining columns, and lastly appends the combined data table to a new sheet in an Excel file.

In this example, we access the athlete event results of the 2020 Summer Olympic Games. The data is separated in two sheets (identical table structure) in the same Excel file:

  1. Sheet 1: The results for Germany

  2. Sheet 2: The results for the USA

We proceed to concatenate the two tables and combine the values of the two columns "sport" and "event" into a new "discipline" column. Lastly, we write the combined data table to a new sheet in the existing Excel file.

For a detailed overview of each node in this workflow, refer to the workflow description in the Info panel.


💡 To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.

Data access

Read athlete event results data of the 2020 Summer Olympic Games

  • Sheet 1: Germany

  • Sheet 2: USA

Data export

Append combined data to new sheet in Olympic_Athlete_Event_Results.xlsx

Data manipulation

  • Concatenating the two datasets

  • Combining the "sport" and "event" columns into one "discipline" column

Read athlete event results Sheet 2
Excel Reader
Read athlete event results Sheet 1
Excel Reader
Combine values of columns"sport" and "event" intonew "discipline" column
Column Combiner
Append new sheet to dataset:"Combined_Sport_Event"
Excel Writer
Concatenate dataof the two sheets
Concatenate

Nodes

Extensions

Links