Icon

CountIf and SumIf

<p><strong>CountIf and SumIf</strong></p><p>This workflow shows how to perform a <strong>SumIf </strong>and a <strong>CountIf </strong>function using KNIME Analytics Platform. Of course, there are different ways how to get this done, hence, in the following example, we show you <em>one </em>of the possible solutions.</p><p>Using Olympics athlete data, we want to calculate the total number of medals won and count the appearances at the Olympic Games - once under certain conditions (female swimmers) and once the grand totals.</p><p>💡 <em>To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.</em></p><p>We will see how to do it node by node:</p><p><strong>Excel Reader node:</strong></p><ul><li><p>Since the folder with the data is already included when you download the workflow, in the "File and Sheet" tab, we choose to "Read from" the "Current workflow data area" and select the dataset.</p></li><li><p>In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (i.e., non-empty rows or columns, etc.).</p></li></ul><p><strong>Expression Row Filter node:</strong></p><ul><li><p>This node implements the <em>IF </em>logic. In our case: Keep rows where the value for "sport" equals "Swimming" and the value for "sex" equals "Female".</p></li><li><p>We can add multiple conditions and build complex rules using the expression builder in the node configuration window.</p></li></ul><p><strong>Duplicate Row Filter node (bottom branch):</strong></p><ul><li><p>We exclude all columns that can have different values for each swimmer: "event", "medal", and "isTeamSport".</p></li><li><p>After running the node, we will have one row per athlete and Olympic edition they have participated in.</p></li></ul><p><strong>Row Aggregator node (bottom branch):</strong></p><ul><li><p>We count how often each athlete participated in the Olympics - <em>CountIf</em></p></li><li><p>As aggregation method, we use "Occurrence count" and as category column we set the athlete's "name".</p></li><li><p>The top output port contains the <em>CountIf </em>result, however, the bottom output port contains the <em>CountIf </em>grand totals which we will use later.</p></li></ul><p><strong>Row Aggregator node (top branch):</strong></p><ul><li><p>We calculate the number of medals won by each athlete - <em>SumIf</em></p></li><li><p>As aggregation method, we use "Sum", including only the "medal" column to calculate the total medals. As category column we set the athlete's "name".</p></li><li><p>The top output port contains the <em>SumIf </em>result, however, the bottom output port contains the <em>SumIf </em>grand totals which we will use later.</p></li></ul><p><strong>Value Lookup + Column Renamer + Sorter nodes (top branch):</strong></p><ul><li><p>We combine each athlete's medal count with occurrence count.</p></li><li><p>Then, we give the columns meaningful names and sort the data by appearance in descending order.</p></li></ul><p><strong>Column Appender + Column Renamer (bottom branch):</strong></p><ul><li><p>We combine the <em>SumIf </em>and <em>CountIf </em>totals and give the columns meaningful names.</p></li></ul><p><strong>Excel writer node:</strong></p><ul><li><p>We write the two tables to two separate sheets in a new file "countif_sumif_results.xlsx".</p></li><li><p>The top port writes the CountIf and SumIf results for each athlete to the sheet "countif_sumif_by_athlete", the bottom port writes the grand totals to the sheet "countif_sumif_totals".</p></li><li><p>After executing the node, the file will open automatically.</p></li></ul><p>As you can see from the output, we calculated the total medals won and appearance count for each female athlete in the discipline "Swimming" as well as the overall totals.</p>

CountIf and SumIf

This workflow shows how to perform a SumIf and a CountIf function using KNIME Analytics Platform. Of course, there are different ways how to get this done, hence, in the following example, we show you one of the possible solutions.

Using Olympics athlete data, we want to calculate the total number of medals won and count the appearances at the Olympic Games - once under certain conditions (female swimmers) and once the grand totals.

💡 To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.

We will see how to do it node by node:

Excel Reader node:

  • Since the folder with the data is already included when you download the workflow, in the "File and Sheet" tab, we choose to "Read from" the "Current workflow data area" and select the dataset.

  • In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (i.e., non-empty rows or columns, etc.).

Expression Row Filter node:

  • This node implements the IF logic. In our case: Keep rows where the value for "sport" equals "Swimming" and the value for "sex" equals "Female".

  • We can add multiple conditions and build complex rules using the expression builder in the node configuration window.

Duplicate Row Filter node (bottom branch):

  • We exclude all columns that can have different values for each swimmer: "event", "medal", and "isTeamSport".

  • After running the node, we will have one row per athlete and Olympic edition they have participated in.

Row Aggregator node (bottom branch):

  • We count how often each athlete participated in the Olympics - CountIf

  • As aggregation method, we use "Occurrence count" and as category column we set the athlete's "name".

  • The top output port contains the CountIf result, however, the bottom output port contains the CountIf grand totals which we will use later.

Row Aggregator node (top branch):

  • We calculate the number of medals won by each athlete - SumIf

  • As aggregation method, we use "Sum", including only the "medal" column to calculate the total medals. As category column we set the athlete's "name".

  • The top output port contains the SumIf result, however, the bottom output port contains the SumIf grand totals which we will use later.

Value Lookup + Column Renamer + Sorter nodes (top branch):

  • We combine each athlete's medal count with occurrence count.

  • Then, we give the columns meaningful names and sort the data by appearance in descending order.

Column Appender + Column Renamer (bottom branch):

  • We combine the SumIf and CountIf totals and give the columns meaningful names.

Excel writer node:

  • We write the two tables to two separate sheets in a new file "countif_sumif_results.xlsx".

  • The top port writes the CountIf and SumIf results for each athlete to the sheet "countif_sumif_by_athlete", the bottom port writes the grand totals to the sheet "countif_sumif_totals".

  • After executing the node, the file will open automatically.

As you can see from the output, we calculated the total medals won and appearance count for each female athlete in the discipline "Swimming" as well as the overall totals.

CountIf

Data access

Read file located in workflow data area

Create logic & filter

Keep only rows for female swimmers

SumIf

Data export

Save results as separate sheets to a new file

  • CountIf_SumIf_by_athlete

  • CountIf_SumIf_totals

CountIf & SumIf grand totals

CountIf & SumIf results by athlete

CountIf and SumIf


This workflow demonstrates how to filter, aggregate, and manipulate the data to perform a SumIf and a CountIf function (i.e., aggregating the data under a certain condition). Lastly, it exports the resulting data table to an external file.

Note. There are different ways how to get this done. In this example, we show you one of the possible solutions.

In this example, we access the athlete event results of the 1920-2020 Summer Olympic Games. We then filter the data to include only female swimmers and proceed to aggregate the data accordingly. In the top branch we calculate the total number of medals each athlete won, and in the bottom branch we count each athlete's appearance in the Olympic Games. Lastly, we export both tables, each to a separate sheet.

For a detailed overview of each node in this workflow, refer to the workflow description in the Info panel.


💡 To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.

Combine SumIf andCountIf grand totals
Column Appender
Column Renamer
Read Olympics data
Excel Reader
Count each athlete'sappearances in Olympics
Row Aggregator
Calculate sum of medalsfor each athlete
Row Aggregator
Remove duplicate rows
Duplicate Row Filter
Create a new filewith two new sheets
Excel Writer
If logic matches, then output rows
Expression Row Filter
Column Renamer
Sort by appearancein descending order
Sorter
Append medal sumto appearance count
Value Lookup

Nodes

Extensions

Links