<p><strong>CountIf and SumIf</strong></p><p>This workflow shows how to perform a <strong>SumIf </strong>and a <strong>CountIf </strong>function using KNIME Analytics Platform. Of course, there are different ways how to get this done, hence, in the following example, we show you <em>one </em>of the possible solutions.</p><p>Using Olympics athlete data, we want to calculate the total number of medals won and count the appearances at the Olympic Games - once under certain conditions (female swimmers) and once the grand totals.</p><p>💡 <em>To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.</em></p><p>We will see how to do it node by node:</p><p><strong>Excel Reader node:</strong></p><ul><li><p>Since the folder with the data is already included when you download the workflow, in the "File and Sheet" tab, we choose to "Read from" the "Current workflow data area" and select the dataset.</p></li><li><p>In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (i.e., non-empty rows or columns, etc.).</p></li></ul><p><strong>Expression Row Filter node:</strong></p><ul><li><p>This node implements the <em>IF </em>logic. In our case: Keep rows where the value for "sport" equals "Swimming" and the value for "sex" equals "Female".</p></li><li><p>We can add multiple conditions and build complex rules using the expression builder in the node configuration window.</p></li></ul><p><strong>Duplicate Row Filter node (bottom branch):</strong></p><ul><li><p>We exclude all columns that can have different values for each swimmer: "event", "medal", and "isTeamSport".</p></li><li><p>After running the node, we will have one row per athlete and Olympic edition they have participated in.</p></li></ul><p><strong>Row Aggregator node (bottom branch):</strong></p><ul><li><p>We count how often each athlete participated in the Olympics - <em>CountIf</em></p></li><li><p>As aggregation method, we use "Occurrence count" and as category column we set the athlete's "name".</p></li><li><p>The top output port contains the <em>CountIf </em>result, however, the bottom output port contains the <em>CountIf </em>grand totals which we will use later.</p></li></ul><p><strong>Row Aggregator node (top branch):</strong></p><ul><li><p>We calculate the number of medals won by each athlete - <em>SumIf</em></p></li><li><p>As aggregation method, we use "Sum", including only the "medal" column to calculate the total medals. As category column we set the athlete's "name".</p></li><li><p>The top output port contains the <em>SumIf </em>result, however, the bottom output port contains the <em>SumIf </em>grand totals which we will use later.</p></li></ul><p><strong>Value Lookup + Column Renamer + Sorter nodes (top branch):</strong></p><ul><li><p>We combine each athlete's medal count with occurrence count.</p></li><li><p>Then, we give the columns meaningful names and sort the data by appearance in descending order.</p></li></ul><p><strong>Column Appender + Column Renamer (bottom branch):</strong></p><ul><li><p>We combine the <em>SumIf </em>and <em>CountIf </em>totals and give the columns meaningful names.</p></li></ul><p><strong>Excel writer node:</strong></p><ul><li><p>We write the two tables to two separate sheets in a new file "countif_sumif_results.xlsx".</p></li><li><p>The top port writes the CountIf and SumIf results for each athlete to the sheet "countif_sumif_by_athlete", the bottom port writes the grand totals to the sheet "countif_sumif_totals".</p></li><li><p>After executing the node, the file will open automatically.</p></li></ul><p>As you can see from the output, we calculated the total medals won and appearance count for each female athlete in the discipline "Swimming" as well as the overall totals.</p>
CountIf and SumIf
This workflow shows how to perform a SumIf and a CountIf function using KNIME Analytics Platform. Of course, there are different ways how to get this done, hence, in the following example, we show you one of the possible solutions.
Using Olympics athlete data, we want to calculate the total number of medals won and count the appearances at the Olympic Games - once under certain conditions (female swimmers) and once the grand totals.
💡 To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.
We will see how to do it node by node:
Excel Reader node:
Since the folder with the data is already included when you download the workflow, in the "File and Sheet" tab, we choose to "Read from" the "Current workflow data area" and select the dataset.
In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (i.e., non-empty rows or columns, etc.).
Expression Row Filter node:
This node implements the IF logic. In our case: Keep rows where the value for "sport" equals "Swimming" and the value for "sex" equals "Female".
We can add multiple conditions and build complex rules using the expression builder in the node configuration window.
Duplicate Row Filter node (bottom branch):
We exclude all columns that can have different values for each swimmer: "event", "medal", and "isTeamSport".
After running the node, we will have one row per athlete and Olympic edition they have participated in.
Row Aggregator node (bottom branch):
We count how often each athlete participated in the Olympics - CountIf
As aggregation method, we use "Occurrence count" and as category column we set the athlete's "name".
The top output port contains the CountIf result, however, the bottom output port contains the CountIf grand totals which we will use later.
Row Aggregator node (top branch):
We calculate the number of medals won by each athlete - SumIf
As aggregation method, we use "Sum", including only the "medal" column to calculate the total medals. As category column we set the athlete's "name".
The top output port contains the SumIf result, however, the bottom output port contains the SumIf grand totals which we will use later.
Value Lookup + Column Renamer + Sorter nodes (top branch):
We combine each athlete's medal count with occurrence count.
Then, we give the columns meaningful names and sort the data by appearance in descending order.
Column Appender + Column Renamer (bottom branch):
We combine the SumIf and CountIf totals and give the columns meaningful names.
Excel writer node:
We write the two tables to two separate sheets in a new file "countif_sumif_results.xlsx".
The top port writes the CountIf and SumIf results for each athlete to the sheet "countif_sumif_by_athlete", the bottom port writes the grand totals to the sheet "countif_sumif_totals".
After executing the node, the file will open automatically.
As you can see from the output, we calculated the total medals won and appearance count for each female athlete in the discipline "Swimming" as well as the overall totals.