Icon

07_​Data_​Aggregation

Exercise 7: Data AggregationIn this exercise you generate new columns by aggregating data by groups and create pivot tables that show aggregated values by two or more groups 2a) Calculate the total purchase amount by a quarter and a transaction type (Pivoting)2b) Set missing numeric values to zero2c) Combine the year and quarter information to a single column. Use "/" as separator. (String Manipulation node) 1a) Calculate the total purchase amount by a customer ID both in 2019 and earlier 1.1 Calculate the sum of product prizes by a customer ID in the data for purchases in 2019 (GroupBy node) 1.2 Rename the aggregated column to "Basket Sum 2019" 1.3 Calculate the sum of product prizes by a customer ID concerning the purchases earlier than 2019. Rename theaggregated column to "Basket Sum Overall" 1.4 Join the aggregated columns on the CustomerID1b) Remove rows with missing string values and set missing doubles to zero 3a) (optional) Calculate the numbers of orders by a basket size and a transaction type 3.1 Calculate the number of purchased products by an order number. Retain the information on the transaction type. 3.2 Create a pivot table that shows the numbers of orders by a basket size and a transaction type3b) (optional) Set missing integers to zero3c) (optional) Calculate the proportions of the different basket sizes among each transaction type 4) (optional) Convert the dates of births of thecustomers to Date&Time and extract the birth yearinto a separate column Extract QuarterAnd YearInteger: 0Double: 0.0Transform Transaction Date into Date&Time FomatBasketSumper CustomerBasketSumper Customeron Customer IDString: remove rowDouble: 0.0TransformBirthdayExtract Birth YearInteger: 0Share of column total"Year/Quarter"Top: 2019Bottom: Earlier than 2019 Extract Date&TimeFields Pivoting Missing Value String to Date&Time GroupBy GroupBy Joiner Missing Value String to Date&Time Extract Date&TimeFields Column Rename Column Rename GroupBy Pivoting Missing Value Math Formula(Multi Column) String Manipulation Partition by time Read joined andpreprocessed data Exercise 7: Data AggregationIn this exercise you generate new columns by aggregating data by groups and create pivot tables that show aggregated values by two or more groups 2a) Calculate the total purchase amount by a quarter and a transaction type (Pivoting)2b) Set missing numeric values to zero2c) Combine the year and quarter information to a single column. Use "/" as separator. (String Manipulation node) 1a) Calculate the total purchase amount by a customer ID both in 2019 and earlier 1.1 Calculate the sum of product prizes by a customer ID in the data for purchases in 2019 (GroupBy node) 1.2 Rename the aggregated column to "Basket Sum 2019" 1.3 Calculate the sum of product prizes by a customer ID concerning the purchases earlier than 2019. Rename theaggregated column to "Basket Sum Overall" 1.4 Join the aggregated columns on the CustomerID1b) Remove rows with missing string values and set missing doubles to zero 3a) (optional) Calculate the numbers of orders by a basket size and a transaction type 3.1 Calculate the number of purchased products by an order number. Retain the information on the transaction type. 3.2 Create a pivot table that shows the numbers of orders by a basket size and a transaction type3b) (optional) Set missing integers to zero3c) (optional) Calculate the proportions of the different basket sizes among each transaction type 4) (optional) Convert the dates of births of thecustomers to Date&Time and extract the birth yearinto a separate column Extract QuarterAnd YearInteger: 0Double: 0.0Transform Transaction Date into Date&Time FomatBasketSumper CustomerBasketSumper Customeron Customer IDString: remove rowDouble: 0.0TransformBirthdayExtract Birth YearInteger: 0Share of column total"Year/Quarter"Top: 2019Bottom: Earlier than 2019 Extract Date&TimeFields Pivoting Missing Value String to Date&Time GroupBy GroupBy Joiner Missing Value String to Date&Time Extract Date&TimeFields Column Rename Column Rename GroupBy Pivoting Missing Value Math Formula(Multi Column) String Manipulation Partition by time Read joined andpreprocessed data

Nodes

Extensions

Links