06 Aggregations

Exercise: GroupBy
1) Read the adult.csv file by executing the CSV Reader node
2) Calculate the total number of rows and average age by gender
3) Calculate the modes of all string columns separately for each native country
4) Calculate
- the number of missing values in the occupation column
- the number of non-missing rows in the occupation column
- the number of rows in the occupation column
- the number of rows in the marital-status column
Notice that the last two aggregations should provide the same numbers!

Exercise: Pivoting
1) Read the adult_binned.csv file by executing the CSV Reader node
2) Calculate the number of people in groups according to their work class and age bin
- What is the most common combination of age bin and work class?
- How many people belong to this group?
3) Calculate the mode of education level in groups according to their work class and age bin
- What is the most widespread education level in the private workclass independently of the age bin?

Answers
- Most common combination of age bin and work class = private, aged less than 34
- People in this group = 10936
- Most widespread education level in the private workclass, independent of age = 9

Workflow steps
- CSV Reader: Read adult.csv
- GroupBy: total rows and average age by gender
- GroupBy: modes of string columns for each native country
- GroupBy: number of missing and non-missing values in occupation, and total rows in occupation/marital-status
- CSV Reader: Read adult_binned.csv
- Pivoting: table with age-bin as a group and workclass as a pivot; calculate the number of people in groups
- Pivoting: table with age-bin as a group and workclass as a pivot; find the most widespread level of education in the private workclass
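
For readers who want to check the logic outside KNIME, the aggregations in both exercises can be sketched in plain Python. This is only an illustration under stated assumptions: the five sample rows, the column names (`sex`, `age`, `occupation`, `education`, `workclass`, `age_bin`) and the bin labels are made up for the example and may not match the real adult.csv or adult_binned.csv files, and the helper `group_by` is hypothetical, not part of any KNIME API.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample rows; real column names and values may differ.
rows = [
    {"sex": "Male", "age": 25, "occupation": "Sales", "education": "HS-grad",
     "workclass": "Private", "age_bin": "<34"},
    {"sex": "Female", "age": 30, "occupation": None, "education": "HS-grad",
     "workclass": "Private", "age_bin": "<34"},
    {"sex": "Male", "age": 45, "occupation": "Tech-support", "education": "Masters",
     "workclass": "State-gov", "age_bin": "34+"},
    {"sex": "Female", "age": 50, "occupation": "Sales", "education": "Bachelors",
     "workclass": "Private", "age_bin": "34+"},
    {"sex": "Male", "age": 29, "occupation": None, "education": "Bachelors",
     "workclass": "State-gov", "age_bin": "<34"},
]


def group_by(records, key):
    """Collect records into lists keyed by the value of one column."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# GroupBy: row count and average age per gender.
by_gender = {
    gender: {"rows": len(members), "mean_age": mean(r["age"] for r in members)}
    for gender, members in group_by(rows, "sex").items()
}

# GroupBy without a group column: missing, non-missing and total counts.
missing = sum(1 for r in rows if r["occupation"] is None)
non_missing = sum(1 for r in rows if r["occupation"] is not None)
total_rows = len(rows)

# Pivoting: age bin as the group (table rows), workclass as the pivot (columns).
pivot_counts = defaultdict(Counter)
for r in rows:
    pivot_counts[r["age_bin"]][r["workclass"]] += 1

# Most common (age bin, workclass) combination and its size.
top_combo, top_count = Counter(
    (r["age_bin"], r["workclass"]) for r in rows
).most_common(1)[0]

# Mode of education within one workclass, ignoring the age bin.
private_mode = Counter(
    r["education"] for r in rows if r["workclass"] == "Private"
).most_common(1)[0][0]

print(by_gender)
print(missing, non_missing, total_rows)
print({age_bin: dict(counts) for age_bin, counts in pivot_counts.items()})
print(top_combo, top_count, private_mode)
```

The missing and non-missing counts always add up to the total row count, which is why the row counts for occupation and marital-status in the GroupBy exercise must agree.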
