
03 Bringing Things Together

Bringing Things Together - Exercise

This workflow is a hands-on exercise from the L1-DS Introduction to KNIME Analytics Platform for Data Scientists - Basics course.

Task 1: GroupBy
1. Read the adult.csv file by executing the CSV Reader node
2. Calculate the total number of rows and average age by gender
3. Calculate the modes of all string columns separately for each native country
4. Calculate
- the number of missing values in the occupation column
- the number of non-missing rows in the occupation column
- the number of rows in the occupation column
- the number of rows in the marital-status column

Task 2: Pivoting
1. Read the adult_binned.csv file by executing the CSV Reader node
2. Calculate the number of people in groups according to their work class and age bin
3. Calculate the mode of education level in groups according to their work class and age bin

Task 3: Joiner
1. Read the adult_education.table and adult_income.xlsx files by executing the reader nodes
2. Join the education data with the other demographics data (adult.csv). Use inner join on the ID column.
3. Join the income data with the joined table. Apply the same settings as before.

Task 4: Concatenate
1. Execute the Table Reader node. The joined table from the previous task contains the records for all countries except Scotland. The records for Scotland are stored in this separate adult_scotland.table file.
2. Concatenate the two tables into one

Workflow annotations
- Pivoting (first node): age-bin as a group and workclass as a pivot; calculate the number of people in groups
- Pivoting (second node): age-bin as a group and workclass as a pivot; find the most widespread level of education in the private workclass
- Readers: adult.csv, adult_binned.csv, adult_education.table, adult_income.xlsx, adult_scotland.table
- GroupBy nodes: total number of rows and average age by gender; modes of all string columns; number of missing values in occupation column; number of rows: occupation, marital-status
- Joiner nodes: inner join by ID (twice)
- Concatenate: the joined table concatenated with the Scotland data

Nodes in the workflow: CSV Reader (3), Excel Reader (1), Table Reader (2), GroupBy (3), Pivoting (2), Joiner (2), Concatenate (1)
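For readers who want to cross-check their KNIME results in code, the four operations above (GroupBy, Pivoting, Joiner, Concatenate) can be sketched with pandas. This is only a rough analogue, not part of the course workflow: the tiny tables below are made up, and the column names (ID, age, sex, occupation, workclass, age-bin, education, income) are assumptions about what the exercise files contain.

```python
import pandas as pd

# Made-up stand-in for adult.csv (column names and values are assumptions).
adult = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "age": [30, 40, 50, 20],
    "sex": ["Male", "Female", "Male", "Female"],
    "occupation": ["Sales", None, "Tech", "Sales"],
})

# Task 1 analogue (GroupBy): row count and average age per gender.
by_gender = adult.groupby("sex").agg(rows=("age", "size"), mean_age=("age", "mean"))

# Missing and non-missing counts in the occupation column.
missing_occupation = int(adult["occupation"].isna().sum())
present_occupation = int(adult["occupation"].count())

# Made-up stand-in for adult_binned.csv.
binned = pd.DataFrame({
    "age-bin": ["<40", ">=40", ">=40", "<40"],
    "workclass": ["Private", "Private", "State", "Private"],
    "education": ["HS", "BSc", "BSc", "HS"],
})

# Task 2 analogue (Pivoting): age-bin as the group, workclass as the pivot.
counts = pd.crosstab(binned["age-bin"], binned["workclass"])

# Mode of education per (age-bin, workclass); ties resolve to the first mode.
education_mode = (
    binned.groupby(["age-bin", "workclass"])["education"]
    .agg(lambda s: s.mode().iloc[0])
    .unstack()
)

# Task 3 analogue (Joiner): two inner joins on ID.
education = pd.DataFrame({"ID": [1, 2, 3], "education": ["HS", "BSc", "BSc"]})
income = pd.DataFrame({"ID": [2, 3, 4], "income": ["<=50K", ">50K", "<=50K"]})
joined = adult.merge(education, on="ID", how="inner").merge(income, on="ID", how="inner")

# Task 4 analogue (Concatenate): append rows from a table with the same columns.
scotland = pd.DataFrame({
    "ID": [5], "age": [45], "sex": ["Female"], "occupation": ["Tech"],
    "education": ["MSc"], "income": [">50K"],
})
combined = pd.concat([joined, scotland], ignore_index=True)

print(by_gender)
print(counts)
print(education_mode)
print(combined)
```

An inner join keeps only IDs present in both inputs, so in this toy data the joined table shrinks to the IDs shared by all three tables before the Scotland row is appended.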
