Icon

Combining Data Sets Practice

There has been no title set for this workflow's metadata.

There has been no description set for this workflow's metadata.

Combining Data PracticeThe African Socio Economic Database is an annual collection of various indicators across the African continent. Indicators are aggregated atthe country, regional, and continent-wide levels. These indicators include such things as the number of tourist arrivals, hydroelectric energyconsumption, and the annual rate of reforestation.The database includes four files:Data 2000-2009.csv and Data 2010-2019.csv: The tables that contain all the measure information. They are in a “tall and skinny” format,which will require reshaping.Country Codes.csv: A table that contains a mapping between country names and codes.Indicators.csv: A table that contains a list of all indicators and their codes.These files are found in the Combining Data Sets Practice folder.The goal of this exercise is to save a CSV file that satisfies the following requirements:• Utilizes these three Code and Indicator_Id values • NY.GDP.PCAP.KD 732 • SP.URB.TOTL 686 • SP.POP.TOTL 688• Utilizes records for individual countries only (i.e. no regional affiliations). A table is provided for use.• Combine the files Data 2000-2009 and Data 2010-2019 into a single table.• Using the above indicators and countries, filter the combined Data tables to show only the desired records.• Extract the year from the Date field.• Reshape the data so each record represents a unique country and year combination. Indicators should be represented as columns.• Calculate the proportion of each record’s population that is urban.• Determine the average urban proportion for each year in the data set• Relate each country and year with each year’s average urban proportion.• If a record’s urban proportion is greater than that year’s average, create a calculation that returns “Above Average.” Otherwise, return “BelowAverage.”• Finally, output a CSV file of the table.There are two versions of this practice problem:1) You are only given the input and output data sets. You must figure out which nodes to use, how to configure them, and how to organize yourworkflow in order to generate the desired output.2) In addition to the input and output data sets, we have provided an outline of which nodes to use at which steps to guide you along. Note thatthis is only one possible solution, as there are many ways a problem can be solved in KNIME. INPUT DATAData Tables OUTPUT DATA INPUT DATAList of Countries INPUT DATAList of Indicators List of countries we wantSolution Table Creator CSV Reader Combining Data PracticeThe African Socio Economic Database is an annual collection of various indicators across the African continent. Indicators are aggregated atthe country, regional, and continent-wide levels. These indicators include such things as the number of tourist arrivals, hydroelectric energyconsumption, and the annual rate of reforestation.The database includes four files:Data 2000-2009.csv and Data 2010-2019.csv: The tables that contain all the measure information. They are in a “tall and skinny” format,which will require reshaping.Country Codes.csv: A table that contains a mapping between country names and codes.Indicators.csv: A table that contains a list of all indicators and their codes.These files are found in the Combining Data Sets Practice folder.The goal of this exercise is to save a CSV file that satisfies the following requirements:• Utilizes these three Code and Indicator_Id values • NY.GDP.PCAP.KD 732 • SP.URB.TOTL 686 • SP.POP.TOTL 688• Utilizes records for individual countries only (i.e. no regional affiliations). A table is provided for use.• Combine the files Data 2000-2009 and Data 2010-2019 into a single table.• Using the above indicators and countries, filter the combined Data tables to show only the desired records.• Extract the year from the Date field.• Reshape the data so each record represents a unique country and year combination. Indicators should be represented as columns.• Calculate the proportion of each record’s population that is urban.• Determine the average urban proportion for each year in the data set• Relate each country and year with each year’s average urban proportion.• If a record’s urban proportion is greater than that year’s average, create a calculation that returns “Above Average.” Otherwise, return “BelowAverage.”• Finally, output a CSV file of the table.There are two versions of this practice problem:1) You are only given the input and output data sets. You must figure out which nodes to use, how to configure them, and how to organize yourworkflow in order to generate the desired output.2) In addition to the input and output data sets, we have provided an outline of which nodes to use at which steps to guide you along. Note thatthis is only one possible solution, as there are many ways a problem can be solved in KNIME. INPUT DATAData Tables OUTPUT DATA INPUT DATAList of Countries INPUT DATAList of Indicators List of countries we wantSolution Table Creator CSV Reader

Nodes

Extensions

Links