JustKnimeItChallenge-3

Just KNIME It Challenge 3: CDC Cancer Data

Dataset: https://hub.knime.com/alinebessa/spaces/Just%20KNIME%20It!%20Datasets/latest/Challenge%203%20-%20Datasets~pZAJOPBXtHiXnRhq/

Problem Statement: You received the 2017 cancer data from the CDC for inspection, and your goal is to answer the following questions:
(1) What are the top-5 most frequent cancer types occurring in females?
(2) What are the top-5 most frequent cancer types occurring in males?
(3) Which US state has the highest cancer incidence rate (that is, the highest number of cancer cases normalized by the size of its population)?

Solution: This workflow answers the questions above with the following steps:
1. The state columns in both datasets are cleaned using the String Manipulation node.
2. Records with missing values are removed from the CDC_cancer_2017.csv file.
3. The two datasets are joined.
4. The data is aggregated using the GroupBy node.
5. Finally, a dashboard is created for the three questions.

Node labels: CDC_cancer_2017.csv, population2017.xlsx, cleanUpStateColumn, cleanUpStateColumn, removeMissingAndUnnecessaryRecords, leftJoin, groupBySexCodeAndCancerSite, top5, columnFilter, rankColumn_question 1, sortedByCount, rankColumn_question 2, top5, sortedByCount, columnFilter, totalCountPerState, %ageOfPopulation, top1, columnFilter, rankColumn_question 3

Node types: CSV Reader, Excel Reader, String Manipulation, String Manipulation, Rule-based Row Filter, Joiner, GroupBy, Top k Selector, Column Filter, Math Formula, Sorter, Math Formula, Top k Selector, Sorter, Column Filter, GroupBy, Math Formula, Top k Selector, Column Filter, Math Formula, Dashboard
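
For readers who want to check the logic outside KNIME, the solution steps can be roughly sketched in pandas. This is only an illustration, not the workflow itself: the column names ("State", "Sex", "Cancer Site", "Count", "Population") and the helper function names are assumptions, since the actual column headers of the two files are not shown here, and the row filter is reduced to dropping rows with missing values.

```python
import pandas as pd


def clean_state(states: pd.Series) -> pd.Series:
    """Trim whitespace and normalise case so both tables share join keys."""
    return states.astype("string").str.strip().str.title()


def top_sites_by_sex(cases: pd.DataFrame, sex: str, k: int = 5) -> pd.DataFrame:
    """Sum case counts per cancer site for one sex and keep the k largest."""
    subset = cases[cases["Sex"] == sex]
    totals = subset.groupby("Cancer Site", as_index=False)["Count"].sum()
    top = totals.nlargest(k, "Count").reset_index(drop=True)
    top["Rank"] = top.index + 1  # 1-based rank column
    return top


def highest_incidence_state(cases: pd.DataFrame, population: pd.DataFrame) -> pd.DataFrame:
    """Total cases per state, left-join population, return the highest rate."""
    per_state = cases.groupby("State", as_index=False)["Count"].sum()
    merged = per_state.merge(population, on="State", how="left")
    merged["Incidence %"] = 100 * merged["Count"] / merged["Population"]
    return merged.nlargest(1, "Incidence %").reset_index(drop=True)


if __name__ == "__main__":
    # Toy rows standing in for the real files (hypothetical values).
    cases = pd.DataFrame({
        "State": [" alabama", "Alabama ", "ALASKA", "Alaska", None],
        "Sex": ["Female", "Male", "Female", "Male", "Female"],
        "Cancer Site": ["Breast", "Prostate", "Breast", "Lung", "Breast"],
        "Count": [100, 80, 30, 20, 5],
    })
    population = pd.DataFrame({
        "State": ["Alabama", "alaska "],
        "Population": [5000, 700],
    })

    # Steps 1 and 2: clean the state columns, drop incomplete records.
    cases["State"] = clean_state(cases["State"])
    population["State"] = clean_state(population["State"])
    cases = cases.dropna(subset=["State", "Count"])

    # Steps 3 and 4: aggregate and join to answer the three questions.
    print(top_sites_by_sex(cases, "Female"))
    print(top_sites_by_sex(cases, "Male"))
    print(highest_incidence_state(cases, population))
```

With the real data, the two frames would come from `pd.read_csv` and `pd.read_excel` on the challenge files, and the column names would need to be adjusted to match.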
