Icon

Exercise

There has been no title set for this workflow's metadata.

There has been no description set for this workflow's metadata.

PART 1: Split the input into a table with 22 columns and 110,495 rows PART 1b: Rename columns by connectingdata table to metanode. Open the metanodeto learn how this step is performed. Part 2: Calculate the total number of home runs (HR) by player. Filter this aggregation toreturn only the top-50 HR hitters.Then, use that list of 50 players to filter the main data table. Part 3: Aggregate the data to show each player's yearlyHR totals. The data table should only include threecolumns: playerID, yearID, and HR. Then, ensure thedata is sorted so that, for each player, the years are inascending order. Part 4: This section requires a Group Loop. We have set the loop up, so all you must dois provide the node(s) that go inside the loop.Calculate the cumulative sum of HR for each player throughout his career and how manyseasons he played. Rename these fields to "Cumulative HR" and "Player Season,"respectively. Part 5: Read in the table People.table. Create a new field called"Player Name" that joins the first and last name fields, separatedby a space. Filter the columns to only show playerID and PlayerName.Join in this data set to the primary table. Part 6: Pivot the data so that each row is a Player Season andeach column is a Player Name. Sum the field Cumulative HR.When pivoting, make sure the column names are the pivotnames. (That is, the column name should be something like"Player A").This data set will feed into the Line Plot (Plotly) node. You shouldnot need to change any configuration settings if you followedeach step correctly. QUESTION:Which player had the highest cumulative HR total at the end ofhis 10th season? How many HRs had he hit by that point in hiscareer?Move the black box to reveal the answer!ANSWER:Albert Pujols, with 408 home runs. Batting recordsEach iterationcovers asingle playerEnd loopCreate a line plot Prep Input Rename Columns Group Loop Start Loop End Line Plot (Plotly) PART 1: Split the input into a table with 22 columns and 110,495 rows PART 1b: Rename columns by connectingdata table to metanode. Open the metanodeto learn how this step is performed. Part 2: Calculate the total number of home runs (HR) by player. Filter this aggregation toreturn only the top-50 HR hitters.Then, use that list of 50 players to filter the main data table. Part 3: Aggregate the data to show each player's yearlyHR totals. The data table should only include threecolumns: playerID, yearID, and HR. Then, ensure thedata is sorted so that, for each player, the years are inascending order. Part 4: This section requires a Group Loop. We have set the loop up, so all you must dois provide the node(s) that go inside the loop.Calculate the cumulative sum of HR for each player throughout his career and how manyseasons he played. Rename these fields to "Cumulative HR" and "Player Season,"respectively. Part 5: Read in the table People.table. Create a new field called"Player Name" that joins the first and last name fields, separatedby a space. Filter the columns to only show playerID and PlayerName.Join in this data set to the primary table. Part 6: Pivot the data so that each row is a Player Season andeach column is a Player Name. Sum the field Cumulative HR.When pivoting, make sure the column names are the pivotnames. (That is, the column name should be something like"Player A").This data set will feed into the Line Plot (Plotly) node. You shouldnot need to change any configuration settings if you followedeach step correctly. QUESTION:Which player had the highest cumulative HR total at the end ofhis 10th season? How many HRs had he hit by that point in hiscareer?Move the black box to reveal the answer!ANSWER:Albert Pujols, with 408 home runs. Batting recordsEach iterationcovers asingle playerEnd loopCreate a line plot Prep Input Rename Columns Group Loop Start Loop End Line Plot (Plotly)

Nodes

Extensions

Links