Exercise

PART 1: Using the Works data table, find a configuration node that creates a drop-down list of values from the GenreType column. Connect this configuration node to a Row Filter and filter the data set according to the user's selection. Select "Comedy" to more easily follow the steps of this problem.

PART 2: Using the Paragraphs data table, we need to split the PlainText column so we can count how many words are in each record. You will need to split the field twice, first by spaces and then by a newline character (see the hint below if you do not know how to designate a newline character). Once the PlainText field is separated into a one-word-per-row structure, summarize the data, grouping by chapter_id and character_id and counting the number of words. After splitting and before summarizing, you should have 894,201 records. After summarizing, you should have 4,876 records.

PART 3: Join the Chapters data table to the summarized word count from Part 2. Remove the duplicate chapter_id column. You should have 4,876 rows and 7 columns.

PART 4: Join the results from Part 3 with the results from Part 1 using work_id. Summarize the results to obtain the total word count for each work. Create a variable for the work_id with the lowest total word count. For the Comedy genre, which work is converted into a variable?

PART 5: Using the Characters table, perform a series of joins to determine the work and work_id values for each character. The resulting table will need to be aggregated to remove duplicates and sum each character's word count. It should include all columns from the Characters table plus work_id and the word count. You should have 1,331 rows and 6 columns.

PART 6: Filter the results from Part 5 using the variable created in Part 4. This will create a table of word counts by character for the dynamically chosen work.
Using a variable expression, create a string that uses the following structure (copy/paste):
join("../",variable("Title"), "_CharacterWordCount.table")
Next, convert this string into a path variable. You MUST select "Relative to" and "Current workflow" in the node configuration menu. Finally, use the Table Writer to generate an output table in the location designated by the string above.

Hint: The newline character is denoted by \n. In the Cell Splitter node, make sure you check the box that reads "Use \ as escape character."

Inputs: four Table Reader nodes labelled works, paragraphs, chapters, and characters.
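For readers who want to check the logic of Part 2 outside of KNIME, the two-stage split and the grouped count can be sketched in plain Python. This is only an illustration of the idea, not the workflow itself: the three sample rows are hypothetical, the function names are made up for this sketch, and dropping empty pieces after a split is an assumption that may not match how the Cell Splitter treats consecutive delimiters.

```python
from collections import Counter

# Hypothetical toy rows standing in for the Paragraphs table:
# (chapter_id, character_id, PlainText). Not data from the exercise.
paragraphs = [
    (1, 10, "If music be the food of love,\nplay on"),
    (1, 11, "Will you go hunt, my lord?"),
    (2, 10, "What, Curio?"),
]


def one_word_per_row(rows):
    """Split PlainText by spaces, then by newlines, yielding one word per row."""
    for chapter_id, character_id, text in rows:
        for chunk in text.split(" "):        # first split: spaces
            for word in chunk.split("\n"):   # second split: newline character
                if word:                     # assumption: empty pieces are dropped
                    yield chapter_id, character_id, word


def count_words(rows):
    """Group by (chapter_id, character_id) and count the words in each group."""
    return Counter((chapter, who) for chapter, who, _ in one_word_per_row(rows))


word_counts = count_words(paragraphs)
```

Splitting by spaces alone would leave "love,\nplay" as a single token, which is why the exercise asks for the second split on the newline character.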
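The selection in Part 4 and the output path in Part 6 can be sketched the same way in Python. The work titles and totals below are invented for illustration, the function names are hypothetical, and the sketch assumes that the expression's join simply concatenates its arguments and that a Title variable is available alongside work_id.

```python
# Hypothetical totals per work: work_id -> (Title, total word count).
# These values are made up for illustration only.
totals = {
    101: ("WorkA", 1200),
    102: ("WorkB", 950),
    103: ("WorkC", 1430),
}

# Hypothetical per-character rows: (character_id, work_id, word_count).
characters = [
    (1, 101, 700),
    (2, 102, 400),
    (3, 102, 550),
    (4, 103, 1430),
]


def lowest_word_count_work(totals_by_work):
    """Return the work_id whose total word count is smallest (Part 4)."""
    return min(totals_by_work, key=lambda work_id: totals_by_work[work_id][1])


def output_path(title):
    """Mirror join("../", variable("Title"), "_CharacterWordCount.table")."""
    return "".join(["../", title, "_CharacterWordCount.table"])


selected_work = lowest_word_count_work(totals)
# Part 6: keep only the characters belonging to the selected work.
selected_rows = [row for row in characters if row[1] == selected_work]
path = output_path(totals[selected_work][0])
```

Because the path starts with "../", it resolves one level above whatever the path variable is declared relative to, which is why the exercise insists on "Relative to" and "Current workflow".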
