Exercise

PART 1: Input the Titanic.table data set.

PART 2: Construct the data preparation steps A-H as guided within the yellow boxes below. Once completed, convert these steps into a metanode named Prep Data.

PART A: Convert the field pclass from a numeric to a string data type. (1 node)

PART B: Use two Cell Splitter nodes to split the existing Name field so that we extract each person's title (Miss, Mr, Mrs, etc.). Of the new fields created by the Cell Splitters, remove all but the newly named Title field. (4 nodes total)

PART C: For each unique Title value, count how frequently it occurs. Create a new field called Replacement whose value is "Misc" when the count is less than or equal to 10, otherwise use the original Title value. Using a Cell Replacer node, replace the values in the Title field with the values in the Replacement field. (3 nodes total)

PART D: Replace missing values in the Age column with the column's median. Replace missing values in the Fare column with the column's mean. (1 node)

PART E: Create a FamilySize column that adds parch and sibsp. Create a field named IsAlone that is 1 if FamilySize is 0 and 0 otherwise. (2 nodes)

PART F: Use the Auto-Binner node to create 4 equal-width bins for the Age and Fare columns. Use numbered bin naming. (1 node)

PART G: Use the One-to-Many node to create a series of binary fields based on pclass, sex, and Title. (1 node)

PART H: Rename the fields "1.0", "2.0", and "3.0" to "FirstClass", "SecondClass", and "ThirdClass", respectively. Filter columns to return only: survived, sibsp, parch, FamilySize, IsAlone, age [Binned], fare [Binned], FirstClass, SecondClass, ThirdClass, female, male, Misc, Master, Miss, Mr, and Mrs. (2 nodes total)

PART 3: Build a component that allows a user to select from a list of three predictive model types. Connect the steps outlined in the red boxes below. Name the component Predictive Model.

PART A: Use a Single Selection Configuration node to create a variable called model-type-selection. It should have the following three options: Logistic Regression, Decision Tree, Random Forest. (1 node)

PART B: Use a Partitioning node to randomly split the data into 70% and 30% groups. Set the random seed to 99. (1 node)

PART C: Use two identical Case Switch Start nodes to split the data set for each output of the Partitioning node. The index variable created by the Single Selection Configuration node should control the active port. Each Case Switch should have three output ports. (2 nodes)

PART D: Connect the two Case Switch Start nodes to each of the three metanodes below. Make sure to connect the correct output ports of each Case Switch Start to the appropriate metanode. If you followed each step correctly, the metanodes should work with no additional effort. Finally, close the case switch with the appropriate end node. Connect this node to the Component Output. (1 new node)

Metanodes: Logistic Regression, Decision Tree, Random Forest
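
For readers who want to check their node output against plain logic, the data preparation rules for the title extraction, the "Misc" replacement, FamilySize/IsAlone, and the equal-width binning can be sketched in Python. This is only an illustrative sketch, not part of the workflow: the function names are hypothetical, the name format "Surname, Title. Given names" is an assumption about the Name field, and the bin-edge handling may differ from what the Auto-Binner node does.

```python
from collections import Counter


def extract_title(name):
    """Mimic two successive splits: first on ',', then on '.'.

    Assumes names look like 'Surname, Title. Given names' (an assumption
    about the data, not something the exercise states).
    """
    after_comma = name.split(",", 1)[1]
    return after_comma.split(".", 1)[0].strip()


def group_rare_titles(titles, max_rare_count=10):
    """Replace every title that occurs max_rare_count times or fewer with 'Misc'."""
    counts = Counter(titles)
    return [t if counts[t] > max_rare_count else "Misc" for t in titles]


def family_features(sibsp, parch):
    """Return (FamilySize, IsAlone): FamilySize = parch + sibsp, IsAlone = 1 if it is 0."""
    family_size = sibsp + parch
    return family_size, 1 if family_size == 0 else 0


def equal_width_bin(values, n_bins=4):
    """Assign each value a 1-based bin number using n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values identical: everything falls into the first bin.
        return [1 for _ in values]
    # The maximum value would otherwise land in bin n_bins + 1, so clamp it.
    return [min(int((v - lo) / width) + 1, n_bins) for v in values]


if __name__ == "__main__":
    # Made-up sample rows, not taken from Titanic.table.
    names = ["Doe, Mr. John", "Roe, Miss. Jane", "Poe, Dr. Alex"]
    titles = [extract_title(n) for n in names]
    print(titles)
    print(group_rare_titles(titles))
    print(family_features(sibsp=1, parch=2))
    print(equal_width_bin([0, 10, 20, 30, 40]))
```

Note that with only three sample rows every title occurs 10 times or fewer, so all of them map to "Misc"; on the full data set the common titles would be kept.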
