
03_Spark_Modelling

03 Spark Modelling Exercise
Missing Values Strategy: 03_Spark_Modelling

This workflow implements a predictor with Spark for the COW class, based on the rows of the ss13pme data set that have no missing COW values.

The workflow
1. reads the ss13pme table from Hive into Spark,
2. filters out uninteresting columns,
3. separates rows where COW is not null from rows where COW is null, and
4. where COW is not null: modifies the COW values to be zero-based.

Now to do:
1. Where COW is not null: fix missing values and train a decision tree, and
2. Where COW is null: fix missing values, then apply the decision tree to predict COW.

Make sure you have executed the /2_Hadoop/2_Exercises/00_Setup_Hive_Table workflow during your current KNIME session before running this workflow.

Nodes in the workflow, with their labels:
Create Local Big Data Environment: Connect to Local Big Data Environment
Modify cow class: Start cow class from 0
Spark Column Filter: rm puma* & pwgtp*
Spark Row Filter: COW is NOT NULL
Spark Row Filter: COW is NULL
Spark Column Filter: rm cow
DB Table Selector: select * from ss13pme table
Hive to Spark
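The column filtering, row splitting, and zero-basing steps listed above can be illustrated outside KNIME. The following is a minimal plain-Python sketch of that row and column logic, not the KNIME or Spark API. The sample rows, the helper `drop_columns`, and the variable names are hypothetical. It also assumes that the original COW codes start at 1, so that subtracting 1 makes them zero-based, and that the "rm cow" filter is applied to the rows where COW is null.

```python
from fnmatch import fnmatch

# Hypothetical sample rows: the column names mimic ss13pme, the values are made up.
rows = [
    {"agep": 34, "puma00": 100, "pwgtp1": 12, "cow": 1},
    {"agep": 51, "puma00": 200, "pwgtp1": 7, "cow": 3},
    {"agep": 8, "puma00": 300, "pwgtp1": 9, "cow": None},
]


def drop_columns(row, patterns):
    """Return a copy of the row without the columns matching any wildcard pattern."""
    return {
        name: value
        for name, value in row.items()
        if not any(fnmatch(name, pattern) for pattern in patterns)
    }


# Filter out uninteresting columns ("rm puma* & pwgtp*").
filtered = [drop_columns(row, ["puma*", "pwgtp*"]) for row in rows]

# Separate rows where COW is not null from rows where COW is null.
labelled = [row for row in filtered if row["cow"] is not None]
unlabelled = [row for row in filtered if row["cow"] is None]

# Make the class zero-based (assumption: the original codes start at 1).
training = [{**row, "cow": row["cow"] - 1} for row in labelled]

# Drop the empty COW column from the rows to be predicted
# (assumption: this is where "rm cow" applies).
to_predict = [drop_columns(row, ["cow"]) for row in unlabelled]

print(training)
print(to_predict)
```

In the workflow itself, the same steps run on Spark through the Spark Column Filter, Spark Row Filter, and Modify cow class nodes. The remaining tasks (handling missing values, training the decision tree, and predicting) are not covered by this sketch.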
