
07_SparkSQL_meets_HiveQL

Workflow

Spark SQL meets Hive SQL
This workflow builds a line plot of the age distribution for men and women in Maine (US) over the last 5 years. The women's records are processed via Hive SQL and the men's records via Spark SQL. Will they blend? The whole data set is initially read from a Hadoop Hive installation. And yes, Spark SQL and Hive SQL do blend!
Spark, Hive, Hadoop, Big Data, SQL, in-database
Spark SQL meets Hive SQL. This workflow builds a line plot of the age distribution for men and women in Maine (US) over the last 5 years using both Spark SQL and KNIME DB nodes.

Hive inDB Data Manipulation
- On Female Records
- Remove PWGTP* & PUMA* columns
- Count number of records by AGEP

Spark inDB Data Manipulation
- On Male Records
- Remove PWGTP* & PUMA* columns
- Count number of records by AGEP

Nodes and their annotations
- Joiner: blend data
- Spark SQL Query: SELECT * FROM #table# WHERE `sex` = 1 (male)
- Spark to Table: ... and into KNIME
- Spark SQL Query: COUNT(*) FROM #table# BY AGEP
- Fix Missing Values: filling age holes
- WebPortal Visualization: line plot
- Spark SQL Query: rm puma* & pwgtp*
- Hive to Spark: convert a Hive query into a Spark RDD
- DB Table Selector: select * from ss13pme table
- Read Data Into Local Spark Env
- DB Row Filter: only Female records
- DB Column Filter: rm puma* & pwgtp*
- DB GroupBy: count records BY AGEP
- DB Reader: ... and into KNIME
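The logic of the two branches (filter by sex, count records by AGEP, blend the two counts, fill the age holes) can be sketched outside KNIME. The snippet below is only a minimal illustration and makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for both Hive SQL and Spark SQL, the toy rows are invented, the female code SEX = 2 is assumed (the workflow only states that 1 is male), and missing ages are filled with 0. The function name age_distribution is hypothetical and not part of the workflow.

```python
import sqlite3

# Invented toy rows standing in for the ss13pme table: (SEX, AGEP).
# SEX = 1 is male per the workflow annotation; SEX = 2 for female is an assumption.
ROWS = [(1, 30), (1, 30), (1, 32), (2, 30), (2, 31), (2, 31)]


def age_distribution(rows):
    """Return (age, male_count, female_count) for every age in the observed range."""
    if not rows:
        return []

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ss13pme (SEX INTEGER, AGEP INTEGER)")
    conn.executemany("INSERT INTO ss13pme VALUES (?, ?)", rows)

    def counts_by_age(sex):
        # Same shape as the per-branch query: filter one sex, count records by AGEP.
        cur = conn.execute(
            "SELECT AGEP, COUNT(*) FROM ss13pme WHERE SEX = ? GROUP BY AGEP",
            (sex,),
        )
        return dict(cur.fetchall())

    male = counts_by_age(1)    # Spark SQL branch in the workflow
    female = counts_by_age(2)  # Hive branch in the workflow
    conn.close()

    # Blend the two results and fill the "age holes" with 0 (assumed fill value).
    observed = set(male) | set(female)
    ages = range(min(observed), max(observed) + 1)
    return [(age, male.get(age, 0), female.get(age, 0)) for age in ages]


if __name__ == "__main__":
    for row in age_distribution(ROWS):
        print(row)
```

Doing the blend in plain Python rather than in SQL keeps the sketch independent of any join syntax; in the workflow itself this step is handled by the Joiner and Fix Missing Values nodes after both results have been read into KNIME.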

Download

Get this workflow from the following link: Download

Resources

Nodes

07_SparkSQL_meets_HiveQL consists of the following 27 node(s):

Plugins

07_SparkSQL_meets_HiveQL contains nodes provided by the following 5 plugin(s):