07_​SparkSQL_​meets_​HiveQL

Workflow

Spark SQL meets Hive SQL. Age distribution for Men and Women in Maine (US) over the last 5 years.
This workflow builds a line plot of the age distribution for men and women in Maine (US) over the last 5 years. The whole data set is initially read from a Hadoop Hive installation. The women's records are then processed via Hive SQL and the men's records via Spark SQL. Will they blend? ... and yes, Spark SQL and Hive SQL do blend!
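As a rough illustration of the per-sex processing, the sketch below runs the same filter-and-count query once for each sex. It is only a sketch under stated assumptions: Python's built-in sqlite3 stands in for both Hive and Spark, the rows are invented, the column names `sex` and `agep` follow the workflow's labels, and the code 2 for female is assumed (the workflow itself only states that `sex` = 1 means male).

```python
import sqlite3

# Toy stand-in for the ss13pme table; these rows are invented for illustration.
rows = [
    (1, 30), (1, 30), (1, 45),   # sex = 1 (male), as in the workflow's query
    (2, 30), (2, 52), (2, 52),   # sex = 2 is ASSUMED to mean female
]

# sqlite3 replaces both the Hive and the Spark back end in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ss13pme (sex INTEGER, agep INTEGER)")
conn.executemany("INSERT INTO ss13pme VALUES (?, ?)", rows)


def count_by_age(connection, sex_code):
    """Keep the records of one sex, then count records per age (AGEP)."""
    query = (
        "SELECT agep, COUNT(*) AS n FROM ss13pme "
        "WHERE sex = ? GROUP BY agep ORDER BY agep"
    )
    return connection.execute(query, (sex_code,)).fetchall()


male_counts = count_by_age(conn, 1)     # done via Spark SQL in the workflow
female_counts = count_by_age(conn, 2)   # done via Hive SQL in the workflow
print(male_counts)
print(female_counts)
```

In the workflow the two queries run on different engines; here they share one engine, so the sketch only shows the shape of the queries, not the blending of the two systems.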
Spark Hive Hadoop Big Data SQL in-database
Workflow annotations

Hive in-database data manipulation, on female records:
- Remove PWGTP* & PUMA* columns
- Count number of records by AGEP

Spark in-database data manipulation, on male records:
- Remove PWGTP* & PUMA* columns
- Count number of records by AGEP

Nodes and their labels:
- Hive Connector: connect to Hive (see instructions for Hostname)
- Database GroupBy: count records BY AGEP
- Database Column Filter: rm puma* & pwgtp*
- Create Spark Context: Context is destroyed on close
- Database Table Selector: select * from ss13pme table
- Database Row Filter: only Female records
- Joiner: blend data
- Database Connection Table Reader: ... and into KNIME
- Spark SQL Query: SELECT * FROM #table# WHERE `sex` = 1 (male)
- Spark to Table: ... and into KNIME
- Spark SQL Query: COUNT(*) FROM #table# BY AGEP
- Fix Missing Values: filling age holes
- Hive to Spark: convert a Hive query into a Spark RDD
- WebPortal Visualization: line plot
- Spark SQL Query: rm puma* & pwgtp*
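The column removal, the "blend data" join, and the "filling age holes" step can be sketched in plain Python. This is a hedged illustration, not the workflow's actual node configuration: the helper names `drop_prefixed` and `blend` are hypothetical, the join is assumed to keep ages present on either side, and holes are assumed to be filled with 0.

```python
def drop_prefixed(record, prefixes=("pwgtp", "puma")):
    """Remove columns whose names start with PWGTP or PUMA (case-insensitive)."""
    return {
        name: value
        for name, value in record.items()
        if not name.lower().startswith(prefixes)
    }


def blend(female_counts, male_counts):
    """Join the per-age counts of both sexes and fill missing ages.

    ASSUMPTIONS: ages present on either side are kept, and an age with
    no records gets a count of 0.
    """
    all_ages = set(female_counts) | set(male_counts)
    if not all_ages:
        return []
    # Walk the full age range so that gaps ("age holes") become explicit rows.
    return [
        (age, female_counts.get(age, 0), male_counts.get(age, 0))
        for age in range(min(all_ages), max(all_ages) + 1)
    ]


# Invented example values, for illustration only.
record = {"AGEP": 30, "SEX": 2, "PUMA00": 100, "PWGTP1": 17}
print(drop_prefixed(record))

female = {30: 1, 32: 2}   # age -> number of female records
male = {31: 4, 32: 1}     # age -> number of male records
print(blend(female, male))
```

Each tuple returned by `blend` holds an age, the female count, and the male count, which is the shape a two-series line plot would need.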

Download

Get this workflow from the following link: Download

Resources

Nodes

07_SparkSQL_meets_HiveQL consists of the following 23 node(s):

Plugins

07_​SparkSQL_​meets_​HiveQL contains nodes provided by the following 3 plugin(s):