Spark - Payroll Data

Processing 2.74 million rows and 18 columns of raw data using the Apache Spark framework.

Workflow annotations: Read city-wide payroll data (Open Data); Filter Job Title; Sum OT Hours by Year and Job Title; Write pre-processed data for visualization; Generate Full Name; Filter Job Title and Full Name; Remove extra spaces; Count Employees by Year and Job Title.

Nodes in the workflow: Create Local Big Data Environment, CSV Reader, Table to Spark, Spark Column Filter, Spark GroupBy, Spark to Table, CSV Writer, Column Expressions, Spark Column Filter, Column Expressions, Spark GroupBy, Spark Joiner.
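The aggregation logic that the annotations describe can be sketched in plain Python. This is only an illustration and not the workflow itself: it uses the standard library instead of Spark, the field names (`year`, `job_title`, `first_name`, `last_name`, `ot_hours`) are hypothetical stand-ins for the real column names, and counting distinct full names as "employees" is an assumption about what the count step does.

```python
from collections import defaultdict


def summarize_payroll(rows):
    """Sum OT hours and count employees per (year, job title).

    `rows` is an iterable of dicts using hypothetical field names;
    the real payroll columns may be named differently.
    """
    ot_hours = defaultdict(float)   # (year, job_title) -> total OT hours
    employees = defaultdict(set)    # (year, job_title) -> distinct full names

    for row in rows:
        key = (row["year"], row["job_title"])

        # "Sum OT Hours by Year and Job Title"
        ot_hours[key] += float(row["ot_hours"])

        # "Generate Full Name" + "Remove extra spaces":
        # split() drops leading, trailing, and repeated whitespace.
        full_name = " ".join(f"{row['first_name']} {row['last_name']}".split())

        # "Count Employees by Year and Job Title" (assumed: distinct names)
        employees[key].add(full_name)

    # Combine both aggregates on the shared key, like a join on year and title.
    return {
        key: {"ot_hours": ot_hours[key], "employees": len(employees[key])}
        for key in sorted(ot_hours)
    }
```

In the workflow, the same steps are carried out by the Spark GroupBy and Spark Joiner nodes, so the work runs in the Spark context rather than in a single Python process.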
