
Spark - Payroll Data

Workflow

Processing 2.74 million rows and 18 columns of raw data using the Apache Spark framework.

[Workflow diagram] Node labels, in the order shown: Node 1; Read city-wide payroll data (Open Data); Node 3; Filter Job Title; Sum OT Hours by Year and Job Title; Node 6; Write pre-processed data for visualization; Generate Full Name; Filter Job Title and Full Name; Remove extra spaces; Count Employees by Year and Job Title; Node 12.

Node types, in the order shown: Create Local Big Data Environment; CSV Reader; Table to Spark; Spark Column Filter; Spark GroupBy; Spark to Table; CSV Writer; Column Expressions; Spark Column Filter; Column Expressions; Spark GroupBy; Spark Joiner.
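The aggregation steps named in the node labels (sum OT hours and count employees by year and job title, then join the two results) can be illustrated outside KNIME. The following is a minimal plain-Python sketch of that logic only. It does not use Spark. The column names, the sample rows, and the choice to count distinct full names are all assumptions made for illustration and are not taken from the workflow or the dataset.

```python
import re
from collections import defaultdict

# Hypothetical sample rows; column names are assumptions, not the dataset's schema.
rows = [
    {"year": 2020, "job_title": "POLICE OFFICER", "first_name": "ANA ", "last_name": " LEE", "ot_hours": 10.0},
    {"year": 2020, "job_title": "POLICE OFFICER", "first_name": "ANA", "last_name": "LEE", "ot_hours": 5.5},
    {"year": 2020, "job_title": "POLICE OFFICER", "first_name": "BO", "last_name": "KIM", "ot_hours": 2.0},
    {"year": 2021, "job_title": "FIREFIGHTER", "first_name": "CY", "last_name": "DIAZ", "ot_hours": 7.0},
]


def full_name(row):
    """Generate the full name, then collapse extra spaces and trim the ends."""
    return re.sub(r"\s+", " ", f"{row['first_name']} {row['last_name']}").strip()


def summarize(rows):
    """Aggregate by (year, job_title) and join the two aggregates on that key."""
    ot_sum = defaultdict(float)   # sum of OT hours per group
    employees = defaultdict(set)  # distinct full names per group (assumed meaning of "count employees")
    for row in rows:
        key = (row["year"], row["job_title"])
        ot_sum[key] += row["ot_hours"]
        employees[key].add(full_name(row))
    # Inner join of the two aggregates on the shared (year, job_title) key.
    joined_keys = sorted(ot_sum.keys() & employees.keys())
    return [
        {"year": y, "job_title": t, "ot_hours": ot_sum[(y, t)], "employees": len(employees[(y, t)])}
        for (y, t) in joined_keys
    ]


summary = summarize(rows)
for record in summary:
    print(record)
```

Normalizing whitespace before counting matters here: without it, "ANA  LEE" and "ANA LEE" would be counted as two different employees.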

Download

Get this workflow from the creator’s website: Link

Nodes

Spark - Payroll Data consists of the following 12 node(s):

Plugins

Spark - Payroll Data contains nodes provided by the following 3 plugin(s):