A company tracks website usage and aggregates the statistics into a table with per-customer results.
This workflow loads the raw data into Hive and transforms it there, imports the transformed data into Spark to impute missing values and aggregate the (big) data, and finally saves the aggregated (small) table.
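The impute-then-aggregate step can be illustrated with a minimal sketch in plain Python. It is not the KNIME workflow itself and does not use Hive or Spark; the column names (`page_views`, `session_minutes`), the sample rows, the choice of mean imputation, and the choice of summing per customer are all assumptions made for illustration, since the description does not specify them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw usage records: (customer_id, page_views, session_minutes).
# None marks a missing value. Names and numbers are illustrative only.
RAW_ROWS = [
    ("c1", 10, 5.0),
    ("c1", None, 7.0),
    ("c2", 4, None),
    ("c2", 7, 3.0),
]


def impute_with_column_mean(rows):
    """Replace None in each numeric column with that column's mean.

    Mean imputation is an assumed strategy, not one taken from the workflow.
    """
    n_cols = len(rows[0])
    col_means = {}
    for col in range(1, n_cols):
        present = [row[col] for row in rows if row[col] is not None]
        col_means[col] = mean(present)
    return [
        (row[0],)
        + tuple(col_means[c] if row[c] is None else row[c] for c in range(1, n_cols))
        for row in rows
    ]


def aggregate_per_customer(rows):
    """Sum the numeric columns per customer, giving one small row per customer."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for customer_id, page_views, session_minutes in rows:
        totals[customer_id][0] += page_views
        totals[customer_id][1] += session_minutes
    return {customer_id: tuple(values) for customer_id, values in totals.items()}


# Impute first, then aggregate, mirroring the order described above.
summary = aggregate_per_customer(impute_with_column_mean(RAW_ROWS))
```

Imputing before aggregating matters here: if missing values were dropped instead, the per-customer sums would be computed over fewer rows and would understate usage.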
To use this workflow in KNIME, download it from the link below and open it in KNIME:
Download Workflow