EX2 Data Blending - exercise

<p>Starting from different data sources, this workflow teaches you how to blend data (extract value from multiple data sources) and how to build an efficient ETL (Extract, Transform, Load) process with KNIME Analytics Platform.</p>

Data Discovery & Structuring

6 data sources:

- CSV file - Webdata Old System.csv (Old Web System)

- SQLite Database - WebActivity.sqlite (New Web System)

- .table file - Sentiment Analysis.table

- TSV file - Sentiment Rating.tsv

- CSV file - Demographics.csv

- Excel file - Product Data2.xls

The files are located in UCLL Data Management/Data/EX2 Data Blending/
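
To profile the text-based sources outside KNIME, a minimal Python sketch using only the standard library could look like the one below. The helper names `read_delimited`, `read_sqlite` and `list_tables` are hypothetical, and UTF-8 encoding is an assumption; the real table and column names have to be looked up in the files themselves. The .table and .xls files are not covered here, since they need KNIME or a third-party reader.

```python
import csv
import sqlite3


def read_delimited(path, delimiter=","):
    """Read a CSV or TSV file into a list of dicts keyed by the header row."""
    # Assumption: the file is UTF-8 encoded and its first row is a header.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def read_sqlite(path, query):
    """Run a query against a SQLite file and return the rows as dicts."""
    connection = sqlite3.connect(path)
    try:
        connection.row_factory = sqlite3.Row
        return [dict(row) for row in connection.execute(query)]
    finally:
        connection.close()


def list_tables(path):
    """Return the table names stored in a SQLite file."""
    rows = read_sqlite(
        path,
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name",
    )
    return [row["name"] for row in rows]
```

For the TSV file, call `read_delimited(path, delimiter="\t")`. Calling `list_tables` on WebActivity.sqlite first shows which table to query with `read_sqlite`.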

Visualize

Create a bar chart showing the average customer age for each product

Data Transformation

- Concatenate web activity data from the old and new systems into one table

- Replace sentiment labels (text) in the Sentiment Analysis with the numeric scores from the Sentiment Rating

- Set all product names to lowercase in the product data

- Join all data together by the customerKey column

  1. first, join web activity data with sentiment data

  2. next, join the newly formed resultset from step 1 with demographics data

  3. finally, join the newly formed resultset from step 2 with product data

- Group the final resultset from step 3 by products, and aggregate by age (mean)
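
The transformation steps above can be sketched in plain Python to make the logic concrete before building it with KNIME nodes. This is an illustration under assumptions, not the reference solution: only `customerKey` comes from the exercise, while the other column names (`visits`, `sentiment`, `age`, `product`), the sample rows, the rating values and the choice of an inner join are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real files have their own columns.
old_web = [{"customerKey": 1, "visits": 3}]
new_web = [{"customerKey": 2, "visits": 5}, {"customerKey": 3, "visits": 1}]
sentiment = [
    {"customerKey": 1, "sentiment": "positive"},
    {"customerKey": 2, "sentiment": "negative"},
    {"customerKey": 3, "sentiment": "neutral"},
]
rating = {"negative": 1, "neutral": 2, "positive": 3}
demographics = [
    {"customerKey": 1, "age": 30},
    {"customerKey": 2, "age": 40},
    {"customerKey": 3, "age": 50},
]
products = [
    {"customerKey": 1, "product": "Laptop"},
    {"customerKey": 2, "product": "LAPTOP"},
    {"customerKey": 3, "product": "Phone"},
]


def inner_join(left, right, key="customerKey"):
    """Inner-join two lists of dicts on a shared key column."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def blend(old_web, new_web, sentiment, rating, demographics, products):
    """Apply the exercise steps and return the mean age per product."""
    # Concatenate old and new web activity into one table.
    web = old_web + new_web
    # Replace text labels with numeric scores.
    scored = [{**row, "sentiment": rating[row["sentiment"]]} for row in sentiment]
    # Lowercase product names so spelling variants fall into one group.
    lowered = [{**row, "product": row["product"].lower()} for row in products]
    # Join step by step: web + sentiment, then demographics, then products.
    step1 = inner_join(web, scored)
    step2 = inner_join(step1, demographics)
    step3 = inner_join(step2, lowered)
    # Group by product and aggregate age with the mean.
    ages = defaultdict(list)
    for row in step3:
        ages[row["product"]].append(row["age"])
    return {product: mean(values) for product, values in ages.items()}


average_age = blend(old_web, new_web, sentiment, rating, demographics, products)
```

In this sample, "Laptop" and "LAPTOP" end up in the same group only because the names are lowercased before grouping, which is the reason for that step.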

Business Question: What is the average customer age for each product? Visualise the result in a bar chart.
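
As a quick sanity check of the chart outside KNIME, the averages can be rendered as a text bar chart. The sketch below assumes the result is available as a mapping from product name to mean age; the function name `text_bar_chart` and the sample values are hypothetical.

```python
def text_bar_chart(averages, width=40):
    """Render {label: value} as horizontal bars scaled to the largest value."""
    if not averages:
        return ""
    largest = max(averages.values())
    label_width = max(len(label) for label in averages)
    lines = []
    for label, value in sorted(averages.items()):
        # Scale each bar so the largest value fills the full width.
        bar_length = round(width * value / largest) if largest > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_length} {value:.1f}")
    return "\n".join(lines)


# Hypothetical averages, for illustration only.
chart = text_bar_chart({"laptop": 35, "phone": 50}, width=10)
```

Printing `chart` shows one bar per product in alphabetical order, with the mean age written after each bar.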

Tasks

Analyse and Profile data sources.

Blend together various data sources in an effective data structuring and transformation process.

Visualize the final result.

Document your flow.

Export your KNIME flow and rename the exported file to TAAK1_your name. Upload the file to Toledo (TAAK1).
