Icon

02_​ETL_​Basics

Example Workflow for ETL Basics Operations

In this workflow, a number of ETL operations are performed on the sales2008-2011.csv dataset. Besides showing what ETL features are, the goal of this workflow is to move from a series of contracts with different customers in different countries to a one-row summary description for each one of the customers.
The one-row description includes:
1. the customer unique ID
2. the total amount of money payed by the customer to the company
3. the countries the customer has been active in
4. the date of the first contract (this is always useful to estimate the customer loyalty)
5. the number of days between the first and the last purchase, that is the number of days the customer has been with the company
At the end, each one-row customer summary information is joined together with each contract data row from the original file and the resulting table is written to a CSV file in a "data" folder located in the workflow folder.

Nodes

Extensions

Links