Icon

01 ETL on new customer data - extract, transform, and load S3 and web service data

<p>The company has two branches: one in the United States and another one in Europe. New US customer data comes from a web service. New Europe customer data comes from Amazon S3.</p><p>We access, validate, prepare, and blend the data, anonymize it, and append it to the database table "customers".</p>

Extract

Exercise workflow 01

ETL on new customer data - extract, transform, and load S3 and web service data

Transform

Load

Step 3. Data Anonymization

Open the component to find the instructions


Step 4. Update the database

Open the component to find the next instructions


Step 1. Access from the Amazon S3 data lake, validate, clean, and standardize the Europe customers data

  1. Open the component to find the instructions


Step 2. Access from the web service, validate, clean, and standardize the US customers data

  1. Provide the web service URL.

  2. Open the component to find the instructions


Step 5. Import the credentials

Import flow variables from the caller (Workflow Service Input, flow variable ports)


Step 6.1. Handle errors

Initiate the Try-Catch construction:

  • Connect the Try (Variable Ports) node with Workflow Service Input and the both Data Access components


Step 6.2 Handle errors

  1. Close the Try-Catch construction to output an integer flow variable called ETL_failed indicating either a successful ETL process execution (=0) or a failure (=1)

    1. Add two Variable Creator nodes:

      • The 1st one should create a variable ETL_failed = 0

      • The 2nd one - a variable ETL_failed = 1

    2. Connect the Database Update component to the 1st Variable Creator node

    3. Provide the variable ETL_failed = 0 to the upper port of the Catch Errors (Var Ports) node and ETL_failed = 1 to the bottom port of the Catch Errors (Var Ports) node


Step 7. Export workflow status

  1. Filter out all the variables except ETL_failed, FailingNodeMessage, and FailingNode to the calling workflow (Variable Filter node)

  2. Export the mentioned variables to the caller workflow (Workflow Service Output node, flow variable ports)


Step 0. Authenticate

  • Provide the username and password to access the Amazon S3 bucket and the web service


* NOTE. Follow the instructions of the trainers.

If you don't have access to the credentials (e.g. for self-paced course users), use the data provided in the components.

Step 8. OPTIONAL

You can delete the Retrieve credentials component from this workflow since the credentials will be imported from the orchestration workflow. On the other hand, you can keep it here for debugging.

Learning objective: Learn to access and validate data from various sources, anonymize it, and update a database.


Workflow description: The company has two branches: one in the United States and another one in Europe. New US customer data comes from a web service. New Europe customer data comes from Amazon S3. We access, validate, prepare, and blend the data, anonymize it, and append it to the database table "customers".


You'll find the instructions to the exercises in the yellow annotations.

Blend the data from different sources
Concatenate
Execute up-stream before configuration
Database Update
Data Access Europe Customers
Catch Errors (Var Ports)
Data Access US Customers
Retrieve credentials
Status: failure
Variable Creator
Execute up-stream before configuration
Data Anonymization
Status: success
Variable Creator
Workflow execution status
Workflow Output
Try (Variable Ports)
Variable Filter
Import credentials from the caller
Workflow Input

Nodes

Extensions

Links