01.Parallel_Executions - Exercise

Call Sentiment Calculator over Large Datasets - Exercise

In this exercise you'll practice parallelizing an application to optimize its runtime. The application predicts the sentiment of a huge number of tweets.

Session 4 - Optimization, Orchestration and Best Practices

Input Ingestion
Here we read a large sample of tweets for sentiment prediction. Parallelization makes it faster. In the workflow, a CSV Reader node reads the large dataset and a Timer Info node checks how much time (in ms) the run takes without parallelization.

Step 1. Invoke Sentiment Predictor
1. Drag the Call Workflow Service node from the node repository to this part of the workflow.
2. When configuring the Call Workflow Service node, set the workflow relative path to '../Callee_Applications/04.Callee_Deploying_Sentiment_Predictor_-_Lexicon_Based'. Select 'Adjust node ports' and click 'OK'. Connect the output port of the CSV Reader node to its input port.
3. Click on the top right of the Call Workflow Service node and drag the red line that appears until it connects to the input port of the Timer Info node. You're connecting a flow variable from the former node to the flow variable input port of the latter. Execute the workflow and see how much time it takes to run (in ms). A rough Python analogue of this sequential, timed run is sketched after the steps below.

Step 2. Parallelize the Execution
Here we wrap the called workflow into a loop that parallelizes its execution.
1. Drag the Parallel Chunk Start and Parallel Chunk End nodes from the node repository to this part of the workflow. Connect the output port of the CSV Reader node to the input port of the Parallel Chunk Start node.
2. When configuring the Parallel Chunk Start node, select 'Use automatic chunk count'. When configuring the Parallel Chunk End node, select 'Add Chunk Index to Row ID'.
3. Connect the output port of the Parallel Chunk Start node to the input port of the Call Workflow Service node. Connect the output port of the Call Workflow Service node to the input port of the Parallel Chunk End node.
4. Click on the top right of the Parallel Chunk End node and drag the line that appears until it connects to the input port of the Timer Info node. Execute the workflow now and see how much time it takes to run (in ms). It should be significantly faster than the previous execution from Step 1. The second sketch below illustrates this chunked, parallel pattern.
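The exercise itself is carried out in the KNIME GUI, but the idea Step 1 measures can be sketched in plain Python: apply a simple lexicon-based scorer to every tweet sequentially and report the elapsed time in milliseconds. Everything here (the lexicon, predict_sentiment, the synthetic tweets list) is an illustrative assumption, not part of the KNIME workflow.

```python
# Minimal sketch (not KNIME code): a lexicon-based sentiment scorer applied
# sequentially to every tweet, timed in ms -- the analogue of Step 1's
# un-parallelized Call Workflow Service run. All names are illustrative.
import time

POSITIVE = {"good", "great", "love", "happy", "awesome"}
NEGATIVE = {"bad", "awful", "hate", "sad", "terrible"}

def predict_sentiment(tweet: str) -> str:
    """Score one tweet by counting lexicon hits (hypothetical stand-in
    for the deployed lexicon-based sentiment predictor)."""
    words = tweet.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

tweets = ["I love this!", "This is awful...", "Just a tweet."] * 100_000  # large sample

start = time.perf_counter()
predictions = [predict_sentiment(t) for t in tweets]   # sequential baseline
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Sequential run: {elapsed_ms:.0f} ms for {len(tweets)} tweets")
```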
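For Step 2, the Parallel Chunk Start/End pattern can be approximated the same way: split the rows into roughly one chunk per CPU core (the analogue of 'Use automatic chunk count'), process the chunks concurrently, prefix each row ID with its chunk index (the analogue of 'Add Chunk Index to Row ID'), and concatenate the results. Again, all names (predict_chunk, the lexicon, the pool size) are illustrative assumptions, not KNIME APIs.

```python
# Minimal sketch (not KNIME code) of what Parallel Chunk Start/End do conceptually:
# split the table into chunks, run the predictor on each chunk in parallel,
# tag each result row with its chunk index, then concatenate.
import os
import time
from concurrent.futures import ProcessPoolExecutor

def predict_chunk(args):
    """Score one chunk of tweets; returns (row_id, label) pairs."""
    chunk_index, tweets = args
    positive = {"good", "great", "love", "happy", "awesome"}
    results = []
    for row, tweet in enumerate(tweets):
        label = "positive" if any(w in positive for w in tweet.lower().split()) else "other"
        # Prefix the row ID with the chunk index so IDs stay unique after concatenation.
        results.append((f"Chunk{chunk_index}_Row{row}", label))
    return results

if __name__ == "__main__":
    tweets = ["I love this!", "This is awful...", "Just a tweet."] * 100_000
    n_chunks = os.cpu_count() or 4                      # "automatic chunk count"
    size = -(-len(tweets) // n_chunks)                  # ceiling division
    chunks = [(i, tweets[i * size:(i + 1) * size]) for i in range(n_chunks)]

    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=n_chunks) as pool:
        all_rows = [row for chunk in pool.map(predict_chunk, chunks) for row in chunk]
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"Parallel run: {elapsed_ms:.0f} ms across {n_chunks} chunks, {len(all_rows)} rows")
```

On a multi-core machine the reported time drops roughly in proportion to the number of chunks, which is the same effect you should observe between Step 1 and Step 2 in KNIME.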
