
01.Parallel_Executions - Solution

Call Sentiment Calculator over Large Datasets

In this exercise you'll practice parallelizing an application to optimize its runtime. The application predicts the sentiment of a huge amount of tweets.

Input Ingestion
Here we read a large sample of tweets for sentiment prediction. Parallelization makes it faster.

Step 1. Invoke the Sentiment Predictor
1. Drag the Call Workflow Service node from the node repository to this part of the workflow.
2. When configuring the Call Workflow Service node, set the parameter 'Workflow Path' to '../Callee_Applications/04.Callee_Deploying_Sentiment_Predictor_-_Lexicon_Based'. Select 'Adjust node ports' and click 'OK'. Connect the output port of the CSV Reader node to its input port.
3. Click on the top right of the Call Workflow Service node and drag the red line that appears until it connects to the input port of the Timer Info node. You are connecting a flow variable from the former node to the flow variable input port of the latter. Execute the workflow and note how much time it takes to run (in ms).

Step 2. Parallelize the Execution
Here we wrap the called workflow into a loop that parallelizes its execution.
1. Drag the Parallel Chunk Start and Parallel Chunk End nodes from the node repository to this part of the workflow. Connect the output port of the CSV Reader node to the input port of the Parallel Chunk Start node.
2. When configuring the Parallel Chunk Start node, select 'Use automatic chunk count'. When configuring the Parallel Chunk End node, select 'Add Chunk Index to Row ID'.
3. Connect the output port of the Parallel Chunk Start node to the input port of the Call Workflow Service node. Connect the output port of the Call Workflow Service node to the input port of the Parallel Chunk End node.
4. Click on the top right of the Parallel Chunk End node and drag the line that appears until it connects to the input port of the Timer Info node. Execute the workflow again and note how much time it takes to run (in ms). It should be significantly faster than the execution from Step 1.
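Conceptually, the Parallel Chunk loop splits the input table into chunks, runs the called workflow on each chunk concurrently, and reassembles the results with a chunk index appended to each row ID. The sketch below illustrates that idea in Python; the lexicon-based `predict_sentiment` function, the word lists, and the chunking helper are hypothetical stand-ins, not the actual KNIME implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the lexicon-based sentiment predictor workflow.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "awful", "hate"}

def predict_sentiment(tweet):
    words = tweet.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def chunked(rows, n_chunks):
    # Split rows into roughly equal chunks, like Parallel Chunk Start.
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def run_parallel(tweets, n_chunks=4):
    start = time.perf_counter()
    results = []
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # Each chunk is processed concurrently, like one branch of the loop.
        for chunk_idx, chunk_result in enumerate(
                pool.map(lambda c: [predict_sentiment(t) for t in c],
                         chunked(tweets, n_chunks))):
            # Like 'Add Chunk Index to Row ID' in Parallel Chunk End.
            results.extend((f"Row{i}_Chunk{chunk_idx}", s)
                           for i, s in enumerate(chunk_result))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return results, elapsed_ms
```

As in the exercise, the elapsed time is reported in milliseconds; for a genuinely large dataset and a CPU-bound predictor, splitting the work across chunks is what makes the second run faster than the first.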
Session 4 - Optimization, Orchestration and Best Practices
Exercise 01.Parallel_Executions

Workflow annotations: 'Read large dataset' (CSV Reader) and 'Check how much time (ms) it takes without parallelization' (Timer Info). Nodes used: CSV Reader, Call Workflow Service, Parallel Chunk Start, Parallel Chunk End, Timer Info.
