
03_Control_Workflow_for_Performance_and_Scalability_Measurements

Control Workflow for Performance and Scalability Measurements

This workflow was built to compare performance and scalability measurements across different execution modes.

First, a dataset of the defined size is created.
This dataset is then passed to multiple workflows. Three modes are compared: native KNIME, and KNIME with added Spark or H2O nodes. Finally, the measurements are repeated multiple times so that the results are independent of effects outside the workflow.
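The control flow described above can be sketched outside of KNIME as three nested loops: over data sizes, over validation runs, and over execution modes. The following Python sketch is only an illustration under stated assumptions; `make_table`, `run_benchmark`, and the mode callables are hypothetical names and are not part of the workflow or of any KNIME API. Each mode is assumed to be already initialized, so setup time stays out of the measurement, and garbage collection runs before each timed call.

```python
import gc
import time
from typing import Callable, Dict, List, Tuple


def make_table(n_rows: int) -> List[int]:
    """Hypothetical stand-in for generating a test table with n_rows rows."""
    return list(range(n_rows))


def run_benchmark(
    sizes: List[int],
    n_validations: int,
    modes: Dict[str, Callable[[List[int]], object]],
) -> Dict[Tuple[int, str], List[float]]:
    """Time every mode on every data size, repeated n_validations times.

    Returns a mapping from (row count, mode name) to the list of elapsed
    times in seconds, one entry per validation run.
    """
    results: Dict[Tuple[int, str], List[float]] = {}
    for n_rows in sizes:                      # for each defined data size
        table = make_table(n_rows)
        for _ in range(n_validations):        # for each validation run
            for name, run in modes.items():   # for each execution mode
                gc.collect()                  # start each run from a clean heap
                start = time.perf_counter()
                run(table)
                elapsed = time.perf_counter() - start
                results.setdefault((n_rows, name), []).append(elapsed)
    return results


if __name__ == "__main__":
    # Two toy "modes" standing in for native KNIME, H2O, or Spark execution.
    demo = run_benchmark([1_000, 10_000], 3, {"mode_a": sum, "mode_b": sorted})
    for (n_rows, name), timings in sorted(demo.items()):
        print(n_rows, name, [round(t, 6) for t in timings])
```

Keeping the repeated timings per size and mode, rather than a single number, makes it possible to check how much the runs vary before comparing modes.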

Workflow annotations:
Generate Context: done up front to remove its time from the overall time measurement.
Define Data Table Size: automatically create the data table with the defined row size.
Define Number of Validations: by using multiple validations you reduce effects from outside of KNIME.
Generate Test Data Tables: for each defined data size, for each validation.
Initialize Local Big Data Environment; Start H2O Local Context; Generate correct file path.
Ensures that memory is empty (Run Garbage Collector).
Execute in each of the three worlds: execute the workflow in all possible execution modes (Native KNIME, H2O and Spark) and collect speed measurements for each mode.

Nodes shown on the canvas: Call Workflow (Table Based), Table Creator, Table Row To Variable Loop Start, Counting Loop Start, Loop End, H2O Local Context, String Manipulation, Create Local Big Data Environment, Run Garbage Collector, Variable to Table Column.
