04_Parameter_Optimization_in_Spark

Mix and match Spark nodes with other KNIME nodes

This workflow mixes standard KNIME nodes with Spark nodes to find the optimal parameters for a k-means clustering using the hill-climbing approach. Other optimization strategies are available; see the Parameter Optimization Loop Start node description for details.
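The hill-climbing idea can be sketched outside KNIME in a few lines of Python. This is only an illustration under stated assumptions, not the node's implementation: `hill_climb_int` and `toy_score` are hypothetical names, the score is minimized, and the start point and neighbor selection used by KNIME's own search may differ.

```python
from typing import Callable, Dict, Tuple


def hill_climb_int(
    score: Callable[[int], float],
    start: int,
    lower: int,
    upper: int,
    step: int = 1,
) -> Tuple[int, float]:
    """Minimize score(k) over integers in [lower, upper] by hill climbing."""
    cache: Dict[int, float] = {}

    def evaluate(k: int) -> float:
        # Each evaluation stands in for one loop iteration (train, then score).
        if k not in cache:
            cache[k] = score(k)
        return cache[k]

    current = start
    current_score = evaluate(current)
    while True:
        # Only neighbors inside the allowed parameter range are considered.
        neighbors = [
            k for k in (current - step, current + step) if lower <= k <= upper
        ]
        if not neighbors:
            return current, current_score
        best = min(neighbors, key=evaluate)
        if evaluate(best) >= current_score:
            # No neighbor improves the score: a local optimum was reached.
            return current, current_score
        current, current_score = best, evaluate(best)


def toy_score(k: int) -> float:
    # Hypothetical stand-in for "train k-means with k clusters, then score".
    return (k - 5) ** 2 + 0.5 * k


best_k, best_score = hill_climb_int(toy_score, start=2, lower=2, upper=20)
```

Hill climbing only finds a local optimum, so the result can depend on the start value when the score is not unimodal in k.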

The workflow makes use of the Create Local Big Data Environment node to create a Spark context. You can swap this node out for a Create Spark Context (Livy) node to connect to a remote cluster.

Workflow diagram: Mix and Match Parameter Optimization in Spark. Nodes and their annotations: Create Local Big Data Environment; File Reader (training data); File Reader (test data); Table to Spark (KNIME table to DataFrame); Persist Spark DataFrame/RDD (cache DataFrame prior to loop execution); Parameter Optimization Loop Start (e.g. hillclimbing); Spark k-Means (train model in Spark with k controlled by the optimization loop); Spark MLlib to PMML; Cluster Assigner; Scoring (e.g. entropy); Parameter Optimization Loop End.
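The workflow scores each candidate clustering with a measure such as entropy. As a rough sketch of what such a score can look like, the function below computes the size-weighted mean Shannon entropy of reference labels within each cluster (lower is better, 0 means every cluster is pure). It assumes the test data carries a reference class column, which the description does not state, and `clustering_entropy` is a hypothetical helper; KNIME's scorer may define or normalize entropy differently.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def clustering_entropy(
    clusters: Sequence[Hashable], labels: Sequence[Hashable]
) -> float:
    """Size-weighted mean Shannon entropy (bits) of labels per cluster."""
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have the same length")
    if not clusters:
        raise ValueError("need at least one row")

    # Count how often each reference label occurs inside each cluster.
    by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(clusters, labels):
        by_cluster[cluster_id][label] += 1

    total = len(clusters)
    result = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        # Entropy of the label distribution within this cluster.
        h = -sum((n / size) * math.log2(n / size) for n in counts.values())
        # Weight by the share of rows that fall into this cluster.
        result += (size / total) * h
    return result


pure = clustering_entropy([0, 0, 1, 1], ["a", "a", "b", "b"])
mixed = clustering_entropy([0, 0, 0, 0], ["a", "a", "b", "b"])
```

A score like this could serve as the objective value for each k tried by the optimization loop.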
