
04_Parameter_Optimization_in_Spark

Workflow

Mix and match Spark nodes with other KNIME nodes

This workflow combines standard KNIME nodes with Spark nodes to find the optimal value of k for a k-means clustering, using the hillclimbing search strategy. Other search strategies are available; see the description of the Parameter Optimization Loop Start node for details.
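To illustrate the idea behind a hillclimbing search, here is a minimal Python sketch that operates on a single integer parameter such as k. It is not the implementation used by the Parameter Optimization Loop Start node. The function name `hill_climb`, its arguments, the fixed starting point, and the toy objective are all assumptions made for illustration. In the workflow, the score would come from training and scoring a Spark k-means model for each candidate k.

```python
def hill_climb(score, start, lower, upper, step=1):
    """Greedy hill climbing that minimizes score(k) over integers in [lower, upper].

    `score` is a hypothetical callable standing in for "train a model with k
    clusters and return its score"; lower scores are treated as better.
    """
    current = start
    current_score = score(current)
    while True:
        # Candidate moves: one step down and one step up, kept inside the range.
        neighbours = [k for k in (current - step, current + step)
                      if lower <= k <= upper]
        if not neighbours:
            return current, current_score
        # Score each neighbour once and pick the best one.
        scored = [(score(k), k) for k in neighbours]
        best_score, best = min(scored)
        # Stop as soon as no neighbour improves on the current point.
        if best_score >= current_score:
            return current, current_score
        current, current_score = best, best_score


# Toy objective (an assumption, not real clustering output): best value at k = 5.
best_k, best_value = hill_climb(lambda k: (k - 5) ** 2, start=2, lower=2, upper=20)
```

Hill climbing only evaluates the neighbours of the current point, so it typically needs far fewer model trainings than a full grid search, at the cost of possibly stopping at a local optimum.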

The workflow makes use of the Create Local Big Data Environment node to create a Spark context. You can swap this node out for a Create Spark Context (Livy) node to connect to a remote cluster.

Parameter Optimization, Grid Search, Hillclimbing
Workflow diagram: Mix and Match Parameter Optimization in Spark
Nodes shown: File Reader (training data), File Reader (test data), Create Local Big Data Environment, Table to Spark, Persist Spark DataFrame/RDD, Parameter Optimization Loop Start, Spark k-Means, Spark MLlib to PMML, Cluster Assigner, Scoring, Parameter Optimization Loop End
Annotations: KNIME table to DataFrame; cache DataFrame prior to loop execution; train model in Spark with k controlled by optimization loop; e.g. hillclimbing; e.g. entropy
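The scoring step in the workflow diagram is annotated "e.g. entropy". As a rough illustration of what such a score measures, the Python sketch below computes the size-weighted class entropy of a clustering against reference labels. This is a common textbook definition and is assumed here; it may not match the exact output of the KNIME scoring node. The function name `clustering_entropy` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def clustering_entropy(clusters, classes):
    """Size-weighted mean of the class entropy inside each cluster.

    0.0 means every cluster contains a single reference class; larger values
    mean the clusters mix classes more strongly.
    """
    # Count how often each reference class occurs inside each cluster.
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(clusters, classes):
        by_cluster[cluster][label] += 1

    total = len(clusters)
    entropy = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        # Shannon entropy (base 2) of the class distribution in this cluster.
        h = -sum((n / size) * log2(n / size) for n in counts.values())
        entropy += (size / total) * h
    return entropy


# Hypothetical sample: two clusters that separate the classes perfectly,
# and two clusters that mix them evenly.
pure = clustering_entropy([0, 0, 1, 1], ["a", "a", "b", "b"])
mixed = clustering_entropy([0, 0, 1, 1], ["a", "b", "a", "b"])
```

A score of this kind can serve as the objective for the optimization loop: the loop proposes a value of k, the model is trained and applied to the test data, and the resulting entropy is reported back as the value to minimize.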


Nodes

04_Parameter_Optimization_in_Spark consists of the following 13 node(s):

Plugins

04_Parameter_Optimization_in_Spark contains nodes provided by the following 3 plugin(s):