Spark Predictor (Classification)

This node classifies (labels) input data using a previously trained Spark ML classification model. All feature columns that were selected during model training must be present in the incoming DataFrame.
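
The feature-column requirement above can be checked before the node is executed. The following is a minimal sketch in plain Python, not part of the node itself: the function name `missing_feature_columns` and the column names are hypothetical, and the two lists are assumed to have been collected by hand from the training node's configuration and the input DataFrame's schema.

```python
def missing_feature_columns(training_features, input_columns):
    """Return the training feature columns absent from the input, in training order."""
    available = set(input_columns)
    return [name for name in training_features if name not in available]


# Hypothetical column names, for illustration only.
training_features = ["age", "income", "tenure"]
input_columns = ["customer_id", "age", "tenure"]

missing = missing_feature_columns(training_features, input_columns)
if missing:
    print("Missing feature columns:", ", ".join(missing))
```

Extra columns in the input (here `customer_id`) are not reported by this sketch; only columns used for training are checked.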

Note: This node is not compatible with Spark MLlib models. For these models please use the Spark Predictor node.

This node requires at least Apache Spark 2.0.

Options

Change prediction column name: When set, you can change the name of the prediction column. The default name is "Prediction (targetcolumn)".
Prediction Column: The desired name for the prediction column.
Append individual class probabilities: Select to append the probability of each class to the output. For each class, a new column named "P (targetcolumn=class)" will be appended.
Suffix for probability columns: If class probabilities are appended, the suffix allows you to avoid duplicate column names. Can be empty.
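
The naming rules described in these options can be sketched as a small helper. This is an illustration in plain Python under stated assumptions, not the node's implementation: the function `output_column_names` and its parameters are hypothetical, and the suffix is assumed to be appended to the end of each probability column name.

```python
def output_column_names(target_column, classes, prediction_name=None,
                        append_probabilities=False, suffix=""):
    """Build the list of columns appended to the input, following the option texts."""
    # Default name quoted in the options: "Prediction (targetcolumn)".
    prediction = prediction_name if prediction_name else f"Prediction ({target_column})"
    names = [prediction]
    if append_probabilities:
        # One column per class: "P (targetcolumn=class)".
        # Assumption: the suffix is placed at the end of the name.
        names.extend(f"P ({target_column}={cls}){suffix}" for cls in classes)
    return names


# Hypothetical target column and class labels.
default_names = output_column_names("churn", ["yes", "no"])
with_probabilities = output_column_names(
    "churn", ["yes", "no"], append_probabilities=True, suffix="_rf"
)
```

A non-empty suffix is mainly useful when several predictors are chained on the same DataFrame and each appends probabilities for the same target column.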

Input Ports

Spark ML classification model to use.
Spark DataFrame containing the input data to classify.

Output Ports

Input DataFrame with the appended prediction column and, if selected, columns for the class probabilities.

Views

This node has no views

Installation

To use this node in KNIME, install the KNIME Extension for Apache Spark (legacy) from the v5.6 update site.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.6.0.v202507151409

On NodePit since: 2025-08-15

Last update: 2025-08-16

KNIME versions: Since v4.0
