
Spark Normalizer

KNIME Extension for Apache Spark core infrastructure version 4.0.0.v201907300820 by KNIME AG, Zurich, Switzerland

This node normalizes the values of all selected (numeric) columns.

Options

Min-max normalization
Linear transformation of all values such that the minimum and maximum in each column match the specified values.
Z-score normalization (Gaussian)
Linear transformation such that the values in each column have a mean of 0.0 and a standard deviation of 1.0. The transformation standardizes the values; it does not change the shape of their distribution.
Normalization by decimal scaling
The value with the largest absolute value in a column (positive or negative) is divided by 10 repeatedly, j times in total, until its absolute value is less than or equal to 1. All values in the column are then divided by 10 to the power of j.
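
The three options can be illustrated with a minimal sketch in plain Python on a single column of numbers. This is not the node's Spark implementation; the function names are hypothetical, and the z-score version assumes the population standard deviation (the node may use the sample standard deviation instead).

```python
import math


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale linearly so the column minimum/maximum become new_min/new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: no spread to rescale, map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


def z_score(values):
    """Shift and scale so the column has mean 0.0 and standard deviation 1.0."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest count that brings max |v| to <= 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs > 1:
        max_abs /= 10
        j += 1
    return [v / 10 ** j for v in values]
```

For example, `decimal_scaling([-250.0, 40.0, 999.0])` finds j = 3, because 999 has to be divided by 10 three times before its absolute value drops to 1 or below, so every value is divided by 1000.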

Input Ports

Spark DataFrame/RDD requiring normalization of some or all columns.

Output Ports

Spark DataFrame/RDD with normalized columns.
PMML document containing the normalization parameters, which can be used in the "Spark Compiled Transformations Applier" node to normalize test data in the same way as the training data.
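
The idea behind the second output port, computing the parameters once on the training data and reusing them on test data, can be sketched as follows. This is a plain Python illustration for min-max normalization only; the dictionary is a hypothetical stand-in for the PMML document, and `fit_min_max` and `apply_min_max` are illustrative names, not part of KNIME or Spark.

```python
def fit_min_max(train_values, new_min=0.0, new_max=1.0):
    """Collect the parameters from the training column only."""
    return {
        "min": min(train_values),
        "max": max(train_values),
        "new_min": new_min,
        "new_max": new_max,
    }


def apply_min_max(values, params):
    """Normalize any column with previously fitted parameters."""
    span = params["max"] - params["min"]
    if span == 0:
        # Constant training column: map everything to new_min.
        return [params["new_min"] for _ in values]
    scale = (params["new_max"] - params["new_min"]) / span
    return [(v - params["min"]) * scale + params["new_min"] for v in values]


# Parameters come from the training data and are reused unchanged on test data.
params = fit_min_max([10.0, 20.0, 30.0])
test_scaled = apply_min_max([15.0, 40.0], params)
```

Because the parameters are fixed by the training data, a test value outside the training range (here 40.0) maps to a value above 1.0 rather than being clipped.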

Installation

To use this node in KNIME, install KNIME Extension for Apache Spark from the following update site:

KNIME 4.0
