Spark Normalizer

This node normalizes the values of all selected (numeric) columns.


Min-max normalization
Linear transformation of all values such that the minimum and maximum in each column equal the specified lower and upper bounds.
Z-score normalization (Gaussian)
Linear transformation such that the values in each column have a mean of 0.0 and a standard deviation of 1.0. Because the transformation is linear, it standardizes the values but does not change the shape of their distribution.
Normalization by decimal scaling
The value with the largest absolute value in a column (positive or negative) is divided by 10 repeatedly, j times, until its absolute value is less than or equal to 1. All values in the column are then divided by 10 to the power of j.
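
The three methods above can be sketched in plain Python on a single column of numbers. This is only an illustration of the arithmetic, not the node's implementation: the function names are hypothetical, the handling of constant columns is an assumption, and the z-score uses the population standard deviation (dividing by n), which may differ from what the node computes.

```python
import math


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the column minimum/maximum become new_min/new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


def z_score(values):
    """Linearly rescale values to mean 0.0 and standard deviation 1.0."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Assumption: a constant column is mapped to 0.0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide all values by 10**j, where j is the smallest number of
    divisions by 10 that brings the largest absolute value to <= 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest > 1:
        largest /= 10
        j += 1
    return [v / 10 ** j for v in values], j
```

For example, for the column 250, -30, 5 the largest absolute value is 250, which needs three divisions by 10 to reach 0.25, so j is 3 and every value is divided by 1000.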

Input Ports

Spark DataFrame/RDD requiring normalization of some or all columns.

Output Ports

Spark DataFrame/RDD with normalized columns.
PMML document containing the normalization parameters. It can be used in the "Spark Compiled Transformations Applier" node to normalize test data in the same way as the training data was normalized.
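
The idea of reusing normalization parameters can be illustrated with a minimal Python sketch, shown here for min-max normalization. It does not read or write PMML and does not use any KNIME or Spark API: a plain dictionary stands in for the stored parameters, and the names `fit_min_max` and `apply_min_max` are hypothetical.

```python
def fit_min_max(train_values, new_min=0.0, new_max=1.0):
    """Derive min-max parameters from the training column only."""
    return {
        "lo": min(train_values),
        "hi": max(train_values),
        "new_min": new_min,
        "new_max": new_max,
    }


def apply_min_max(params, values):
    """Normalize any column with previously derived parameters."""
    span = params["hi"] - params["lo"]
    if span == 0:
        # Assumption: a constant training column maps everything to new_min.
        return [params["new_min"] for _ in values]
    scale = (params["new_max"] - params["new_min"]) / span
    return [(v - params["lo"]) * scale + params["new_min"] for v in values]


# Parameters come from the training data and are reused for the test data.
params = fit_min_max([10.0, 20.0, 30.0])
train_normalized = apply_min_max(params, [10.0, 20.0, 30.0])
test_normalized = apply_min_max(params, [15.0, 40.0])
```

Because the parameters are taken from the training data, test values outside the training range end up outside the target range: here 40 lies above the training maximum of 30 and is mapped to a value greater than 1.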



This node has no views



