Spark Numeric Scorer

This node computes several statistics that compare a numeric column's values (ri) with predicted values (pi). It computes R² = 1-SSres/SStot = 1-Σ(pi-ri)²/Σ(ri-1/n*Σri)² (can be negative!), mean absolute error (1/n*Σ|pi-ri|), mean squared error (1/n*Σ(pi-ri)²), root mean squared error (sqrt(1/n*Σ(pi-ri)²)), and mean signed difference (1/n*Σ(pi-ri)). The computed values can be inspected in the node's view and/or further processed using the output table.
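
The formulas above can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the node's Spark implementation; the function name numeric_scores and the dictionary keys are hypothetical, and returning NaN for R² when SStot is zero is an assumption.

```python
import math

def numeric_scores(reference, predicted):
    """Compute the five scores for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not part of KNIME or Spark.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reference)
    mean_ref = sum(reference) / n
    # SSres = sum of (pi - ri)^2, SStot = sum of (ri - mean(r))^2
    ss_res = sum((p - r) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    mse = ss_res / n
    return {
        # Assumption: R^2 is reported as NaN when the reference is constant.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
        "mean_absolute_error": sum(abs(p - r) for r, p in zip(reference, predicted)) / n,
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "mean_signed_difference": sum(p - r for r, p in zip(reference, predicted)) / n,
    }

scores = numeric_scores([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 5.0])
# A prediction that is worse than always predicting the mean gives a negative R^2.
reversed_scores = numeric_scores([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

For the first call, R² is 0.7 and the mean signed difference is 0.25 (the predictions are too high on average). For the second call, R² is -3, which shows why the value can be negative.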

Options

Reference column
Column with the correct, observed, training data values. Rows with missing values in the selected column are ignored.
Predicted column
Column with the modeled, predicted data values. Computation fails if the selected column contains missing values.
Change column name
Change the default output column name.
Output column name
The name of the column in the output.
Output scores as flow variables
The scores can be exported as flow variables.
Prefix of flow variables
This option allows you to define a prefix for these variable identifiers so that name conflicts are resolved.
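
The missing-value rules for the Reference column and Predicted column options can be sketched as below. This is a minimal illustration in plain Python, not the node's code: the helper name paired_values is hypothetical, None stands in for a missing value, and checking the predicted value before the reference value (so that a row missing both still fails) is an assumption based on a literal reading of the option text.

```python
def paired_values(rows):
    """Filter (reference, predicted) pairs according to the missing-value rules.

    Hypothetical helper for illustration; None marks a missing value.
    """
    pairs = []
    for reference, predicted in rows:
        # Assumption: any missing predicted value aborts the computation,
        # even when the reference value in the same row is also missing.
        if predicted is None:
            raise ValueError("predicted column contains a missing value")
        if reference is None:
            continue  # rows with a missing reference value are ignored
        pairs.append((reference, predicted))
    return pairs

kept = paired_values([(1.0, 1.5), (None, 2.0), (3.0, 2.5)])
```

Here the second row is dropped because its reference value is missing, so only two pairs remain for scoring.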

Input Ports

Arbitrary input Spark DataFrame/RDD with at least two numeric columns to compare.

Output Ports

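A table with the computed statistical measures: R², mean absolute error, mean squared error, root mean squared error, and mean signed difference.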

Views

Statistics
A table with the statistical measures.
