Spark PCA

KNIME Extension for Apache Spark core infrastructure version 4.1.0.v201911281435 by KNIME AG, Zurich, Switzerland

This node performs a principal component analysis (PCA) on the given data using the Apache Spark implementation. The input data is projected from its original feature space into a space of (possibly) lower dimension with a minimum of information loss.

Options

Fail if missing values are encountered
If checked, execution fails when the selected columns contain missing values. If unchecked (the default), rows containing missing values are ignored and are not considered when computing the principal components.
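The two behaviours of this option can be illustrated with a minimal plain-Python sketch. It does not use Spark or the node's actual code: `handle_missing` and `fail_on_missing` are hypothetical names, and `None` stands in for a missing value.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def handle_missing(rows: List[Row], fail_on_missing: bool = False) -> List[Row]:
    """Return the rows usable for PCA (hypothetical helper, not the node's API)."""
    # A row is complete when none of its selected values is missing.
    complete = [row for row in rows if all(v is not None for v in row)]
    if fail_on_missing and len(complete) != len(rows):
        # Mirrors the checked option: stop instead of silently dropping rows.
        raise ValueError("Missing values encountered in the selected columns")
    # Mirrors the default: incomplete rows are simply left out.
    return complete


rows = [(1.0, 2.0), (None, 3.0), (4.0, 5.0)]
kept = handle_missing(rows)  # the second row is dropped
```

With `fail_on_missing=True` the same input raises a `ValueError` instead of returning the two complete rows.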
Target dimensions
Select the number of dimensions the input data is projected to. Choose one of:
  • Dimensions to reduce to: Directly specify the number of target dimensions. The specified number must be less than or equal to the number of input columns.
  • Minimum information fraction to preserve (%): Specify, as a percentage, the fraction of information from the input columns that must be preserved. This option requires Apache Spark 2.0 or higher.
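How a minimum information fraction can translate into a number of target dimensions is sketched below. This assumes that "information" means explained variance, as is usual for PCA, and that the variance of each component is available as an eigenvalue; whether the node computes it exactly this way is not stated here. The function name `dimensions_for_fraction` is hypothetical.

```python
from typing import Sequence


def dimensions_for_fraction(eigenvalues: Sequence[float], min_percent: float) -> int:
    """Smallest k whose k largest eigenvalues hold at least min_percent of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        # Stop at the first k that preserves the requested share.
        if cumulative / total * 100.0 >= min_percent:
            return k
    # Reached only if rounding keeps the final share just below the threshold.
    return len(ordered)


# Made-up eigenvalues for four input columns.
eigenvalues = [4.0, 2.0, 1.0, 1.0]
k = dimensions_for_fraction(eigenvalues, 75.0)  # first two components hold 6 of 8
```

In this example the first two components hold 75% of the total, so a threshold of 75 yields two dimensions, while a threshold of 90 needs all four.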
Replace original data columns
If checked, the projected DataFrame/RDD will not contain columns that were included in the principal component analysis. Only the projected columns and the input columns that were not included in the principal component analysis remain.
Columns
Select the columns to include in the principal component analysis, i.e., the original features.

Input Ports

Input Spark DataFrame/RDD

Output Ports

The input DataFrame/RDD projected onto the principal components. Input columns that were not included in the principal component analysis are retained.
A DataFrame/RDD with the principal components matrix.
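The relationship between the two outputs can be sketched in plain Python: each projected row is the input row multiplied by the principal components matrix. The layout assumed here (one matrix row per input column, one matrix column per component) and the name `project` are assumptions for illustration, and the sketch leaves out any centering or scaling the actual implementation may apply.

```python
from typing import List, Sequence


def project(rows: Sequence[Sequence[float]],
            components: Sequence[Sequence[float]]) -> List[List[float]]:
    """Multiply each row (length d) by a d x k matrix, giving k values per row."""
    d = len(components)
    k = len(components[0])
    projected = []
    for row in rows:
        if len(row) != d:
            raise ValueError("row length must match the number of matrix rows")
        # Entry j is the dot product of the row with component column j.
        projected.append(
            [sum(row[i] * components[i][j] for i in range(d)) for j in range(k)]
        )
    return projected


# Hypothetical 3 x 2 matrix: three input columns reduced to two components.
components = [[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]]
rows = [[2.0, 3.0, 5.0], [1.0, 4.0, 9.0]]
projected = project(rows, components)
```

With this deliberately trivial matrix the projection keeps the first two input values of each row and discards the third.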

Installation

To use this node in KNIME, install KNIME Extension for Apache Spark from the following update site:

KNIME 4.1
