Spark Column Filter

KNIME Extension for Apache Spark core infrastructure version 4.1.0.v201911281435 by KNIME AG, Zurich, Switzerland

This node filters columns out of the input Spark DataFrame/RDD; only the remaining columns are passed to the output Spark DataFrame/RDD. In the dialog, columns can be moved between the Include and Exclude lists.

Options

Manual Selection

Include
This list contains the names of those columns in the input Spark DataFrame/RDD to be included in the output Spark DataFrame/RDD.
Exclude
This list contains the names of those columns in the input Spark DataFrame/RDD to be excluded from the output Spark DataFrame/RDD.
Filter
Use one of these fields to filter either the Include or Exclude list for certain column names or name substrings.
Buttons
Use these buttons to move columns between the Include and Exclude lists. Single-arrow buttons move all selected columns. Double-arrow buttons move all columns, taking any active filter into account.
Enforce Exclusion
Select this option to keep the current exclusion list unchanged even if the input Spark DataFrame/RDD specification changes. If some of the excluded columns are no longer available, a warning is displayed. (New columns are automatically added to the inclusion list.)
Enforce Inclusion
Select this option to keep the current inclusion list unchanged even if the input Spark DataFrame/RDD specification changes. If some of the included columns are no longer available, a warning is displayed. (New columns are automatically added to the exclusion list.)
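
The difference between the two enforcement options can be illustrated with a small Python sketch. This is not the node's implementation; the function name `resolve_includes` and its parameters are hypothetical and only model the behavior described above on plain lists of column names.

```python
def resolve_includes(input_columns, saved_list, enforce):
    """Return (included, missing) for the current input columns.

    Hypothetical helper: `saved_list` is the list the user configured,
    i.e. the Include list for "inclusion" or the Exclude list for
    "exclusion".
    """
    available = set(input_columns)
    # Configured columns that no longer exist would trigger the warning.
    missing = [c for c in saved_list if c not in available]
    saved = set(saved_list)
    if enforce == "inclusion":
        # Only configured columns pass; new columns end up excluded.
        included = [c for c in input_columns if c in saved]
    elif enforce == "exclusion":
        # Everything except configured columns passes; new columns are included.
        included = [c for c in input_columns if c not in saved]
    else:
        raise ValueError("enforce must be 'inclusion' or 'exclusion'")
    return included, missing


# "email" is a column that appeared after the node was configured.
cols = ["id", "name", "age", "email"]
print(resolve_includes(cols, ["id", "name"], "inclusion"))
print(resolve_includes(cols, ["age", "zip"], "exclusion"))
```

With Enforce Inclusion the new column `email` is dropped, while with Enforce Exclusion it passes through; the second call also reports `zip` as a configured column that is missing from the input.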

Wildcard/Regex Selection

Enter a search pattern; columns whose names match it are moved into the Include list. The pattern can be either a wildcard expression ('?' matches any single character, '*' matches any sequence of characters) or a regular expression. You can also specify whether the pattern is case sensitive.
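
As a rough illustration of the wildcard rules, the following Python sketch translates a wildcard pattern into a regular expression and applies it to a list of column names. It assumes the pattern must match the whole column name, which the description above does not state explicitly; the helper names `wildcard_to_regex` and `match_columns` are hypothetical, and Python's regex dialect may differ in details from the one the node uses.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '?' and '*' wildcards into an equivalent regex string."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # any sequence, including empty
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "".join(parts)


def match_columns(columns, pattern, is_regex=False, case_sensitive=True):
    """Return the column names matched by the pattern (whole-name match assumed)."""
    regex = pattern if is_regex else wildcard_to_regex(pattern)
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(regex, flags)
    return [c for c in columns if compiled.fullmatch(c)]


columns = ["price", "Price_usd", "p1", "p22", "qty"]
print(match_columns(columns, "p?"))                        # wildcard, one character
print(match_columns(columns, "p*", case_sensitive=False))  # wildcard, ignore case
print(match_columns(columns, r"p\d+", is_regex=True))      # regular expression
```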

Type Selection

Select the column types that you want to include. Column types that are not currently present in the input are shown in italics.
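
Type selection can be pictured as a filter over (name, type) pairs. The sketch below is only an illustration: the function `select_by_type` and the type labels are hypothetical and do not reflect the node's actual type names.

```python
def select_by_type(schema, selected_types):
    """Return names of columns whose type is among the selected types."""
    wanted = set(selected_types)
    return [name for name, col_type in schema if col_type in wanted]


# Hypothetical schema; the type labels are placeholders.
schema = [("id", "integer"), ("name", "string"), ("score", "double")]
print(select_by_type(schema, ["string", "double"]))
# A selected type that is absent from the input simply matches nothing.
print(select_by_type(schema, ["boolean"]))
```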

Input Ports

Spark DataFrame/RDD from which columns are to be excluded.

Output Ports

Spark DataFrame/RDD excluding selected columns.

Installation

To use this node in KNIME, install KNIME Extension for Apache Spark from the following update site:

KNIME 4.1