
Create Spark Context

KNIME Extension for Apache Spark core infrastructure version 2.3.2.v201811051556 by KNIME AG, Zurich, Switzerland

Creates a new Spark context. The settings affect all connected Spark nodes and their successors.

The node's default values are taken from the Spark preference page, which you can open via File > Preferences > KNIME > Spark.

Options

Context Settings

Spark version
The Spark version used on Spark Jobserver.
Context name
The unique name of the context.
Destroy Spark context on dispose
If selected, the Spark context will be destroyed when the workflow or KNIME is closed. This way, resources on the cluster are released, but all data cached inside the Spark context are lost, unless they have been saved to persistent storage such as HDFS.
Delete Spark DataFrames/RDDs on dispose
If selected, KNIME deletes the created RDDs/DataFrames when the workflow or KNIME is closed. This way, resources on the cluster are released, but all data that the current workflow holds in the Spark context are lost, unless they have been saved to persistent storage such as HDFS.
Spark job log level
The log level to use for Spark jobs within the Spark runtime.
Override Spark settings
Select this option to set custom Spark settings.
Custom Spark settings
Custom Spark settings to add to or overwrite the default settings defined by Spark Jobserver. See Spark documentation for more information.
Hide warning about an existing Spark context
Enable this option to suppress a warning message shown when the Spark context to be created by this node already exists.
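
As a rough illustration of how "Custom Spark settings" add to or overwrite the defaults defined by Spark Jobserver, the sketch below merges two sets of properties. The helper name merge_spark_settings and all values shown are hypothetical; spark.executor.memory, spark.executor.cores and spark.serializer are standard Spark property names. This is not the node's actual implementation.

```python
def merge_spark_settings(defaults, custom):
    """Return a new dict in which custom settings add to or overwrite defaults."""
    merged = dict(defaults)   # copy, so the defaults stay untouched
    merged.update(custom)     # custom keys win on conflict, new keys are added
    return merged


# Hypothetical defaults, standing in for whatever Spark Jobserver defines.
jobserver_defaults = {
    "spark.executor.memory": "1g",
    "spark.executor.cores": "2",
}

# Hypothetical custom settings, as they might be entered in the node dialog.
custom_settings = {
    "spark.executor.memory": "4g",  # overwrites the default
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",  # added
}

effective = merge_spark_settings(jobserver_defaults, custom_settings)
```

Here the custom value for spark.executor.memory replaces the default, spark.executor.cores keeps its default, and spark.serializer is added.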

Connection Settings

Jobserver URL
The URL of the Spark Jobserver, including protocol and port, e.g. http://localhost:8090.
Authentication
Select
  • None, if Spark Jobserver does not require any credentials.
  • Username & password, then enter the respective credentials, which will be saved with the workflow.
  • Credentials, then select from the available credentials.
Jobserver response timeout (seconds)
Time in seconds to wait for a response when making a request to Spark Jobserver. A value of 0 means no timeout (wait indefinitely).
Spark job check frequency (seconds)
The interval, in seconds, at which KNIME polls the status of a job.
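
The two timing settings above can be pictured with a minimal polling sketch. It assumes a hypothetical fetch_status callable that returns a status string (in practice it would wrap an HTTP request to the Jobserver URL), and it assumes "RUNNING" and "FINISHED" as status values. The function names are illustrative only and do not reflect how KNIME implements this.

```python
import time


def response_timeout(seconds):
    """Map the timeout setting to a client timeout; 0 means no limit (None)."""
    return None if seconds == 0 else seconds


def wait_for_job(fetch_status, check_interval_s=1.0, sleep=time.sleep):
    """Poll fetch_status() until the job is no longer RUNNING.

    fetch_status is a hypothetical callable returning a status string.
    Returns the final status and the number of polls that were made.
    """
    polls = 0
    while True:
        status = fetch_status()
        polls += 1
        if status != "RUNNING":
            return status, polls
        sleep(check_interval_s)  # wait one check interval before polling again


# Simulated job that reports RUNNING twice and then FINISHED.
_statuses = iter(["RUNNING", "RUNNING", "FINISHED"])
final_status, polls = wait_for_job(
    lambda: next(_statuses),
    check_interval_s=0,
    sleep=lambda s: None,  # skip real sleeping in this demonstration
)
```

A shorter check interval reports job completion sooner at the cost of more requests to the Jobserver.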

Output Ports

Spark context.
