Create Spark Context (Jobserver)

This node is deprecated. It is kept for backwards compatibility, but using it in new workflows is no longer recommended.
Creates a new Spark context via Spark Jobserver.

Support for Spark Jobserver is deprecated and the Create Spark Context (Livy) node should be used instead.

The default settings of the node are taken from the Spark preference page, which you can open via File->Preferences->KNIME->Big Data->Spark.

Options

Context Settings

Spark version
The Spark version used by Spark Jobserver.
Context name
The unique name of the context.
Destroy Spark context on dispose
If selected, the Spark context will be destroyed when the workflow or KNIME is closed. This way, resources on the cluster are released, but all data cached inside the Spark context are lost, unless they have been saved to persistent storage such as HDFS.
Delete Spark DataFrames/RDDs on dispose
If selected, KNIME deletes the created RDDs/DataFrames when the workflow or KNIME is closed. This way, resources on the cluster are released, but all data that the current workflow holds in the Spark context are lost, unless they have been saved to persistent storage such as HDFS.
Spark job log level
The log level to use for Spark jobs within the Spark runtime.
Override Spark settings
Select this option to set custom Spark settings.
Custom Spark settings
Custom Spark settings to add to or overwrite the default settings defined by Spark Jobserver. See Spark documentation for more information.
Hide warning about an existing Spark context
Enable this option to suppress a warning message shown when the Spark context to be created by this node already exists.
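
The "add to or overwrite" behavior of the Custom Spark settings option can be pictured as a dictionary merge. The Python sketch below is illustrative only: the helper effective_settings is hypothetical (it is not part of KNIME or Spark Jobserver), and the values are made up. The keys spark.executor.memory, spark.executor.cores and spark.executor.instances are standard Spark configuration properties.

```python
def effective_settings(defaults, custom, override_enabled):
    """Hypothetical helper: combine Jobserver defaults with custom settings.

    Custom entries replace defaults that share a key and are added otherwise.
    If overriding is disabled, the defaults are returned unchanged.
    """
    merged = dict(defaults)
    if override_enabled:
        merged.update(custom)
    return merged


# Illustrative values only; real defaults come from the Spark Jobserver setup.
jobserver_defaults = {
    "spark.executor.memory": "1g",
    "spark.executor.cores": "2",
}
custom_settings = {
    "spark.executor.memory": "4g",     # overwrites the default
    "spark.executor.instances": "3",   # added, no default exists
}

settings = effective_settings(jobserver_defaults, custom_settings, override_enabled=True)
```

In this sketch, leaving "Override Spark settings" unselected corresponds to override_enabled=False, in which case only the Jobserver defaults apply.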

Connection Settings

Jobserver URL
The URL of the Spark Jobserver, including protocol and port, e.g. http://localhost:8090.
Authentication
Select one of the following:
  • None, if Spark Jobserver does not require any credentials.
  • Username & password, then enter the respective credentials, which will be saved with the workflow.
  • Credentials, then select from the available credentials.
Jobserver response timeout (seconds)
Time to wait for a response when making a request to Spark Jobserver (0 is infinite).
Spark job check frequency (seconds)
The interval, in seconds, at which KNIME polls the status of a job.
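
The two timing options above can be sketched in Python as follows. This is a minimal illustration, not KNIME's actual implementation: the function names, the injected fetch_status callable, and the set of terminal status strings are assumptions made for the example.

```python
import time

# Assumed terminal job states; the names are illustrative.
TERMINAL_STATES = {"FINISHED", "ERROR", "KILLED"}


def to_socket_timeout(response_timeout_seconds):
    """Map the node's convention (0 = wait forever) to Python's (None = wait forever)."""
    return None if response_timeout_seconds == 0 else response_timeout_seconds


def poll_until_done(fetch_status, check_frequency_seconds, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal state.

    Waits check_frequency_seconds between polls and returns the final
    status together with the number of polls that were made.
    """
    polls = 0
    while True:
        status = fetch_status()
        polls += 1
        if status in TERMINAL_STATES:
            return status, polls
        sleep(check_frequency_seconds)


# Simulated job that is still running for the first two polls.
fake_statuses = iter(["RUNNING", "RUNNING", "FINISHED"])
final_status, poll_count = poll_until_done(
    lambda: next(fake_statuses),
    check_frequency_seconds=1,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

A smaller check interval surfaces job completion sooner at the cost of more requests to the Jobserver, while the response timeout only bounds how long each individual request may take.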

Input Ports

This node has no input ports

Output Ports

Spark context.

Views

This node has no views
