This node is currently not available in KNIME v5.10; this page describes it as of KNIME v3.7.

Database to Spark

Reads a database query/table into a Spark RDD/DataFrame. See Spark documentation for more information.

Notice: This feature requires at least Apache Spark 1.5.

Options

Driver: Upload the local driver (the one used in this KNIME instance) or rely on a driver provided on the cluster side.
Fetch size: Optional. The JDBC fetch size, which determines how many rows are fetched per round trip. This can improve performance with JDBC drivers that default to a low fetch size (e.g. Oracle with 10 rows).
Partition column, lower bound, upper bound, num partitions: If any of these options is specified, all of them must be. They describe how to partition the table when reading in parallel from multiple workers. partitionColumn must be a numeric column of the table in question. Note that lowerBound and upperBound only determine the partition stride; they do not filter rows, so all rows in the table are partitioned and returned.
Query DB for upper and lower count: Fetch the bounds with a min/max query, or use manually entered bounds.
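
The partitioning options can be illustrated with a small sketch. It is not the node's or Spark's actual implementation. It is a simplified approximation, assuming integer bounds, of how a JDBC reader can turn a partition column, lower bound, upper bound and partition count into one WHERE clause per partition. The function name partition_predicates and the column name id are hypothetical. The first and last partitions are left open-ended, which is why the bounds only set the stride and never drop rows.

```python
def partition_predicates(column, lower, upper, num_partitions):
    """Build one SQL predicate per partition (simplified, hypothetical sketch).

    The bounds are only used to compute the stride. The first partition has
    no lower limit and the last has no upper limit, so rows outside
    [lower, upper] are still read.
    """
    if num_partitions <= 1:
        # A single partition reads the whole table.
        return ["1 = 1"]

    # Integer stride between partition boundaries (at least 1).
    stride = max((upper - lower) // num_partitions, 1)

    predicates = []
    boundary = lower
    for i in range(num_partitions):
        low = None if i == 0 else f"{column} >= {boundary}"
        boundary += stride
        high = None if i == num_partitions - 1 else f"{column} < {boundary}"

        if low is None:
            # First partition: open below, also picks up NULL values.
            predicates.append(f"{high} OR {column} IS NULL")
        elif high is None:
            # Last partition: open above.
            predicates.append(low)
        else:
            predicates.append(f"{low} AND {high}")
    return predicates


for predicate in partition_predicates("id", 0, 100, 4):
    print(predicate)
```

With these inputs, a row with id = 500 lies outside the given bounds but still matches the last predicate (id >= 75). If the real values fall far outside the bounds, the outer partitions become much larger than the others.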

Input Ports

Input query
Required Spark context

Output Ports

Spark RDD/DataFrame

Views

This node has no views

Installation

To use this node in KNIME, install the extension KNIME Extension for Apache Spark from the update site below, following the NodePit Product and Node Installation Guide:

v3.7

A zipped version of the software site can be downloaded here.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 2.4.0.v201811301556

On NodePit since: 2018-08-10

Last update: 2026-02-19

KNIME versions: From v3.6 to v3.7
