
Databricks File System Connector

KNIME Databricks Spark context and DBFS connector version 4.3.2.v202103021015

This node connects to the Databricks File System (DBFS) of a Databricks deployment. The resulting output port allows downstream nodes to access DBFS as a file system, e.g. to read or write files and folders, or to perform other file system operations (browse/list files, copy, move, ...).

Path syntax: Paths for DBFS are specified with a UNIX-like syntax, for example /myfolder/file.csv, which is an absolute path that consists of:

  1. A leading slash (/).
  2. The name of a folder (myfolder), followed by a slash.
  3. The name of a file (file.csv).
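
As an illustration of the path syntax above, the following minimal sketch splits a DBFS-style path into the parts listed. It uses only Python's standard library. The helper name describe_dbfs_path is hypothetical and is not part of KNIME or Databricks.

```python
from pathlib import PurePosixPath


def describe_dbfs_path(path: str) -> dict:
    """Split a UNIX-like path into the parts described in the list above.

    Hypothetical helper for illustration only; not a KNIME or Databricks API.
    """
    p = PurePosixPath(path)
    return {
        # True when the path starts with a leading slash
        "absolute": p.is_absolute(),
        # Folder names between the leading slash and the file name
        "folders": [part for part in p.parent.parts if part != "/"],
        # Last path component, here the file name
        "name": p.name,
    }


if __name__ == "__main__":
    print(describe_dbfs_path("/myfolder/file.csv"))
```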

Options

Settings

Databricks URL
Full URL of the Databricks deployment, e.g. https://<account>.cloud.databricks.com on AWS or https://<region>.azuredatabricks.net on Azure.
Authentication
A workflow token, a username and password, or credentials can be used for authentication. Databricks strongly recommends tokens. See the authentication section of the Databricks AWS or Azure documentation for more information about personal access tokens.
  • Username/password: Authenticate with the provided Username and Password. If entered here, the password is persistently stored (in encrypted form) in the settings of this node. Alternatively, if Use credentials is selected, the username and password of the selected credentials flow variable will be used for authentication.
  • Token: Authenticate with the provided personal access token. If entered here, the token is persistently stored (in encrypted form) in the settings of this node. Alternatively, if Use credentials is selected, the password of the selected credentials flow variable will be used as the token for authentication (username of the flow variable will be ignored).
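
To make the two authentication modes concrete, here is a minimal sketch of how a client outside KNIME might build the HTTP Authorization header for each mode. It is not how the node is implemented internally, which this page does not document. It assumes the Databricks REST API accepts a personal access token as a Bearer token, and shows username/password as HTTP Basic authentication. The function names and the example URL are hypothetical, and no request is actually sent.

```python
import base64
from typing import Dict, Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_auth_header(token: Optional[str] = None,
                      username: Optional[str] = None,
                      password: Optional[str] = None) -> Dict[str, str]:
    """Return an Authorization header for token or username/password auth."""
    if token:
        # Token mode: the personal access token is sent as a Bearer token
        return {"Authorization": f"Bearer {token}"}
    if username is not None and password is not None:
        # Username/password mode: HTTP Basic authentication
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError("Provide either a token or a username and password")


def token_from_credentials(username: str, password: str) -> str:
    """Mirror the 'Use credentials' token option: the password field carries
    the token and the username is ignored."""
    return password


def build_list_request(base_url: str, path: str, headers: Dict[str, str]) -> Request:
    """Build (but do not send) a request that lists a DBFS folder.

    Assumes the public DBFS REST endpoint /api/2.0/dbfs/list.
    """
    url = f"{base_url.rstrip('/')}/api/2.0/dbfs/list?{urlencode({'path': path})}"
    return Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # Placeholder values; replace with a real deployment URL and token
    hdr = build_auth_header(token=token_from_credentials("ignored", "my-token"))
    req = build_list_request("https://example.cloud.databricks.com", "/myfolder", hdr)
    print(req.full_url, req.get_header("Authorization"))
```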
Working directory
Specifies the working directory using the path syntax explained above. The working directory must be specified as an absolute path. A working directory allows downstream nodes to access files/folders using relative paths, i.e. paths that do not have a leading slash. If not specified, the default working directory is "/".
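
The relationship between the working directory and relative paths can be sketched as follows. This is an illustration of the rule described above, not the node's own code, and the function name resolve_path is hypothetical.

```python
import posixpath


def resolve_path(path: str, working_dir: str = "/") -> str:
    """Resolve a path against an absolute working directory.

    Absolute paths (leading slash) are returned as-is after normalization;
    relative paths are interpreted relative to the working directory.
    """
    if not working_dir.startswith("/"):
        raise ValueError("The working directory must be an absolute path")
    # posixpath.join discards working_dir when path is itself absolute
    return posixpath.normpath(posixpath.join(working_dir, path))


if __name__ == "__main__":
    print(resolve_path("raw/file.csv", working_dir="/data"))
    print(resolve_path("/myfolder/file.csv", working_dir="/data"))
    print(resolve_path("file.csv"))
```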

Advanced

Connection timeout
Timeout in seconds to establish a connection, or 0 for an infinite timeout.
Read timeout
Timeout in seconds to read data from an established connection, or 0 for an infinite timeout.
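
The convention that 0 means an infinite timeout differs from many client libraries, where a missing timeout (for example None in Python) means "wait forever" and 0 may mean "do not block at all". A minimal sketch of the translation, assuming a client that takes a (connect, read) pair with None for no limit; the function names are hypothetical:

```python
from typing import Optional, Tuple


def to_client_timeout(seconds: int) -> Optional[float]:
    """Translate the node's convention (0 = infinite) to None = no limit."""
    if seconds < 0:
        raise ValueError("Timeout must be 0 (infinite) or a positive number of seconds")
    return None if seconds == 0 else float(seconds)


def timeout_pair(connect_seconds: int,
                 read_seconds: int) -> Tuple[Optional[float], Optional[float]]:
    """Return a (connect, read) timeout pair for an HTTP client."""
    return (to_client_timeout(connect_seconds), to_client_timeout(read_seconds))


if __name__ == "__main__":
    print(timeout_pair(30, 0))
```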

Output Ports

Databricks File System Connection

Installation

To use this node in KNIME, install KNIME Databricks Integration from the following update site:

KNIME 4.3

A zipped version of the software site can be downloaded here.

