HDFS Connector

This node connects to a Hadoop Distributed File System using HDFS, WebHDFS or HTTPFS. The resulting output port allows downstream nodes to access the files of the remote file system, e.g. to read or write, or to perform other file system operations (browse/list files, copy, move, ...).

Path syntax: Paths for HDFS are specified with a UNIX-like syntax, e.g. /myfolder/myfile. An absolute path for HDFS consists of:

  1. A leading slash ("/").
  2. Followed by the path to the file ("myfolder/myfile" in the above example).
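The two-part path rule above can be sketched in a few lines of Python. This is only an illustration of the syntax, not code used by the node; the function names is_absolute_hdfs_path and split_hdfs_path are hypothetical.

```python
def is_absolute_hdfs_path(path: str) -> bool:
    """Return True if the path starts with a leading slash."""
    return path.startswith("/")


def split_hdfs_path(path: str) -> list:
    """Split an absolute path into its components, dropping the leading slash."""
    if not is_absolute_hdfs_path(path):
        raise ValueError(f"not an absolute path: {path!r}")
    # Empty strings come from the leading slash (and any doubled slashes).
    return [part for part in path.split("/") if part]


print(is_absolute_hdfs_path("/myfolder/myfile"))
print(split_hdfs_path("/myfolder/myfile"))
```

A path without the leading slash, such as myfolder/myfile, is treated as relative and is rejected by split_hdfs_path in this sketch.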

SSL: This node uses the JVM SSL settings.

Options

Settings

Protocol
HDFS protocol to use.
Host
Address of HDFS name node or WebHDFS/HTTPFS node.
Port
Use the default or a custom port to connect to the HDFS name node or WebHDFS/HTTPFS node.

Note: The WebHDFS default ports are the Hadoop 3.x default ports. On Hadoop 2.x, the default WebHDFS port is 50070, or 50470 with SSL.
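As a quick reference, the WebHDFS defaults can be written as a small lookup table. The Hadoop 2.x values come from the note above; the Hadoop 3.x values (9870, and 9871 with SSL) are the stock Apache Hadoop defaults and are an addition here, so a given cluster may well be configured differently. The names WEBHDFS_DEFAULT_PORTS and webhdfs_default_port are hypothetical.

```python
# WebHDFS default ports, keyed by (Hadoop major version, SSL enabled).
# Assumption: stock Apache Hadoop defaults; clusters can override these.
WEBHDFS_DEFAULT_PORTS = {
    (2, False): 50070,
    (2, True): 50470,
    (3, False): 9870,
    (3, True): 9871,
}


def webhdfs_default_port(hadoop_major: int, ssl: bool = False) -> int:
    """Look up the default WebHDFS port for a Hadoop major version."""
    return WEBHDFS_DEFAULT_PORTS[(hadoop_major, ssl)]


print(webhdfs_default_port(2))
print(webhdfs_default_port(3, ssl=True))
```

When connecting to a Hadoop 2.x cluster, this suggests choosing a custom port in the node dialog rather than relying on the default.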
Authentication
  • Username: Pseudo/Simple authentication using a given username.
  • Kerberos: Kerberos ticket based authentication.
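To give a feel for what username (pseudo/simple) authentication means at the protocol level, the sketch below builds a WebHDFS REST URL that passes the user via the user.name query parameter. This illustrates the public WebHDFS REST API only and makes no claim about how the node implements its connection; the function name webhdfs_list_url, the host name, and the user name are hypothetical. Kerberos authentication works differently (ticket-based) and is not covered by this sketch.

```python
from urllib.parse import quote, urlencode


def webhdfs_list_url(host: str, port: int, path: str, username: str,
                     ssl: bool = False) -> str:
    """Build a WebHDFS LISTSTATUS URL using simple (username) authentication."""
    scheme = "https" if ssl else "http"
    # user.name carries the user when the cluster runs without Kerberos.
    query = urlencode({"op": "LISTSTATUS", "user.name": username})
    # quote() keeps "/" intact and escapes other unsafe characters.
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical host and user, for illustration only.
print(webhdfs_list_url("namenode.example.com", 9870, "/myfolder", "alice"))
```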
Working directory
Specify the working directory of the resulting file system connection, using the path syntax explained above. The working directory must be specified as an absolute path. A working directory allows downstream nodes to access files/folders using relative paths, i.e. paths that do not have a leading slash. The default working directory is the root "/".
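The effect of a working directory on relative paths can be sketched with Python's posixpath module, which follows the same UNIX-like conventions. This is an assumption-laden illustration rather than the node's actual resolution logic; the function name resolve and the example directories are hypothetical.

```python
import posixpath


def resolve(working_dir: str, path: str) -> str:
    """Resolve a path against an absolute working directory."""
    if not working_dir.startswith("/"):
        raise ValueError("working directory must be an absolute path")
    # posixpath.join discards working_dir when path is already absolute.
    return posixpath.normpath(posixpath.join(working_dir, path))


print(resolve("/user/alice", "data/file.csv"))  # relative path
print(resolve("/user/alice", "/tmp/file.csv"))  # absolute path is kept as-is
print(resolve("/", "myfolder/myfile"))          # default working directory
```

With the default working directory "/", a relative path and its absolute counterpart point to the same location.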

Input Ports

This node has no input ports.

Output Ports

HDFS File System Connection.

Views

This node has no views.
