Hive Connector

This node is deprecated. This version of the node has been replaced with a new and improved version. The old version is kept for backwards compatibility, but for all new workflows we suggest using the replacement named below.
Suggested replacement: Hive Connector

This node is part of the deprecated database framework. For more information on how to migrate to the new database framework see the migration section of the database documentation.

This node creates a connection to a HiveServer2 via its JDBC driver. You need to provide the server's hostname (or IP address), the port, and a database name. Login credentials can either be provided directly in the configuration or via credentials set on the workflow.

The node supports the Cloudera JDBC drivers, which are available for download from the Cloudera homepage. To register the driver, follow the instructions in the database documentation.

Options

Hostname
The hostname (or IP address) of a HiveServer2.
Port
The port on which HiveServer2 is listening. The default port is 10000.
Database name
The name of the database you want to connect to.
Parameter
Optional connection parameters such as SSL or authentication options. See the HiveServer2 Client documentation for details.

For settings specific to the Cloudera driver, see the Cloudera JDBC Driver for Apache Hive Install Guide.
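
As a rough illustration of how the hostname, port, database name, and Parameter field relate to each other, the sketch below assembles a URL in the Apache Hive JDBC format jdbc:hive2://<host>:<port>/<dbName>;<sessionConfs>. The helper name build_hive_jdbc_url and the host hive2_host are hypothetical, and it is an assumption that the node appends the Parameter field after a semicolon in this way; the actual URL the node builds may differ, in particular with the Cloudera driver.

```python
def build_hive_jdbc_url(host, port=10000, database="default", parameters=""):
    """Assemble a HiveServer2 JDBC URL from the dialog settings.

    Illustrative helper only; it is not part of the node or of any driver.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    # Session parameters follow the database name, separated by semicolons.
    params = parameters.strip().strip(";")
    if params:
        url += ";" + params
    return url


# Placeholder host name; "ssl=true" is one example of an optional parameter.
print(build_hive_jdbc_url("hive2_host", parameters="ssl=true"))

# A Kerberos principal passed through the Parameter field.
print(build_hive_jdbc_url("hive2_host", 10000, "default",
                          "principal=hive/hive2_host@YOUR-REALM.COM"))
```

Several parameters can be combined in the Parameter field by separating them with semicolons, for example ssl=true;principal=hive/hive2_host@YOUR-REALM.COM.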

Use credentials
Select this option if you want to provide authentication data via workflow credentials. Then select the desired credential name in the list below.
Use username & password
Provide a username and a password for authentication. The password may be optional if the server is configured accordingly.
Use Kerberos
Uses an existing Kerberos ticket for authentication. When connecting to HiveServer2, the principal of the HiveServer2 user needs to be added in the Parameter field, e.g. principal=hive/hive2_host@YOUR-REALM.COM. The principal must be the same user principal that was used to start HiveServer2. For details see the Hive documentation or the Cloudera JDBC Driver for Apache Hive Install Guide.
Timezone correction
Select the time zone to convert date, time, or timestamp fields into. The current implementation can't represent time zones. To support persisting those fields into a database, the time values can be shifted according to the selected time zone, whose offset (including daylight saving time) is applied to the original values:
  • No correction (use UTC) is used for workflows created before 2.8 and doesn't apply any correction,
  • Use local timezone uses the local time zone offset to correct the date field before reading or writing, and
  • Use selected timezone allows selecting the time zone to convert the date values into.
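
The arithmetic behind the time zone correction can be sketched as follows. This is a minimal illustration, not the node's implementation: the helper correct_timestamp is hypothetical, the direction of the shift (adding the offset) is an assumption, and the offsets are hard-coded for Central European Time (UTC+1 in winter, UTC+2 during daylight saving time) instead of being looked up from a time zone database.

```python
from datetime import datetime, timedelta


def correct_timestamp(value, utc_offset):
    """Shift a zone-less timestamp by a UTC offset (illustrative helper)."""
    return value + utc_offset


NO_CORRECTION = timedelta(0)        # "No correction (use UTC)"
BERLIN_WINTER = timedelta(hours=1)  # CET, standard time
BERLIN_SUMMER = timedelta(hours=2)  # CEST, daylight saving time

winter_value = datetime(2016, 1, 15, 12, 0)
summer_value = datetime(2016, 7, 15, 12, 0)

print(correct_timestamp(winter_value, NO_CORRECTION))  # value is left unchanged
print(correct_timestamp(winter_value, BERLIN_WINTER))  # shifted by one hour
print(correct_timestamp(summer_value, BERLIN_SUMMER))  # shifted by two hours
```

The same selected time zone therefore produces different shifts depending on whether the original value falls into daylight saving time.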
Validate connection on close
Check this option if you want to validate the connection when closing the dialog.
Retrieve metadata in configure
This option controls subsequent nodes. Usually, when a database node is configured, it retrieves the metadata of the current table or query from the database for use in subsequent nodes. If metadata retrieval is slow, it noticeably slows down workflow configuration and execution, especially since metadata is retrieved in both configure and execute. In such cases it's better to switch this option off and retrieve metadata only during execute.

Input Ports

This node has no input ports

Output Ports

A database JDBC connection

Views

This node has no views
