Delta Table Reader (Labs)

Reads Delta Lake tables by loading the latest snapshot of the specified table. The node can optionally use a file system input port to access remote storage such as Microsoft Fabric OneLake. It is designed to make reading tabular data from Delta Lake easy, though it currently does not support complex nested structures.

Note: When selecting a folder, ensure you choose the root directory of the Delta Table, which must contain the _delta_log folder.
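
The root-folder requirement can be checked up front. The following is a minimal sketch, not part of the node: the helper name is_delta_table_root is hypothetical, and it only tests for a _delta_log subfolder on a local path.

```python
from pathlib import Path


def is_delta_table_root(folder: str) -> bool:
    """Return True if `folder` directly contains a `_delta_log` directory.

    Hypothetical helper for local paths only; it does not validate the
    contents of the transaction log.
    """
    return (Path(folder) / "_delta_log").is_dir()
```

Selecting the _delta_log folder itself, or a parent of the table folder, would make this check return False.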

Options

Source
Select the file location that stores the data you want to read. Clicking the browse button offers two default file system options:
  • The current Hub space: Lets you select a file relative to the Hub space on which the workflow is run.
  • URL: Lets you specify a URL (e.g. with the file://, http:// or knime:// protocol).

Note: When selecting a folder, make sure to select the root folder of the Delta Table which contains the _delta_log folder.
Skip first data rows
Use this option to skip the specified number of data rows.
Limit number of rows
If enabled, only the specified number of data rows are read.
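
How the two row options combine can be sketched in a few lines. This is an illustration only, under the assumption that rows are skipped first and the limit is applied to what remains; the function name skip_and_limit and the sample rows are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def skip_and_limit(rows: Iterable, skip: int = 0,
                   limit: Optional[int] = None) -> Iterator:
    """Skip the first `skip` rows, then yield at most `limit` rows.

    Assumption: skipping is applied before the limit. `limit=None`
    means the limit option is disabled.
    """
    stop = None if limit is None else skip + limit
    return islice(rows, skip, stop)


rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]
print(list(skip_and_limit(rows, skip=1, limit=2)))  # [('b', 2), ('c', 3)]
```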
If there are unsupported column types
Delta tables can contain columns with types that this node does not support, for example complex nested types. This option determines whether the node fails or ignores such columns.
  • Fail: If set, the node fails on Delta Tables with unsupported column types.
  • Ignore columns: If set, columns with unsupported column types are ignored.
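
The two policies above can be sketched as follows. This is not the node's implementation: the schema is modeled as a plain name-to-type mapping, and the set of supported type names is a made-up placeholder, since the node's actual list is not given here.

```python
from typing import Dict, Set

# Hypothetical set of supported type names, for illustration only.
SUPPORTED_TYPES: Set[str] = {"string", "integer", "long", "double", "boolean"}


def select_columns(schema: Dict[str, str],
                   on_unsupported: str = "fail") -> Dict[str, str]:
    """Apply the 'fail' or 'ignore' policy to a name -> type mapping."""
    if on_unsupported not in ("fail", "ignore"):
        raise ValueError(f"Unknown policy: {on_unsupported}")
    unsupported = [name for name, typ in schema.items()
                   if typ not in SUPPORTED_TYPES]
    if unsupported and on_unsupported == "fail":
        # 'Fail': stop as soon as any unsupported column is present.
        raise ValueError(f"Unsupported column types in: {unsupported}")
    # 'Ignore columns': keep only the columns with supported types.
    return {name: typ for name, typ in schema.items()
            if typ in SUPPORTED_TYPES}
```

With a schema such as {"id": "long", "tags": "array<string>"}, the "ignore" policy keeps only the id column, while "fail" raises an error.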
If schema changes
Specifies the node behavior if the content of the configured file/folder changes between executions, i.e., columns are added to or removed from the file(s), or their types change. The following options are available:
  • Fail: If set, the node fails if the column names in the file have changed. Changes in column types will not be detected.
  • Use new schema: If set, the node computes a new table specification for the current schema of the file at the time the node is executed. Note that the node does not output a table specification before execution and does not apply transformations; the transformation tab is therefore disabled.
  • Ignore (deprecated): If set, the node tries to ignore the changes and outputs a table with the old table specification. This option is deprecated and should never be selected for new workflows, as it may lead to invalid data in the resulting table. Use one of the other options instead.
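
The limitation of the Fail option, that only column names are compared, can be illustrated with a small sketch. The function name names_changed and the sample schemas are hypothetical, and comparing names without regard to column order is an assumption of this example.

```python
from typing import Dict


def names_changed(old: Dict[str, str], new: Dict[str, str]) -> bool:
    """Compare two name -> type mappings by column name only.

    Assumption: column order is ignored. Type changes are deliberately
    not inspected, mirroring the behavior described for 'Fail'.
    """
    return set(old) != set(new)


old = {"id": "long", "price": "double"}
print(names_changed(old, {"id": "long", "price": "string"}))   # False: only a type changed
print(names_changed(old, {"id": "long", "amount": "double"}))  # True: a column was renamed
```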
Enforce types
Controls how columns whose type changes are dealt with. If selected, the mapping to the KNIME type you configured is attempted. The node will fail if that is not possible. If unselected, the KNIME type corresponding to the new type is used.
Transformations
Use this option to modify the structure of the table. You can deselect a column to filter it out of the output table, use the arrows to reorder the columns, or change the name or type of each column. Note that the positions of columns are reset in the dialog if a new file or folder is selected. Whether and where to add unknown columns during execution is specified via the special row <any unknown new column>. It is also possible to select the type that new columns should be converted to. Note that the node will fail if this conversion is not possible, e.g., if the selected type is Integer but the new column is of type Double.

Input Ports

File system

Output Ports

Delta Table

Views

This node has no views
