Excel Reader (XLS)

This Node Is Deprecated — This node is kept for backwards-compatibility, but the usage in new workflows is no longer recommended. The documentation below might contain more information.

This node reads a spread sheet and provides it at its output port.
It reads only data from one sheet at the moment. It can read only numeric, date, boolean and string data but, of course, no diagrams, pictures, or other items.
Currently supported KNIME types are String, TimeAndDate, Double, Boolean, and Int.
It reads in the data from the sheet and sets a type for all columns that is compatible with the data in that column (in the worst case "String" covers all).
The performance of the reader node is limited (due to the underlying library of the Apache POI project). Reading large files takes a very long time and uses a lot of memory (especially files in xlsx format when formula reevaluation is selected).

Options

File to read: Enter a valid file name. You can also choose a previously selected file from the drop-down list, or select a file from the "Browse..." dialog. Supported are xls and xlsx file formats. (Note: reading large xlsx files is very slow and memory consuming when formula reevaluation is selected.)
Sheet to Read: After selecting a file, you can select the sheet from the available sheets in the file.
Connect timeout[s]: Timeout in seconds when reading remote files. 0 means no timeout. In case it takes longer to connect to the host, the node fails to execute.
Column Names: If you want to use column names from the spread sheet, check "Table contains column names" and enter the number of the row that contains the column names (enter the number (starting at one), not the index).
If this option is not checked, default names are generated ("Col0", "Col1", etc.).
Row IDs: The option "Generate RowIDs (index incrementing, starting with 'Row0')" generates row ids starting from Row0 and produces all integer numbers till the end. The option "Generate RowIDs (index as per sheet content, skipped rows will increment index)" generates row ids based on sheet row indices starting from Row0, but numbers might get skipped when the row is skipped.
If you want to use row IDs from the spread sheet, check "Table contains row IDs" and provide the column that contains the row ids. Enter the label of the column ("A", "B", etc.) or the number (starting at one) of the column.
The rowIDs in the sheet must be unique, otherwise the execution fails. If you check "Make row IDs unique", the node will append a suffix to duplicates, ensuring row ID uniqueness. For very large data sets this could cause memory problems.
Select the columns and rows to read: Specify the region of the data sheet that should be read in. If you check "Read entire data sheet" the region provided by the data sheet file will be read in. This includes areas where diagrams, borders, coloring, etc. are placed and could create quite some empty rows or columns (see skip empty rows and columns options below).
If you want to read in a fixed area, remove the check mark and enter the first and last column, and first and last row to read. (Last row and column is optional, causing it to read to the last row or column provided by the sheet). For columns you need to enter the label ("A", "B", etc.), for rows enter the number (one-based).
Skip Empty Rows or Columns: If empty rows or columns should be removed from the result Data Table, check the appropriate option.
Skip hidden columns: Hidden columns are removed when checked.
Evaluation Error Handling: Specify the data that is inserted in a cell with an error. Not all formulas supported by all spreadsheet applications are supported by the XLS Reader Node. If an error occurs during formula evaluation (or if the spreadsheet contains an error cell), the selected data is inserted. You can either choose to insert a cell representing a missing value, or to insert a certain string pattern. A pattern causes the entire column to become a string column in case an error occurs. A missing value is type innocent, but also unobtrusive.
Reevaluate formulas: When checked it reads the whole file into memory and reevaluates the formulas before returning the result. This can cause errors when there are functions -not implemented by the Apache POI library- used on the sheet. When unchecked, xls files are still read to the memory, but the cached values are returned. For xlsx files, only the requested sheet is loaded when no reevaluation is requested and requires much less memory to return the cached values. If uncertain leave unchecked unless there is a concrete reason.
Disable Preview: Preview loading is disabled and no data table configuration is computed, so changes in the format of the xls(x) file will not generate errors after execution. The drawback of the enabled state of this option is that subsequent nodes do not have information about the output table structure.
Preview: The "Preview" tab shows you the output table with the current settings in the dialog. If the settings are invalid an error message will be displayed in this tab. The table updates only when the "refresh" button is clicked. A warning is shown, if the table content is out of sync with the current settings.
File Content: The "File Content" table shows you the content of the currently selected sheet. It shows the entire content (no settings are applied there). The column names here are the ones that need to be specified in the corresponding setting fields, row numbers are 0-based.

Input Ports

This node has no input ports

Output Ports

: The data table read in.

Popular Predecessors

Popular Successors

Views

This node has no views

Workflows

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.

Installation

To use this node in KNIME, install the extension KNIME Excel Support from the below update site following our NodePit Product and Node Installation Guide:

v5.5

A zipped version of the software site can be downloaded here.

Plugin provider: KNIME AG, Zurich, Switzerland

Plugin version: 5.5.0.v202412191418

On NodePit since: 2025-07-02

Last update: 2025-07-22

Tags: StreamableDeprecated

KNIME versions: Since v3.6

Deploy, schedule, execute, and monitor your KNIME workflows locally, in the cloud or on-premises – with our brand new NodePit Runner.

Try NodePit Runner!