Table Validator

KNIME Base Nodes version 4.1.0.v201912041211 by KNIME AG, Zurich, Switzerland

This node ensures a certain table structure and table content using a reference data table specification defined by the user in the configuration dialog. The configuration starts from the specification of the input table at configuration time, which serves as the basic template for the output table. The node ensures that the structure of the result table matches the user-defined specification as closely as possible. It does so by reordering columns, inserting missing columns (filled with missing values), and optionally removing additional columns.

You can also choose for each column (or a group of columns) whether it is required and whether the data type or the domain should be checked or converted. To configure this individual handling, select a column or a list of columns to be handled, drag them to the "+" button that appears, and set the parameters. To remove the individual handling (and use the default handling instead), click the "Remove" button for the column.

If the validation succeeds, the data is output at the first port (potentially renamed, sorted according to the user-defined specification, and with converted types). If the validation fails, either the first port is inactive and the second port contains a table that lists all conflicts, or the node fails. All options below that are marked with Data also force a traversal of the input data.
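
The structural part of this behavior can be illustrated with a minimal Python sketch. It is not KNIME code: the function name `align_to_reference` is hypothetical, rows are modeled as plain dictionaries, and `None` stands in for a KNIME missing value. It assumes that additional columns are removed, which is only one of the available options.

```python
def align_to_reference(rows, reference_columns):
    """Return rows with exactly the reference columns, in reference order.

    Columns absent from a row are filled with None (standing in for a
    missing value); columns not in the reference are dropped.
    """
    return [
        {name: row.get(name) for name in reference_columns}
        for row in rows
    ]


rows = [{"b": 2, "a": 1, "extra": 9}]
aligned = align_to_reference(rows, ["a", "b", "c"])
print(aligned)  # [{'a': 1, 'b': 2, 'c': None}]
```

The column "c" is inserted with a missing value, "extra" is removed, and the remaining columns follow the order of the reference.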

Options

General settings

Behavior on validation issues
Defines how validation faults should influence the following workflow.
  • Fail node - Forces the node to fail; the exception carries an appropriate message containing detailed descriptions of the validation faults. The traversal of the data is canceled if the structural comparison has already failed.
  • Deactivate first output port - The node never fails, but the first output port is set inactive. Validation results are presented at the second output port as a data table that contains the Column name, an Error ID (one of: COLUMN_NOT_CONTAINED, CONTAINS_MISSING_VALUE, INVALID_DATATYPE, CONVERTION_FAILED, OUT_OF_DOMAIN) and a human-readable Description for each validation fault. The data is traversed completely, regardless of potential structural differences. This option is useful if a complete validation of the input data is desired, for example if the workflow is used within the WebPortal, to avoid trial-and-error passes.
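
The difference between the two behaviors can be sketched in Python as follows. This is an illustration only, not the node's implementation: `Conflict`, `ValidationError` and `report` are hypothetical names, and only the three conflict fields and the error IDs are taken from the description above.

```python
from dataclasses import dataclass

# Error IDs as listed in the node description (spelling kept as-is).
ERROR_IDS = {
    "COLUMN_NOT_CONTAINED",
    "CONTAINS_MISSING_VALUE",
    "INVALID_DATATYPE",
    "CONVERTION_FAILED",
    "OUT_OF_DOMAIN",
}


@dataclass
class Conflict:
    """One row of the conflict table (hypothetical model)."""
    column: str
    error_id: str
    description: str


class ValidationError(Exception):
    """Raised when the 'Fail node' behavior is selected (hypothetical)."""


def report(conflicts, fail_node):
    """Return (first_port_active, conflict_table) or raise on failure."""
    if conflicts and fail_node:
        raise ValidationError("; ".join(c.description for c in conflicts))
    return (not conflicts, list(conflicts))


conflicts = [Conflict("age", "OUT_OF_DOMAIN", "Value 150 exceeds maximum 120")]
active, table = report(conflicts, fail_node=False)
print(active, len(table))  # False 1
```

With `fail_node=True` the same call raises `ValidationError` instead of returning the conflict table.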
Handling of unknown columns
Defines how columns that are not included in the reference table spec are handled.
  • Don't allow unknown columns - Unknown columns will cause a validation issue.
  • Remove unknown columns - Unknown columns will be removed.
  • Sort them to the end - Unknown columns will be shifted to the end of the table.
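
The three options for unknown columns can be sketched on column names alone. The function `handle_unknown` and the mode strings `"reject"`, `"remove"` and `"sort_to_end"` are hypothetical labels chosen for this example. The sketch ignores columns that are missing from the input, which are covered by a separate setting.

```python
def handle_unknown(input_columns, reference_columns, mode):
    """Return the output column order for the given handling mode."""
    known = [c for c in reference_columns if c in input_columns]
    unknown = [c for c in input_columns if c not in reference_columns]
    if mode == "reject":
        if unknown:
            raise ValueError(f"Unknown columns: {unknown}")
        return known
    if mode == "remove":
        return known
    if mode == "sort_to_end":
        # Unknown columns keep their relative input order.
        return known + unknown
    raise ValueError(f"Unsupported mode: {mode}")


cols = ["x", "b", "a", "y"]
print(handle_unknown(cols, ["a", "b"], "remove"))       # ['a', 'b']
print(handle_unknown(cols, ["a", "b"], "sort_to_end"))  # ['a', 'b', 'x', 'y']
```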

Validation Settings

Fail if column is missing (Structure)
Ensures that the configured columns exist in the input table. If case insensitive name matching is selected, the first matching column satisfies this condition.
Case insensitive name matching (Structure)
Columns with a similar name (differing only in case) are also validated according to this configuration. Use this option with care, because the assignment of a column to a configuration is computed at runtime and is not trivial. The rules are as follows.
  1. Exact name match - Assigns the configuration with the exact name. The name is marked as used and cannot match any following input columns again.
  2. First matching configuration - Assigns the first configuration with a matching name to the column. The name is marked as used and cannot match any following input columns again.
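
One possible reading of these two rules is sketched below. The function `assign_configurations` is hypothetical, and the sketch assumes that input columns are processed one by one in table order, trying rule 1 before rule 2 for each column. The description does not state this explicitly, so the actual node may resolve ties differently.

```python
def assign_configurations(input_columns, config_names):
    """Map each input column to at most one configuration name."""
    used = set()
    assignment = {}
    for column in input_columns:
        if column in config_names and column not in used:
            # Rule 1: exact name match.
            match = column
        else:
            # Rule 2: first unused configuration matching ignoring case.
            match = next(
                (name for name in config_names
                 if name not in used and name.lower() == column.lower()),
                None,
            )
        if match is not None:
            used.add(match)  # a used name cannot match later columns
            assignment[column] = match
    return assignment


result = assign_configurations(["ID", "Name", "name"], ["id", "name"])
print(result)  # {'ID': 'id', 'Name': 'name'}
```

In this example the input column "name" stays unassigned because the configuration "name" was already consumed by the earlier column "Name". This is the kind of non-obvious outcome that calls for care when enabling the option.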
Fail on missing value (Data)
Fails if the column contains any missing value.
Check data type (Structure|Data)
Ensures a correct data type.
  • Fail if different - Fails if the reference data type is not a super type of the input column spec. I.e. it checks that the input column implements all DataValue classes that are also implemented by the reference column's data type.
  • Try to convert; fail if not compatible
  • Try to convert; insert missing if not compatible
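
The three options can be compared with a simplified Python sketch. It checks individual values with `isinstance`, whereas the node compares column data types in the table specification, so this is an analogy and not the actual mechanism. The function `check_type` and the mode strings are hypothetical.

```python
def check_type(values, target_type, mode):
    """Check or convert values according to the selected mode."""
    result = []
    for value in values:
        if value is None or isinstance(value, target_type):
            result.append(value)  # missing or already compatible
            continue
        if mode == "fail_if_different":
            raise TypeError(f"INVALID_DATATYPE: {value!r}")
        try:
            result.append(target_type(value))
        except (TypeError, ValueError):
            if mode == "convert_or_fail":
                raise
            result.append(None)  # "convert_or_missing": insert missing
    return result


converted = check_type(["1", "x", 3], int, "convert_or_missing")
print(converted)  # [1, None, 3]
```

With `"convert_or_fail"` the value "x" raises a `ValueError` instead of being replaced by a missing value.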
Check possible values (Data)
Checks if each data object is contained in the possible values of the reference domain. The option is only enabled if any configured column defines possible values.
  • Fail if out of domain
  • Replace with missing values
Check min & max (Data)
Checks if each data object is between min and max defined by the domain of the reference specification. The option is only enabled if any configured column defines possible values.
  • Fail if out of domain
  • Replace with missing values
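
Both domain checks (possible values and min & max) follow the same pattern, which the sketch below expresses with a single hypothetical function `check_domain` that takes the domain test as a predicate. It assumes that missing values (`None`) are skipped here because they are covered by the separate "Fail on missing value" option.

```python
def check_domain(values, in_domain, replace_with_missing):
    """Fail on, or blank out, values that are outside the domain."""
    checked = []
    for value in values:
        if value is None or in_domain(value):
            checked.append(value)
        elif replace_with_missing:
            checked.append(None)
        else:
            raise ValueError(f"OUT_OF_DOMAIN: {value!r}")
    return checked


# Possible values check against a set of allowed values.
colors = check_domain(["red", "pink"], lambda v: v in {"red", "green"}, True)
# Min & max check against a closed interval.
ages = check_domain([5, 150], lambda v: 0 <= v <= 120, True)
print(colors, ages)  # ['red', None] [5, None]
```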
Set input table as reference
Sets the input table specification as reference specification.

Reference Spec

Reference Spec
The reference specification.
Input Spec
The input specification. Only visible if it differs from the reference specification.

Input Ports

Table to be validated.

Output Ports

Table with corrected and validated structure. Depending on the validation result and the "Behavior on validation issues" setting, this port may be inactive.
Table that lists the validation faults (Column name, Error ID and Description). Depending on the validation result and the "Behavior on validation issues" setting, this port may be inactive.

Installation

To use this node in KNIME, install KNIME Core from the following update site:

KNIME 4.1