Row Filter

The node filters an input table according to the given filter criteria. Each criterion can target the row number, RowID, or cell value of a row. Multiple criteria can be combined (similar to boolean logic via AND and OR) to specify the overall filter criterion applied to each row.

Options

Match row if matched by
Match the row if all or any criteria match:
  • All criteria: a row is matched if all of the criteria match (intersection of matches)
  • Any criterion: a row is matched if at least one of the criteria matches (union of matches)
Filter criteria
The list of criteria that should be filtered on.
  • Filter column: The column on which to apply the filter.
    The special column "RowID" represents the RowID of the input and is treated as a String column that is never missing. The special column "Row number" targets the 1-based row number of the input, is treated as a Long column and is never missing. Both special columns are always available, regardless of the input table spec or data.

    Columns containing data types that are non-native, i.e. contain cells of heterogeneous data types, or do not offer a conversion from and to a string representation are not supported and are filtered out from the available options.
    Collection columns are also not supported by the node.
  • Operator: The operator defines whether a particular value passes the filter criterion or not. A value matches the filter criterion, if the operator applied to it returns "true". If the operator returns "false" or a missing value, the value does not match the filter criterion. Not all operators offered by this node may be applicable to a particular column data type. Only the applicable operators are shown for the selected column.

    Missing value handling: All operators except "Is missing" or "Is not missing" return a missing cell if they encounter a missing cell as input. Therefore, a missing cell is matched if and only if the filter operator is "Is missing" or "Is not missing". Consequently, the filter behavior follows the semantics of SQL missing value filtering, e.g. when using the DB Row Filter node or SQL WHERE clause in the DB Query node.
    • Equals: Value in column must be equal to the specified reference value. Equality is define by the particular data type(s) involved and may be on the value's string representation.
    • Does not equal: Value in column must be not equal to specified reference value.
    • Less than: Value in column must be strictly smaller than specified value.
      This operator is applicable for all data types that offer a more meaningful ordering than just lexicographic ordering. In particular, this includes by default numeric types and Date & Time types. String and Boolean types are not supported. The same requirements apply to the other ordering-based operators: "Less than", "Less than or equal", "Greather than", and "Greater than or equal".
    • Less than or equal: Value in column must be smaller than or equal to specified value
    • Greater than: Value in column must be strictly larger than specified value
    • Greater than or equal: Value in column must be larger than or equal than specified value
    • First n rows: Matches the specified number of rows counted from the start of the input.
    • Last n rows: Matches the specified number of rows counted from the end of the input.
    • Matches regex: Value in column must match the specified regular expression.
      This operator is applicable to all data types that are string-compatible, i.e. offer a meaningful string representation of themselves, or integral numbers. In particular, this includes Date & Time types. The same requirements apply to the "Matches wildcard" operator.

      Regex matching behavior: By default, the regex pattern must match the whole cell value, not just parts of it, since the regex pattern is configured with the DOTALL and MULTILINE flags enabled. To disable the DOTALL flag, prefix the pattern with (?-s), to disable MULTILINE use prefix (?-m). To disable both, use (?-sm).
    • Matches wildcard: Value in column must match the specified pattern, which may contain wildcards * and ?.
    • Is true: Boolean value in column must be true
    • Is false: Boolean value in column must be false
    • Is missing: Value in column must be missing
    • Is not missing: Value in column must not be missing
  • Filter value: The value for the filter criterion.

    Note: Currently, comparison values for non-numeric and non-string data types, e.g. date&time-based types, must be entered as its string representation like in the Table Creator node.
    The format for date&time-based values is "ISO-8601 extended". For example, a "Local Date" must be entered in the format "2006-07-28". More information can be obtained from the ISO patterns in the "Predefined Formatters" table of the Java SE 17 documentation.
Column domains
Specify whether to take domains of all input columns as output domains as-is or compute them on the output rows.
Depending on the use case, one or the other setting may be preferable:
  • Retaining input columns can be useful, if the axis limits of a view should be derived from domain bounds, and that bounds should stay stable even when the displayed data is filtered.
  • Computing domains can be useful when a selection widget consumes the output and should only display actually present options to users.
If column domains are irrelevant for a particular use case, the "Retain" option should be used since it does not incur computation costs.
For more control over individual column domains, you can use the Domain Calculator, Edit Numeric Domain, or Edit Nominal Domain nodes.
  • Retain: Retain input domains on output columns, i.e. the upper and lower bounds or possible values in the table spec are not changed, even if one of the bounds or one value is fully filtered out from the output table. If the input does not contain domain information, so will the output.
  • Compute: Compute column domains on output columns, i.e. upper and lower bounds and possible values are computed only on the rows output by the node.
Filter behavior
Determines whether only matching or non-matching rows are output.

Input Ports

Icon
Data table from which to filter rows

Output Ports

Icon
Data table with rows meeting the specified criterion

Popular Predecessors

  • No recommendations found

Popular Successors

  • No recommendations found

Views

This node has no views

Workflows

Links

Developers

You want to see the source code for this node? Click the following button and we’ll use our super-powers to find it for you.