Column Auto Type Cast

This node determines the most specific type for each of the configured string columns and changes the column types accordingly. Types are tried in this order: date, integer, long, double, and finally string. For dates, a custom format can be specified.
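The type order described above can be illustrated with a minimal Java sketch. This is not the node's actual implementation: the class and method names are hypothetical, and standard JDK parsers stand in for whatever checks the node performs internally.

```java
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of "most specific type" detection; not KNIME code.
class TypeOrderSketch {

    // True if the parser accepts every value without throwing.
    static boolean allParse(List<String> values, Consumer<String> parser) {
        for (String value : values) {
            try {
                parser.accept(value);
            } catch (RuntimeException e) {
                return false;
            }
        }
        return true;
    }

    // Tries date, integer, long, double in that order; falls back to string.
    static String inferType(List<String> values, DateTimeFormatter dateFormat) {
        if (values.isEmpty()) {
            return "string"; // assumption: nothing to infer from
        }
        if (allParse(values, v -> dateFormat.parse(v))) {
            return "date";
        }
        if (allParse(values, v -> Integer.parseInt(v))) {
            return "integer";
        }
        if (allParse(values, v -> Long.parseLong(v))) {
            return "long";
        }
        if (allParse(values, v -> Double.parseDouble(v))) {
            return "double";
        }
        return "string";
    }

    public static void main(String[] args) {
        DateTimeFormatter format = DateTimeFormatter.ofPattern("yyyy-MM-dd");
        System.out.println(inferType(List.of("1", "2", "3"), format));
        System.out.println(inferType(List.of("1", "3000000000"), format));
        System.out.println(inferType(List.of("1", "2.5"), format));
        System.out.println(inferType(List.of("2001-07-04", "2002-01-31"), format));
        System.out.println(inferType(List.of("1", "abc"), format));
    }
}
```

A single value that fails a check pushes the whole column to the next, less specific type, which is why a column mixing "1" and "2.5" ends up as double in this sketch.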

Options

Column filter
Select the string columns to consider for automatic type casting. Only columns compatible with String are offered. The filter supports manual selection and wildcard/regex.
Choose a date&time format
Choose or enter a date, time, or date&time pattern used to detect dates in the selected columns. The system default locale is used. For further configuration options, use the String to Date&Time node.
The parser depends on the setting Use legacy date&time type: when it is checked, DateFormat is used; otherwise DateTimeFormatter is used. DateFormat might not support every placeholder listed below.
Examples:
  • "yyyy.MM.dd HH:mm:ss.SSS" produces dates such as "2001.07.04 12:08:56.000"
  • "yyyy-MM-dd'T'HH:mm:ss.SSSZ" produces dates such as "2001-07-04T12:08:56.235-0700"
  • "yyyy-MM-dd'T'HH:mm:ss.SSSXXX'['VV']'" produces dates such as "2001-07-04T12:08:56.235+02:00[Europe/Berlin]"
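The three example patterns above can be reproduced with java.time's DateTimeFormatter. The following is a small sketch; the class and method names are hypothetical, and Locale.ENGLISH is fixed here only to keep the output deterministic (the node itself uses the system default locale).

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.util.Locale;

// Hypothetical helper class that formats sample values with the example patterns.
class PatternExamples {

    static String format(String pattern, TemporalAccessor value) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(value);
    }

    // Local date&time without zone information.
    static String localExample() {
        return format("yyyy.MM.dd HH:mm:ss.SSS",
                LocalDateTime.of(2001, 7, 4, 12, 8, 56));
    }

    // Date&time with an RFC822-style offset (Z placeholder).
    static String offsetExample() {
        return format("yyyy-MM-dd'T'HH:mm:ss.SSSZ",
                OffsetDateTime.of(2001, 7, 4, 12, 8, 56, 235_000_000, ZoneOffset.ofHours(-7)));
    }

    // Date&time with ISO8601 offset (XXX) and zone ID (VV) in literal brackets.
    static String zonedExample() {
        return format("yyyy-MM-dd'T'HH:mm:ss.SSSXXX'['VV']'",
                ZonedDateTime.of(2001, 7, 4, 12, 8, 56, 235_000_000, ZoneId.of("Europe/Berlin")));
    }

    public static void main(String[] args) {
        System.out.println(localExample());
        System.out.println(offsetExample());
        System.out.println(zonedExample());
    }
}
```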
Supported placeholders in the pattern are:
  • G: era
  • u: year
  • y: year of era
  • D: day of year
  • M: month in year (context sensitive)
  • L: month in year (standalone form)
  • d: day of month
  • Q/q: quarter of year
  • Y: week based year (you probably want to use y instead)
  • w: week of week based year
  • W: week of month
  • E: day of week
  • e: localized day of week
    • e: 4
    • ee: 04
    • eee: Wed
    • eeee: Wednesday
    • eeeee: W
  • c: day of week
  • F: day-of-week in month
  • a: am/pm of day
  • h: clock hour of am/pm (1-12)
  • K: hour of am/pm (0-11)
  • k: clock hour of day (1-24)
  • H: hour of day (0-23)
  • m: minute of hour
  • s: second of minute
  • S: fraction of second
  • A: milli of day
  • n: nano of second
  • N: nano of day
  • V: time zone ID
  • z: time zone name
  • O: localized zone offset
  • x: zone offset (ISO8601)
    • x: +08 or +0830
    • xx: +0800 or +0830 (no colons)
    • xxx: +08:00 or +08:30 (with colons)
    • xxxx: +0800 or +083015 (i.e. including offset seconds, no colons)
    • xxxxx: +08:00 or +08:30:15 (i.e. including offset seconds, with colons)
  • X: same as x, but outputs Z when offset is 0
  • Z: zone offset (RFC822)
    • Z, ZZ, ZZZ: +0800 or +0830
    • ZZZZ: GMT+08:00 or GMT+08:30
    • ZZZZZ: +08:00 or +08:30:15
  • p: pad next
  • ' : escape for text
  • '': single quote
  • [: optional section start
  • ]: optional section end
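Two of the placeholders listed above are easy to misread: the difference between x and X, and the optional section brackets. The sketch below shows both with DateTimeFormatter. The class and method names are hypothetical, and the fixed Locale.ENGLISH is an assumption made for reproducible output.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.util.Locale;

// Hypothetical demo of the offset placeholders and optional sections.
class PlaceholderSketch {

    // Formats only the offset part of a fixed sample timestamp.
    static String formatOffset(String pattern, ZoneOffset offset) {
        OffsetDateTime sample = OffsetDateTime.of(2001, 7, 4, 12, 8, 56, 0, offset);
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(sample);
    }

    // Parses a date with an optional time part and reports whether a time was present.
    static boolean hasTime(String text) {
        DateTimeFormatter format =
                DateTimeFormatter.ofPattern("yyyy-MM-dd[ HH:mm]", Locale.ENGLISH);
        TemporalAccessor parsed = format.parse(text);
        return parsed.isSupported(ChronoField.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        // x always prints a numeric offset, X prints Z for a zero offset.
        System.out.println(formatOffset("xxx", ZoneOffset.UTC));
        System.out.println(formatOffset("XXX", ZoneOffset.UTC));
        System.out.println(formatOffset("X", ZoneOffset.ofHoursMinutes(8, 30)));
        // Both inputs match the same pattern because the time part is optional.
        System.out.println(hasTime("2001-07-04"));
        System.out.println(hasTime("2001-07-04 12:08"));
    }
}
```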
Missing value pattern
Enter a missing value pattern that is applied to all included columns. Two special strings are not treated as patterns:
  • <none>: no pattern (default)
  • <empty>: for the empty string
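The handling of the two special strings can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the sketch assumes the pattern is compared literally against the cell value, which this description does not confirm.

```java
// Hypothetical sketch of missing value detection; not KNIME code.
class MissingValueSketch {

    // Returns true if the value should be treated as missing under the given setting.
    static boolean isMissing(String value, String setting) {
        if ("<none>".equals(setting)) {
            return false; // no pattern configured, nothing is treated as missing
        }
        // "<empty>" stands for the empty string; anything else is used as entered.
        String pattern = "<empty>".equals(setting) ? "" : setting;
        return value.equals(pattern); // assumption: literal comparison
    }

    public static void main(String[] args) {
        System.out.println(isMissing("", "<empty>"));
        System.out.println(isMissing("NA", "NA"));
        System.out.println(isMissing("NA", "<none>"));
    }
}
```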
Quickscan
Speeds up execution by determining the most specific type from the first N rows only. Note: with quickscan enabled, this node may fail during execution if later rows contradict the inferred type.
Number of rows to consider
Number of initial rows used when quickscan is enabled.
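The quickscan trade-off can be illustrated with a short sketch: a type inferred from the first N rows may not hold for later rows, and the conversion then fails. The class and method names are hypothetical, and only the integer check is shown.

```java
import java.util.List;

// Hypothetical sketch of the quickscan risk; not KNIME code.
class QuickscanSketch {

    static boolean isInteger(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Looks only at the first n rows when deciding whether the column is integer.
    static boolean firstRowsAreIntegers(List<String> column, int n) {
        return column.stream().limit(n).allMatch(QuickscanSketch::isInteger);
    }

    // Converts every row; throws NumberFormatException on a non-integer value.
    static int[] castToInt(List<String> column) {
        return column.stream().mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        List<String> column = List.of("1", "2", "x");
        // The first two rows suggest integer, but the third row contradicts that.
        System.out.println(firstRowsAreIntegers(column, 2));
        System.out.println(firstRowsAreIntegers(column, 3));
        try {
            castToInt(column);
        } catch (NumberFormatException e) {
            System.out.println("conversion failed: " + e.getMessage());
        }
    }
}
```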
Use legacy type names instead of identifiers
Output legacy type names like 'Number (double)' on the second port instead of identifiers like 'org.knime.core.data.def.DoubleCell'. This resembles the old behavior but is discouraged as type names may change in future versions.
Use legacy date&time type
When checked, dates are output with the legacy date and time type (org.knime.core.data.date.DateAndTimeCell); otherwise the successor types (org.knime.core.data.time.*.LocalTimeCell/LocalDateCell/LocalDateTimeCell/ZonedDateTimeCell) are used.
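The four successor types correspond to the java.time classes LocalTime, LocalDate, LocalDateTime, and ZonedDateTime. As a rough illustration of how a pattern determines which of them a parsed value can fill, the sketch below uses DateTimeFormatter.parseBest. How the node itself chooses the cell type is not stated here, so treat this as an assumption; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.util.Locale;

// Hypothetical sketch: which java.time type fits a pattern and input best.
class SuccessorTypeSketch {

    // Tries the richest type first and falls back to less complete ones.
    static String bestType(String pattern, String text) {
        DateTimeFormatter format = DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
        TemporalAccessor best = format.parseBest(text,
                ZonedDateTime::from, LocalDateTime::from, LocalDate::from, LocalTime::from);
        return best.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(bestType("yyyy-MM-dd", "2001-07-04"));
        System.out.println(bestType("HH:mm:ss", "12:08:56"));
        System.out.println(bestType("yyyy-MM-dd HH:mm", "2001-07-04 12:08"));
        System.out.println(bestType("yyyy-MM-dd'T'HH:mm:ss.SSSXXX'['VV']'",
                "2001-07-04T12:08:56.235+02:00[Europe/Berlin]"));
    }
}
```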

Input Ports

Arbitrary input data.

Output Ports

Input data with type-casted columns.
Information about the chosen type casting.

Views

This node has no views.
