Discover multiple categorical columns

This node discovers, in one pass, the numeric columns of a dataset that are likely to be categorical. Anonymized data often contains a large number of numeric columns, some of which are in fact nominal. In this component, you specify the maximum number of distinct values a numeric column may have. If a column has that many distinct values or fewer, it is converted to a string column; otherwise it is left unchanged. The component outputs the possible categorical columns and the rest of the dataframe.
The component uses a 'Python Script' node to perform this function. It requires the 'pandas' library.

Options

Max possible number of distinct values for a numerical column to be a nominal column:
The maximum number of distinct values a numeric column may have to qualify as a string column.
Do not consider columns whose names have the following characters embedded:
If a column name contains this substring, that column is ignored.
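
The two options can be illustrated with a minimal pandas sketch. This is not the script shipped inside the component; the function name `split_categorical_candidates` and the parameters `max_distinct` and `exclude_substring` are hypothetical, and it assumes that ignored columns stay in the "rest" output and that missing values are not counted as a distinct value.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def split_categorical_candidates(df, max_distinct, exclude_substring=""):
    """Split df into (possible categorical columns as strings, remaining columns).

    Hypothetical helper: names and defaults are assumptions, not the component's API.
    """
    candidates = []
    for name in df.columns:
        # Skip columns whose name contains the excluded substring (if one is given).
        if exclude_substring and exclude_substring in str(name):
            continue
        col = df[name]
        # Only numeric columns are considered; booleans are left alone.
        if not is_numeric_dtype(col) or is_bool_dtype(col):
            continue
        # Assumption: NaN does not count as a distinct value.
        if col.nunique(dropna=True) <= max_distinct:
            candidates.append(name)
    categorical = df[candidates].astype("string")
    rest = df.drop(columns=candidates)
    return categorical, rest


# Small made-up table to exercise the function.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5, 6],
        "grade": [1, 2, 1, 2, 1, 2],
        "price": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
        "city": ["a", "a", "b", "b", "c", "c"],
        "code_id": [0, 1, 0, 1, 0, 1],
    }
)
categorical, rest = split_categorical_candidates(df, max_distinct=3, exclude_substring="_id")
print(list(categorical.columns))
print(list(rest.columns))
```

In this made-up table, `grade` has only two distinct values and is returned as a string column, while `code_id` is skipped because its name contains the excluded substring.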

Input Ports

KNIME dataset, for example from a CSV reader.

Output Ports

The numeric columns that are possibly categorical.
The rest of the dataset.
