Optimal Binning

For feature selection and elimination, this component calculates the Information Value (IV) of each variable over its optimal categories (bins). It also calculates the Weight of Evidence (WOE) of the categorized variables.
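As a rough illustration of what WOE and IV measure, the sketch below computes both for a single, already categorized variable. It is not the component's implementation: the function name `woe_iv`, the sign convention WOE = ln(share of good / share of bad), and the 0.5 substitution for empty cells are all assumptions made for this example, and the sample data is invented.

```python
import math
from collections import Counter


def woe_iv(categories, targets, bad_category):
    """Return ({category: WOE}, IV) for one categorized variable.

    Assumed convention: WOE = ln(share of good / share of bad).
    """
    bad = Counter(c for c, t in zip(categories, targets) if t == bad_category)
    good = Counter(c for c, t in zip(categories, targets) if t != bad_category)
    total_bad = sum(bad.values())
    total_good = sum(good.values())
    if total_bad == 0 or total_good == 0:
        raise ValueError("the target needs at least one good and one bad row")

    woe, iv = {}, 0.0
    for cat in sorted(set(categories)):
        # Assumption: replace an empty cell count with 0.5 so the log is defined.
        g = (good[cat] or 0.5) / total_good
        b = (bad[cat] or 0.5) / total_bad
        woe[cat] = math.log(g / b)
        iv += (g - b) * woe[cat]
    return woe, iv


# Invented sample: "low" has 3 good and 1 bad row, "high" has 1 good and 3 bad.
categories = ["low", "low", "low", "low", "high", "high", "high", "high"]
targets = ["good", "good", "good", "bad", "good", "bad", "bad", "bad"]
woe, iv = woe_iv(categories, targets, bad_category="bad")
print(woe, iv)
```

Under this convention a positive WOE marks a category with relatively more good than bad rows, and the IV sums the per-category contributions into one score per variable.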




Step-by-Step Guide:

1- To run this component, first install the Python Integration extension.
2- For better Python node performance, install the pyarrow library.
3- Once pyarrow is installed, select Apache Arrow as the serialization library under Preferences. This option performs considerably better than Flatbuffers Column Serialization.
4- Then specify the desired IV threshold, the target (label) column, and its bad category in the dialog window. The target column must be of string type for this component to run.
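Before switching the serialization library, it can help to confirm that pyarrow is visible to the Python environment the nodes use. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of the component.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


if has_module("pyarrow"):
    print("pyarrow found: Apache Arrow serialization can be selected")
else:
    print("pyarrow missing: install it in this Python environment first")
```

Run it with the same Python executable that is configured for the Python nodes, since a different environment may give a different answer.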





Options

Specify Information Value Threshold
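The IV threshold used to select features; features whose Information Value is over this threshold are reported in a separate output.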
Specify the Target and Bad Category Respectively
The first dropdown selects the target column, and the second selects its bad category.

Input Ports

Raw Data

Output Ports

Binning Results
Information Values For Each Feature
Information Values For Features Over the Threshold
Data For Optimal Binning (Apply)
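
The relationship between the two Information Value outputs can be sketched as a simple filter. This is an illustration only: the function name `split_by_threshold`, the strict `>` comparison, and the feature names and IV numbers are assumptions, not values produced by the component.

```python
def split_by_threshold(information_values, threshold):
    """Return (all features ranked by IV, features whose IV is over the threshold)."""
    ranked = sorted(information_values.items(), key=lambda kv: kv[1], reverse=True)
    # Assumption: "over threshold" is read as strictly greater than the threshold.
    selected = [(name, value) for name, value in ranked if value > threshold]
    return ranked, selected


# Hypothetical IVs for three made-up features.
ivs = {"income": 0.12, "region": 0.01, "age": 0.31}
ranked, selected = split_by_threshold(ivs, threshold=0.1)
print(ranked)
print(selected)
```

Here `ranked` corresponds to the table with one IV per feature, and `selected` to the table restricted to features over the threshold.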
