Kullback–Leibler Divergence

This component computes the Kullback-Leibler divergence, a measure of dissimilarity between the distributions of the same variable in two different datasets.
It is useful for detecting shifts in the data that can lead to model drift and degrade predictive power.
Column names and data types must be identical between the two tables.
Continuous predictors are binned into classes to simplify the computation of the metric.
A value close to 0 means that the two variables have the same distribution; the larger the value, the greater the dissimilarity.
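The computation described above can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the bin count (`n_bins=10`) and the smoothing constant (`eps`) are assumptions, and real implementations may handle empty bins differently.

```python
import numpy as np

def kl_divergence(p_samples, q_samples, n_bins=10, eps=1e-10):
    """Approximate D_KL(P || Q) between two samples of the same
    continuous variable, after binning into discrete classes."""
    # Shared bin edges so both samples are discretized identically
    combined = np.concatenate([p_samples, q_samples])
    edges = np.histogram_bin_edges(combined, bins=n_bins)
    p_counts, _ = np.histogram(p_samples, bins=edges)
    q_counts, _ = np.histogram(q_samples, bins=edges)
    # Normalize to probabilities; eps avoids log(0) and division by zero
    p = p_counts / p_counts.sum() + eps
    q = q_counts / q_counts.sum() + eps
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
a = rng.normal(0, 1, 10_000)
b = rng.normal(0, 1, 10_000)   # same distribution as a
c = rng.normal(2, 1, 10_000)   # shifted distribution

print(kl_divergence(a, b))     # close to 0
print(kl_divergence(a, c))     # clearly larger
```

In a drift-monitoring setting, this would be applied column by column, with the reference dataset as P and the incoming dataset as Q; a growing value for a column signals a distribution shift worth investigating.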

Input Ports

Input table 1
Input table 2

Output Ports

Kullback-Leibler measure for each input column
