Entropy Uncertainty Scorer

Calculates the entropy uncertainty score of a class probability distribution. The input consists of rows containing class probabilities P = (p_1, p_2, ..., p_n) that must sum to 1. The output is the normalized Shannon entropy, defined as E(P) = H(P) / log(n) with H(P) = -sum(p_i * log(p_i)) for i = 1, ..., n, where the logarithm is taken to base 2. The normalization guarantees values between 0 and 1. A uniform probability distribution (the most uncertain case, as all probabilities are equal) has an entropy value of 1. If one class probability is 1 and all others are 0, certainty is highest and the entropy value is 0.
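As a minimal sketch of this score in Python (the function name normalized_entropy is hypothetical, not part of the node):

```python
import math

def normalized_entropy(probs):
    """Normalized Shannon entropy E(P) = H(P) / log2(n), always in [0, 1]."""
    # Terms with p = 0 contribute nothing (the limit of p * log2(p) is 0).
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(len(probs))

print(normalized_entropy([0.25, 0.25, 0.25, 0.25]))  # uniform -> 1.0
print(normalized_entropy([1.0, 0.0, 0.0]))           # one-hot -> 0.0
```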

Options

Column Selection
Include the columns containing the class probabilities. The values must sum to 1 in each data row.
Output column name
Set the name of the appended output column.
Invalid Input Handling
Specify the action to take if an input data row is invalid. A row is invalid if it contains a missing value or an improper distribution (the probabilities must sum to 1). If Fail is selected, the node fails; otherwise it only issues a warning and outputs missing values for the affected rows.
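To illustrate the two handling modes, a per-row sketch (fail_on_invalid and the tolerance tol are assumed names, not taken from the node dialog):

```python
import math

def score_row(probs, fail_on_invalid=True, tol=1e-6):
    """Return the uncertainty score, or None (missing value) for an invalid row."""
    # A row is invalid if it has a missing value or does not sum to 1.
    if any(p is None for p in probs) or abs(sum(probs) - 1.0) > tol:
        if fail_on_invalid:
            raise ValueError(f"invalid probability distribution: {probs}")
        return None  # the node would warn and emit a missing value here
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(len(probs))
```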

Input Ports

Table with two or more columns containing class probabilities that sum to 1.

Output Ports

Input data with an appended column that contains the uncertainty score.
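A vectorized sketch of the resulting table using pandas/NumPy (the column names here are hypothetical examples; the real output column name is set in the dialog):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"P (cat)": [0.5, 1.0, 0.2], "P (dog)": [0.5, 0.0, 0.8]})
probs = df[["P (cat)", "P (dog)"]].to_numpy()

# Per-row normalized Shannon entropy, treating 0 * log2(0) as 0.
with np.errstate(divide="ignore", invalid="ignore"):
    terms = np.where(probs > 0, probs * np.log2(probs), 0.0)
df["Uncertainty score"] = -terms.sum(axis=1) / np.log2(probs.shape[1])

print(df)  # input columns plus the appended score column
```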

Views

This node has no views
