Binary Scorer

Takes an input table containing a binary activity column (experimental/ground truth) and one or more prediction columns. For multi-class classification, use the KNIME Scorer node instead. This node was developed for binary classification, so you must specify the values that represent active (positive) and inactive (negative). Values can also be specified for equivocal and out of domain, regardless of whether they are present in the prediction column.

Missing values are handled as follows. Rows with a missing activity value are ignored completely, regardless of the "Missing as out of domain" setting. When "Missing as out of domain" is selected, a missing prediction value increments the out of domain count, provided the activity value is present.

Values that are not mapped to active, inactive, equivocal or out of domain are treated as errors and do not contribute to the metric calculation.

Target values that do not match the active or inactive value specified are not included in the calculation.
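The row-handling rules above can be sketched in Python. This is a minimal illustration and not the node's source code: the function name `tally`, the counter keys and the default strings are hypothetical, and the treatment of a missing prediction when "Missing as out of domain" is off (counted separately and otherwise skipped) is an assumption, since the description does not state it.

```python
from collections import Counter

def tally(rows, active="active", inactive="inactive",
          equivocal="equivocal", out_of_domain="out of domain",
          missing_as_ood=False):
    """Count outcomes for (activity, prediction) pairs; None marks a missing cell."""
    counts = Counter()
    for activity, prediction in rows:
        if activity is None:
            continue  # missing activity is ignored regardless of the option
        if activity not in (active, inactive):
            counts["excluded_target"] += 1  # target matches neither class
            continue
        if prediction is None:
            # Assumption: with the option off, the row is simply skipped.
            counts["OOD" if missing_as_ood else "skipped_missing"] += 1
        elif prediction == equivocal:
            counts["equivocal"] += 1
        elif prediction == out_of_domain:
            counts["OOD"] += 1
        elif prediction == active:
            counts["TP" if activity == active else "FP"] += 1
        elif prediction == inactive:
            counts["TN" if activity == inactive else "FN"] += 1
        else:
            counts["error"] += 1  # unmapped prediction value
    return counts
```

Only the TP, FP, TN and FN counts feed the performance metrics; the other counters are reported or used for coverage.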



Calculates:

Balanced accuracy: (Sensitivity + Specificity) / 2

Accuracy: (TP + TN) / (TP + TN + FP + FN)

Sensitivity: TP / (TP + FN)

Specificity: TN / (TN + FP)

Precision aka Positive Predictivity (PPV): TP / (TP + FP)

Negative predictivity (NPV): TN / (TN + FN)

Recall: TP / (TP + FN)

F-Measure: 2 * ((precision * recall) / (precision + recall))

MCC (Matthews correlation coefficient, equivalent to Karl Pearson's phi coefficient): (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

Youden's J Statistic: sensitivity + specificity - 1

Balanced PPV: sensitivity / (sensitivity + 1 - specificity)

Balanced NPV: specificity / (specificity + 1 - sensitivity)

Coverage: (total - out of domain) / total. The total includes equivocals.

Also outputs the counts for TP, FP, TN and FN, the number of equivocals, the number out of domain, and the coverage (% not out of domain).



Note that the number of equivocals and the number out of domain do not affect the Cooper statistics (sensitivity, specificity, etc.).
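As an illustration of the formulas listed under "Calculates", the following Python sketch derives the metrics from the four confusion-matrix counts. It is not the node's implementation: the function name `binary_metrics`, the dictionary keys and the choice to return NaN for a zero denominator are assumptions, and coverage is computed as the fraction not out of domain, with equivocals included in the total.

```python
import math

def binary_metrics(tp, fp, tn, fn, equivocal=0, out_of_domain=0):
    """Compute the listed metrics from counts; undefined ratios become NaN."""
    def ratio(num, den):
        return num / den if den else float("nan")

    sens = ratio(tp, tp + fn)   # sensitivity, identical to recall
    spec = ratio(tn, tn + fp)
    ppv = ratio(tp, tp + fp)    # precision
    npv = ratio(tn, tn + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    total = tp + fp + tn + fn + equivocal + out_of_domain
    return {
        "balanced_accuracy": (sens + spec) / 2,
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sens,
        "specificity": spec,
        "precision": ppv,
        "npv": npv,
        "f_measure": ratio(2 * ppv * sens, ppv + sens),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
        "youden_j": sens + spec - 1,
        "balanced_ppv": ratio(sens, sens + 1 - spec),
        "balanced_npv": ratio(spec, spec + 1 - sens),
        # Assumption: coverage is the fraction of rows not out of domain.
        "coverage": ratio(total - out_of_domain, total),
    }

# Equivocal and out-of-domain counts change coverage only.
m = binary_metrics(tp=40, fp=10, tn=45, fn=5, equivocal=5, out_of_domain=20)
```

Because equivocal and out-of-domain rows never enter the TP, FP, TN or FN counts, changing them leaves every metric except coverage unchanged.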

Options

Active (positive) string
String value that represents active
Inactive (negative) string
String value that represents inactive
Equivocal string
String value that represents an equivocal result (prediction only)
Out of domain string
String value that represents a compound which is out of the model's domain
Activity column
True/experimental activity
Predictions
Prediction columns to score. The output contains one row for each selected column.
Missing as out of domain
Select true to increment the out of domain count when a row has a missing cell in the prediction column. If the activity cell is also missing, the out of domain count is not incremented, because rows with target column errors are excluded from the calculation.

Input Ports

Table with a target column (must be binary) and at least one prediction column

Output Ports

Performance metrics for the selected prediction columns. Each RowID is the header of the corresponding prediction column.
Rows containing at least one erroneous value in the target or prediction columns.

Views

This node has no views
