ROC Curve (JavaScript)

This node draws ROC curves for two-class classification problems. The input table must contain a column with the real class values (including all class values as possible values) and a second column with the probabilities that an item (=row) will be classified as being from the selected class. Therefore only learners/predictors that output class probabilities can be used.

In order to create a ROC curve for a model, the input table is first sorted by the class probabilities for the positive class, i.e. rows for which the model is most certain that they belong to the positive class are sorted to the front. Each sorted row is then checked to see whether its real class value is actually the positive class. If so, the ROC curve goes up one step; if not, it goes one step to the right. Ideally, all positive rows are sorted to the front, so the curve goes straight up to 100% first and then straight to the right. As a rule of thumb, the greater the area under the curve, the better the model.
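The procedure described above can be sketched in a few lines of Python. This is only an illustration of the sort-and-step idea, not the node's actual implementation: the function names `roc_points` and `area_under_curve` are made up for this example, ties between equal probabilities are not treated specially, and the area is computed with the trapezoidal rule as an assumption.

```python
def roc_points(labels, scores, positive):
    """Return the ROC curve as a list of (false positive rate, true positive rate)."""
    # Sort rows so that the highest positive-class probabilities come first.
    rows = sorted(zip(scores, labels), key=lambda r: r[0], reverse=True)
    n_pos = sum(1 for _, label in rows if label == positive)
    n_neg = len(rows) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative row")

    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in rows:
        if label == positive:
            tp += 1  # real class is positive: one step up
        else:
            fp += 1  # real class is negative: one step to the right
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Area under the curve via the trapezoidal rule (an assumption of this sketch)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical example data: real class values and predicted probabilities.
labels = ["yes", "yes", "no", "yes", "no"]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
points = roc_points(labels, scores, positive="yes")
auc = area_under_curve(points)
```

If every positive row is ranked ahead of every negative row, the points run straight up and then straight to the right, and the area is 1.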

You may compare the ROC curves of several trained models by first joining the class probability columns from the different predictors into one table and then selecting several columns in the column filter panel.
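As a rough illustration of such a comparison, the sketch below takes one joined table with a class column and one probability column per model and computes an area under the curve for each. All names here (`auc_by_pairs`, `compare_models`, the column names `P_model_a` and `P_model_b`, and the data) are hypothetical. The area is computed as the fraction of positive/negative row pairs that the model ranks correctly, which is a standard equivalent formulation and not necessarily how the node computes it.

```python
def auc_by_pairs(labels, scores, positive):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for label, s in zip(labels, scores) if label == positive]
    neg = [s for label, s in zip(labels, scores) if label != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative row")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def compare_models(table, class_column, positive, prob_columns):
    """Return a mapping from probability column name to its AUC."""
    labels = [row[class_column] for row in table]
    return {
        col: auc_by_pairs(labels, [row[col] for row in table], positive)
        for col in prob_columns
    }


# Hypothetical joined table: one probability column per trained model.
table = [
    {"class": "yes", "P_model_a": 0.9, "P_model_b": 0.60},
    {"class": "yes", "P_model_a": 0.8, "P_model_b": 0.40},
    {"class": "no",  "P_model_a": 0.7, "P_model_b": 0.50},
    {"class": "yes", "P_model_a": 0.6, "P_model_b": 0.45},
    {"class": "no",  "P_model_a": 0.2, "P_model_b": 0.70},
]
areas = compare_models(table, "class", "yes", ["P_model_a", "P_model_b"])
```

In this made-up data the first model ranks most positive rows ahead of the negative ones, so it gets the larger area.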

The black diagonal line in the diagram is the random line. It corresponds to random guessing (an area under the curve of 0.5); a model whose curve lies on this line performs no better than chance.

Additionally a static SVG image can be rendered, which is then made available at the first output port.

Note, this node is currently under development. Future versions of the node might have more or changed functionality.

The node supports custom CSS styling. You can simply put CSS rules into a single string and set it as a flow variable 'customCSS' in the node configuration dialog. You will find the list of available classes and their description on our documentation page.

Options

ROC Curve Settings

Class column
Select the column that contains the two classes that the model was trained on.
Positive class value
Select the value from the class column that stands for the "positive" class, i.e. the value to which high probabilities in the probability columns (see below) refer.
Limit data points for each curve to
By default, each curve shows at most 2,000 different data points regardless of how many rows are in the input. If you want to see more or fewer points in the curve, adjust this value. Lower values make rendering the curves faster, but this is only an issue if you have many different curves. A value of -1 disables the limit and shows all input data points.
Columns containing the positive class probabilities
Select the column(s) that contain the probabilities for a row being from the positive class.
Ignore missing values
If checked, the missing values in Class or Positive Class Probabilities columns will be ignored without a corresponding warning message. Otherwise, missing values in the Class Column will be treated as incorrect predictions; missing values in the Positive Class Probabilities columns will be sorted to the end (low probability) of the curves. A corresponding warning message will be raised.
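To give a feel for what the "Limit data points for each curve to" option does, here is a minimal sketch of one possible way to thin out a curve. The node's real reduction strategy is not documented here, so picking evenly spaced points while keeping the first and last one is purely an assumption, and `limit_points` is a hypothetical helper name.

```python
def limit_points(points, max_points=2000):
    """Keep at most max_points evenly spaced curve points; -1 disables the limit."""
    if max_points == -1 or len(points) <= max_points:
        return list(points)
    if max_points < 2:
        raise ValueError("max_points must be -1 or at least 2")
    last = len(points) - 1
    # Evenly spaced indices from the first to the last point (assumed strategy).
    indices = sorted({round(i * last / (max_points - 1)) for i in range(max_points)})
    return [points[i] for i in indices]


# Hypothetical curve with 10 points, reduced to 4.
curve = [(i / 9, i / 9) for i in range(10)]
reduced = limit_points(curve, max_points=4)
unlimited = limit_points(curve, max_points=-1)
```

The first and last points are always kept in this sketch, so the reduced curve still starts at the origin and ends at the top right corner.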

General Plot Options

Create image at outport
If checked, an image is rendered during execution and made available at the upper output port. Disable this option if the image is not needed or its creation is too time-consuming.
Chart title
The title shown above the chart.
Chart subtitle
The subtitle shown below the chart title.
Width of image (in px)
The width of the generated SVG image.
Height of image (in px)
The height of the generated SVG image.
Line width (in px)
The width of the shown lines.
Show area under curve
If set, the plot displays the size of the area under each curve.
Resize view to fill window
Setting this option resizes the view to the available area of the window. If disabled the view size is static according to the set width and height.
Display full screen button
Displays a button enabling full screen mode.
Background color
The color of the background of the image.
Data area color
The background color of the data area, within the axes.
Show grid
If checked, an additional grid is rendered at the axis tick positions.
Grid color
The color of the grid.
Show warnings in view
If checked, warning messages will be displayed in the view when they occur.

Axis Configuration

Label for x axis
The text shown under the x axis.
Label for y axis
The text shown next to the y axis.
Show color legend
Whether to show a legend explaining the meaning of the lines in the plot.

View Controls

Enable view edit controls
If checked, the user can modify view parameters directly.
Enable title edit controls
If checked, the user can edit the view title in the view.
Enable subtitle edit controls
If checked, the user can edit the view subtitle in the view.
Enable label edit for x axis
If checked, the user can edit the label of the x axis in the view.
Enable label edit for y axis
If checked, the user can edit the label of the y axis in the view.

Input Ports

Data table with data to display.
A table with one column that contains column names as rows. The color of each row is used in the plot. If this port is not connected, default colors are used.

Output Ports

SVG image rendered by the JavaScript implementation of the ROC curve.
The areas under the ROC curves.

Views

Interactive View: ROC Curve
Displays a ROC curve visualization of the input data.
