Model Monitor View (Compare)

This Component generates a view comparing the performance of two models captured with Integrated Deployment. The Component displays the performance of the new model starting from the date chosen by the user in the configuration dialog, while the performance of the original model is displayed for all dates in the input data. In a deployment scenario, this Component compares the performance of the previously deployed model with the recently retrained model, given the chosen evaluation metric.

The view works for machine learning classifiers with binary as well as multiclass targets. The Component requires deployment data with a timestamp (date) column and a target column in order to show performance over time.

In the generated Interactive View, the performance metric is plotted over time, and a trend line is fitted to the performance of each model. The view also provides a "Deploy" checkbox. Based on the model performance, the user can decide whether the model provided at the second input port should be deployed. This deployment decision is passed to the output of the Component via a flow variable. Connect the flow variable output to the workflow branch that deploys the model; that branch should execute only if the user checked the box in the view and applied the settings ("Apply & Close", lower right corner).

CAPTURED MODEL REQUIREMENTS (Top and Middle Ports)
We recommend using the "AutoML" Components with this Component. All you need to do is connect the two Components via the black Integrated Deployment port.

You can also monitor custom-trained models with this Component. When providing models not trained by the "AutoML" Components, you need to satisfy the black-box requirements below:

- The models should be captured with Integrated Deployment and have a single input and single output of type Data.

- All feature columns have to be provided at the model input.

- Additional columns that are not features can also be provided at the model input.

- The model output should retain all the model input data (features and non-features), with the prediction columns appended.

- The model output predictions should consist of one String column and n Double columns, where n is the number of classes in the target column.

- The String type prediction column should be named "Prediction ([T])", where [T] is the name of your target column (e.g. "Prediction (Churn)").

- The Double type prediction columns should be named "P ([T]=[C1])", "P ([T]=[C2])", …, "P ([T]=[Cn])", where each [Ci] is the name of the class whose probability is predicted (e.g. "P (Churn=not churned)" and "P (Churn=churned)" in the binary case).

Additionally, if you are not using the AutoML Component, you need to provide a flow variable named "target_column" of type String, containing the name of your ground truth/target column, at the model ports of the "Model Monitor View (Compare)" Component.
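For illustration, below is a minimal pandas sketch of a model output table that satisfies these naming rules for a hypothetical binary target column named "Churn" (all column names and values are made up for the example; they are not produced by the Component):

```python
import pandas as pd

# Hypothetical model input: one non-feature column plus one feature column.
model_input = pd.DataFrame({
    "CustomerID": [1, 2],            # non-feature column, passed through unchanged
    "MonthlyCharges": [70.5, 20.0],  # feature column
})

# Compliant model output: all input columns are retained and the
# prediction columns are appended with the required names.
model_output = model_input.copy()
model_output["Prediction (Churn)"] = ["churned", "not churned"]  # String prediction
model_output["P (Churn=churned)"] = [0.83, 0.10]                 # Double probability
model_output["P (Churn=not churned)"] = [0.17, 0.90]             # Double probability
```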


INPUT DEPLOYMENT TABLE REQUIREMENTS (Bottom Port)
- All feature columns that were used in the training of the captured models.
- A target column and a timestamp column. Each record's timestamp tracks the date on which the currently deployed model (first input) was applied to that data row. The timestamp should be of a "Date&Time" column type; "Time" and "String" types are not supported (use the "String to Date&Time" node to convert). The timestamps should be roughly uniformly distributed across the sample: the time gaps between consecutive records should be roughly constant.
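If your timestamps arrive as strings, the "String to Date&Time" node performs the conversion inside KNIME. For readers more comfortable with code, the pandas snippet below sketches the equivalent transformation (the column name and date format are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({"timestamp": ["2023-01-05", "2023-01-12", "2023-01-19"]})

# Convert the String column to a proper datetime type, analogous to
# what the "String to Date&Time" node does inside KNIME.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d")

print(df.dtypes)  # the timestamp column is now datetime64[ns]
```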

Options

Timestamp Column
Select the column containing your timestamps.
Select Start Date for Performance Comparison
The date from which the performance of the second, newer model is displayed. The newer model is usually the old one retrained on more deployment data, so this date should correspond to the time split between the enlarged training set and the new test set.
Records per Scoring Window
Configure the number of records in each scoring window used to compute the points of the performance plot (see the sketch after these options).
Metric for Model Monitoring
Choose the metric used for performance evaluation and monitoring.
Select Positive Class
Select the target class used to compute the performance of the classification model. In the case of binary classification, we suggest using the positive class.
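To clarify how the scoring windows relate to the plot, the sketch below computes one accuracy value per consecutive window of records, which is conceptually what each plotted point represents (column names, window size, and the choice of accuracy are assumptions for the example; the Component's internal implementation may differ):

```python
import pandas as pd

def windowed_accuracy(df: pd.DataFrame, window_size: int = 100) -> pd.DataFrame:
    """Compute one accuracy value per consecutive window of `window_size` records."""
    df = df.sort_values("timestamp").reset_index(drop=True)
    rows = []
    for start in range(0, len(df), window_size):
        window = df.iloc[start:start + window_size]
        rows.append({
            "window_end": window["timestamp"].max(),
            "accuracy": (window["Churn"] == window["Prediction (Churn)"]).mean(),
        })
    return pd.DataFrame(rows)
```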

Input Ports

Top Port
The currently deployed model, captured with Integrated Deployment.
Middle Port
The new model, also captured with Integrated Deployment, to be compared with the original and optionally deployed.
Bottom Port
Deployment data with target/ground truth and timestamp:
- A table of instances gathered after deployment, on which the currently deployed model (first input) was previously applied.
- The ground truth (also called target) column, collected afterwards, must also be available.
- The timestamp must already be converted to a "Date&Time" column type (or similar, as long as it carries a date); String column types are not supported.

Output Ports

Flow variable to activate downstream workflow branch for deployment update.
