This Component generates a view monitoring the performance of a model captured by Integrated Deployment. The view works for machine learning classifiers for binary as well as multiclass targets. The Component requires the deployment data with timestamps (dates) and target columns in order to showcase the performance over time.
In the generated Interactive View, the performance metric is plotted against the time axis, and a trend line is fitted to this performance. The view also provides a “Retrain” button, so the user can decide from the model performance whether retraining is necessary. The retraining decision is passed to the output of the component as a flow variable. Connect the flow variable output to the workflow branch that retrains the model. This branch should execute only if the user selected “Retrain” in the view.
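The idea of a metric plotted over time with a trend line can be sketched outside KNIME as follows. This is only an illustration under stated assumptions: accuracy stands in for the performance metric, the trend is an ordinary least-squares line, and the function names `accuracy_by_date` and `trend_slope` are hypothetical. The component's actual metric and trend computation may differ.

```python
from collections import defaultdict
from datetime import date


def accuracy_by_date(records):
    """Return sorted (date, accuracy) pairs.

    records: iterable of (date, actual_class, predicted_class) tuples.
    Accuracy is an assumed stand-in for the component's metric.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for day, actual, predicted in records:
        totals[day] += 1
        hits[day] += int(actual == predicted)
    return sorted((day, hits[day] / totals[day]) for day in totals)


def trend_slope(points):
    """Least-squares slope of the metric per day elapsed since the first date."""
    xs = [(day - points[0][0]).days for day, _ in points]
    ys = [value for _, value in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # a single date carries no trend information
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up deployment records: (timestamp, ground truth, prediction).
sample_records = [
    (date(2023, 1, 1), "churned", "churned"),
    (date(2023, 1, 1), "not churned", "not churned"),
    (date(2023, 1, 2), "churned", "churned"),
    (date(2023, 1, 2), "churned", "not churned"),
    (date(2023, 1, 3), "churned", "not churned"),
    (date(2023, 1, 3), "not churned", "churned"),
]

daily = accuracy_by_date(sample_records)
slope = trend_slope(daily)
```

A clearly negative slope, as in this made-up sample, is the kind of signal that might prompt a user to request retraining.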
CAPTURED MODEL REQUIREMENTS (Top Port)
We recommend using the “AutoML” component with this component. All you need to do is connect the two components via the black Integrated Deployment port.
You can also monitor a custom-trained model with this component. When providing a model not trained by the “AutoML” component, you need to satisfy the black-box requirements below:
- The model should be captured with Integrated Deployment and have a single input and single output of type Data.
- All feature columns have to be provided at the model input.
- Additional non-feature columns may also be provided at the model input.
- The model output should contain all the model input data (features and non-features) with the prediction columns appended.
- The model output predictions should consist of one String column and “n” Double columns, where “n” is the number of classes in the target column.
- The String prediction column should be named “Prediction ([T])”, where [T] is the name of your target column (e.g. “Prediction (Churn)”).
- The Double prediction columns should be named “P ([T]=[C1])”, “P ([T]=[C2])”, …, “P ([T]=[Cn])”, where [Cn] is the name of the class whose probability that column predicts (e.g. “P (Churn=not churned)” and “P (Churn=churned)” in the binary case).
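The naming convention in the list above can be checked before a custom model is connected. The sketch below is not part of the component. It follows the example names given above (such as “Prediction (Churn)” and “P (Churn=churned)”), and the helper names `expected_prediction_columns` and `missing_prediction_columns` are hypothetical.

```python
def expected_prediction_columns(target, classes):
    """Build the prediction column names the view expects.

    target: name of the target column, e.g. "Churn".
    classes: class labels in the target column.
    """
    names = [f"Prediction ({target})"]                     # one String column
    names += [f"P ({target}={label})" for label in classes]  # n Double columns
    return names


def missing_prediction_columns(output_columns, target, classes):
    """Return the expected prediction columns absent from the model output."""
    present = set(output_columns)
    return [name for name in expected_prediction_columns(target, classes)
            if name not in present]


# Made-up model output columns for a binary churn model.
output_columns = [
    "Age", "Plan",                 # input columns passed through
    "Prediction (Churn)",
    "P (Churn=churned)",
]

missing = missing_prediction_columns(
    output_columns, "Churn", ["not churned", "churned"]
)
```

Here `missing` lists the one probability column the made-up output lacks; an empty list would mean the naming requirement is met. The check covers names only, not column types.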
Additionally, if you are not using the “AutoML” component, you need to provide a flow variable of type String called “target_column”, holding the name of your ground truth/target column, at the top input of the “Model Monitor View” component.
INPUT DEPLOYMENT TABLE REQUIREMENTS (Bottom Port)
- All feature columns that were used in the training of the captured model.
- The target column and a timestamp column must be available. Each record's timestamp is the date on which the model was applied to that data row. The timestamp column must be of type “Date&Time”; “Time” and “String” types are not supported. Use the “String to Date&Time” node to convert String timestamps. The timestamps should be spread evenly across the sample: the gaps between consecutive dates that have samples should be roughly constant.
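One way to get a feel for the spacing requirement is to inspect the gaps between consecutive distinct dates. The sketch below is an assumption-laden illustration, not the component's own check: the function names and the 50% `tolerance` threshold are hypothetical choices.

```python
from datetime import date


def date_gaps(timestamps):
    """Gaps in days between consecutive distinct dates, in sorted order."""
    days = sorted(set(timestamps))
    return [(later - earlier).days for earlier, later in zip(days, days[1:])]


def gaps_roughly_constant(timestamps, tolerance=0.5):
    """True if every gap lies within `tolerance` (as a fraction) of the mean gap.

    The tolerance value is an arbitrary illustrative threshold.
    """
    gaps = date_gaps(timestamps)
    if len(gaps) < 2:
        return True  # zero or one gap cannot be irregular
    mean_gap = sum(gaps) / len(gaps)
    return all(abs(gap - mean_gap) <= tolerance * mean_gap for gap in gaps)


# Made-up timestamp columns: weekly scoring vs. a long interruption.
weekly = [date(2023, 1, 1), date(2023, 1, 8), date(2023, 1, 15), date(2023, 1, 22)]
interrupted = [date(2023, 1, 1), date(2023, 1, 2), date(2023, 3, 1)]

weekly_ok = gaps_roughly_constant(weekly)
interrupted_ok = gaps_roughly_constant(interrupted)
```

With these made-up dates the weekly series passes and the interrupted one does not, which is the situation the requirement warns about.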