
02 Application test - Prediction workflow output - exercise


Step 2. Validate the prediction workflow output

  1. Add the Table Validator node to the output of the prediction workflow and create three groups of columns*:

    • Prediction columns

      • Add the columns: P (SeriousDlqin2yrs=0), P (SeriousDlqin2yrs=1), Prediction (SeriousDlqin2yrs)

      • Should fail if there are missing values, if the columns are missing, if the data types are different, and if the values (both string and numeric) are out of domain

    • Identifying columns

      • Add the columns: Column0

      • Should fail if there are missing values, if the columns are missing, and if the data types are different.

    • Feature columns

      • Add all the other columns

      • Should fail only if the columns are missing and if the data types are different


* To add a new group, double-click any column in the column list that should belong to it. To add a column to an existing group, drag and drop it from the column list onto the group.
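The three column groups above can also be read as a small rule table. The following is a minimal Python sketch of that idea using pandas, not a description of how the Table Validator node works internally. The `ColumnGroup` class, the `validate_groups` function, the feature column `age`, and the expected data types are all assumptions made for illustration; the domain check for the prediction columns is left out here to keep the sketch short.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class ColumnGroup:
    """Columns that share the same validation rules (hypothetical helper)."""

    name: str
    columns: dict  # column name -> expected pandas dtype, as a string
    fail_on_missing_values: bool = False


def validate_groups(df, groups):
    """Return a list of violations; an empty list means the table passed."""
    violations = []
    for group in groups:
        for column, expected_dtype in group.columns.items():
            # Every group fails if a column is missing.
            if column not in df.columns:
                violations.append(f"[{group.name}] missing column: {column}")
                continue
            # Every group fails if the data type differs.
            actual_dtype = str(df[column].dtype)
            if actual_dtype != expected_dtype:
                violations.append(
                    f"[{group.name}] {column}: expected {expected_dtype}, got {actual_dtype}"
                )
            # Only the stricter groups fail on missing values.
            if group.fail_on_missing_values and df[column].isna().any():
                violations.append(f"[{group.name}] {column}: contains missing values")
    return violations


groups = [
    ColumnGroup(
        "Prediction columns",
        {
            "P (SeriousDlqin2yrs=0)": "float64",
            "P (SeriousDlqin2yrs=1)": "float64",
            # Assumption: the predicted class is stored as an integer here.
            "Prediction (SeriousDlqin2yrs)": "int64",
        },
        fail_on_missing_values=True,
    ),
    ColumnGroup("Identifying columns", {"Column0": "int64"}, fail_on_missing_values=True),
    # "age" is a made-up stand-in for "all the other columns".
    ColumnGroup("Feature columns", {"age": "float64"}),
]

output = pd.DataFrame(
    {
        "Column0": [1, 2],
        "age": [45.0, None],  # a missing feature value is tolerated
        "P (SeriousDlqin2yrs=0)": [0.9, 0.2],
        "P (SeriousDlqin2yrs=1)": [0.1, 0.8],
        "Prediction (SeriousDlqin2yrs)": [0, 1],
    }
)

print(validate_groups(output, groups))
```

With this sample table the function reports no violations. Dropping Column0 or putting a missing value into a prediction column would each add one entry to the returned list.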


Step 1. Import the prediction workflow and execute it on the test data

  1. Read the prediction workflow with the Workflow Reader node

    • Use a workflow relative path

    • Check the Remove input and output nodes option so that you can provide input and export output with the Workflow Executor node

  2. Execute the prediction workflow on the test request for prediction using the Workflow Executor node*.

    • Connect the Workflow Object Ports

      • Auto-adjust the ports and click OK

      • Provide the input test request as an input


* Why use the Workflow Executor node rather than the Call Workflow node here? During testing, we want to avoid putting extra load on the prediction workflow, which may be in use by real users. Instead, we read it into this testflow and execute it within this testflow.


Part 1 - Deployment fundamentals

Exercise workflow 02 Application test - Prediction workflow output

Learning objective: In this exercise you'll practice creating an application test that tests the whole prediction workflow.


Workflow description: Application testflow. This workflow tests the prediction workflow's output. If the preprocessing or the model changes, the predictions change as well. However, the output must satisfy the following:

  • The table structure and the column names shouldn't change (so that the prediction workflow can continue to be used by all the users)

  • The domain of the prediction column should stay the same (only 0 and 1 are allowed for binary classification)

  • No missing values are allowed in the identifying and prediction columns

This test is especially important before the deployment but can also be executed regularly when the workflow is already deployed.


You'll find the instructions for the exercise in the yellow annotations.

How do you define the validation rules?

  • Prediction columns: the expectations for the predicted information are very clear, so the validation rules are very strict: each data point should get a prediction, the class can only be 0 or 1, and the probability can only range between 0 and 1.

  • Identifying columns: the full information is absolutely necessary to map a prediction to a data point, hence the no-missing-values rule. At the same time, the domain is not restricted, since new IDs are expected to appear.

  • Feature columns: missing values are expected in the feature columns, and the domain can differ.
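The domain rule for the prediction columns (class only 0 or 1, probability between 0 and 1) can be expressed as a short check. This is a rough sketch in Python with pandas under stated assumptions: the function name `domain_violations`, its parameters, and the sample values are hypothetical, and the predicted class is assumed to be stored as an integer. Missing values are skipped here because they fall under a separate rule.

```python
import pandas as pd


def domain_violations(df, class_column, probability_columns, allowed_classes=(0, 1)):
    """Return a list of domain violations for the prediction columns (hypothetical helper)."""
    violations = []

    # The predicted class may only take one of the allowed values.
    seen_classes = set(df[class_column].dropna().tolist())
    unexpected = seen_classes - set(allowed_classes)
    if unexpected:
        violations.append(f"{class_column}: unexpected classes {sorted(unexpected)}")

    # Each probability must lie in the closed interval [0, 1].
    for column in probability_columns:
        values = df[column].dropna()
        out_of_range = int((~values.between(0.0, 1.0)).sum())
        if out_of_range:
            violations.append(f"{column}: {out_of_range} value(s) outside [0, 1]")

    return violations


predictions = pd.DataFrame(
    {
        "P (SeriousDlqin2yrs=1)": [0.1, 0.8, 1.3],  # 1.3 is deliberately invalid
        "Prediction (SeriousDlqin2yrs)": [0, 1, 2],  # 2 is deliberately invalid
    }
)

for problem in domain_violations(
    predictions,
    class_column="Prediction (SeriousDlqin2yrs)",
    probability_columns=["P (SeriousDlqin2yrs=1)"],
):
    print(problem)
```

The identifying and feature columns are deliberately not passed to this check, since their domains are allowed to change.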

Test request for prediction
Table Reader
