Create Eval

Creates the structure of an evaluation that can be used to test a model's performance. An evaluation consists of a set of testing criteria and the configuration of a data source, which dictates the schema of the data used in the evaluation. After creating an evaluation, you can run it on different models and model parameters. Several types of graders and data sources are supported. For more information, see the Evals guide.
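
To make the two parts of an evaluation concrete, here is a minimal sketch of what a request body for the Body option might look like. The field names (`data_source_config`, `item_schema`, `testing_criteria`, the `string_check` grader and its template placeholders) follow the OpenAI Evals API as far as it is understood here and are assumptions, not something this node description confirms. The eval name and the item fields `review` and `expected_label` are made up for illustration.

```python
import json

# Hypothetical request body for creating an eval.
# Assumption: field names mirror the OpenAI Evals API; verify them against
# the Evals guide before use.
eval_body = {
    "name": "Sentiment check",  # illustrative name
    # The data source config dictates the schema of each test item.
    "data_source_config": {
        "type": "custom",
        "item_schema": {
            "type": "object",
            "properties": {
                "review": {"type": "string"},          # hypothetical field
                "expected_label": {"type": "string"},  # hypothetical field
            },
            "required": ["review", "expected_label"],
        },
        "include_sample_schema": True,
    },
    # The testing criteria are the graders applied to each model output.
    "testing_criteria": [
        {
            "type": "string_check",
            "name": "Label matches",
            "input": "{{ sample.output_text }}",
            "reference": "{{ item.expected_label }}",
            "operation": "eq",
        }
    ],
}

# Serialize to JSON text that could be supplied as the request body.
body_json = json.dumps(eval_body, indent=2)
print(body_json)
```

The dictionary is only serialized here; no request is sent. Keeping the schema and the graders in one structure makes it easy to check that every placeholder used in a grader refers to a field declared in `item_schema`.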

Options

Body
Result Format

Specify how the response should be mapped to the table output. The following formats are available:

Structured Table: Returns a parsed table with data split into rows and columns. The table contains the following columns:

  • Object: The object type.
  • Id: Unique identifier for the evaluation.
  • Name: The name of the evaluation.
  • Data Source Config: Configuration of data sources used in runs of the evaluation.
  • Testing Criteria: A list of testing criteria.
  • Created At: The Unix timestamp (in seconds) for when the eval was created.
  • Metadata:

    A set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and for querying objects via the API or the dashboard.

    Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
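
The metadata limits described above can be checked before a request is sent. The following is a minimal sketch; the helper name `validate_metadata` and the constant names are hypothetical, and only the three limits (16 pairs, 64-character keys, 512-character values) come from the description.

```python
# Limits taken from the Metadata description above.
MAX_PAIRS = 16
MAX_KEY_LEN = 64
MAX_VALUE_LEN = 512


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid.

    Hypothetical helper for illustration, not part of the node or any API.
    """
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        # Keys and values must both be strings within the length limits.
        if not isinstance(key, str) or len(key) > MAX_KEY_LEN:
            problems.append(f"invalid key: {key!r}")
        if not isinstance(value, str) or len(value) > MAX_VALUE_LEN:
            problems.append(f"invalid value for key {key!r}")
    return problems


print(validate_metadata({"team": "nlp", "purpose": "regression"}))
```

Returning a list of problems instead of raising on the first one lets a caller report every violation at once.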

Raw Response: Returns the raw response in a single row with the following columns:

  • body: Response body
  • status: HTTP status code
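
When the Raw Response format is selected, downstream logic has to check the status and parse the body itself. The sketch below shows one way to do that, assuming the body is JSON text whose keys are the snake_case counterparts of the Structured Table columns (`id`, `name`, `created_at`). The sample row, the identifier `eval_example`, and both helper names are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_raw_row(body: str, status: int) -> dict:
    """Parse the body column, but only for successful (2xx) status codes."""
    if not 200 <= status < 300:
        raise ValueError(f"Request failed with HTTP status {status}")
    return json.loads(body)


def created_at_utc(eval_obj: dict) -> datetime:
    """Convert the Unix timestamp (in seconds) to a timezone-aware datetime."""
    return datetime.fromtimestamp(eval_obj["created_at"], tz=timezone.utc)


# Hypothetical sample row; the values are made up for illustration.
sample_body = (
    '{"object": "eval", "id": "eval_example", '
    '"name": "Sentiment check", "created_at": 1700000000}'
)
eval_obj = parse_raw_row(sample_body, 200)
print(eval_obj["id"], created_at_utc(eval_obj).isoformat())
```

Checking the status before parsing keeps error payloads from being mistaken for a created evaluation.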

Input Ports

  • Configuration data.

Output Ports

  • Result of the request, depending on the selected Result Format.
  • Configuration data. This is the same as the input port and is provided as a passthrough so that nodes can be chained sequentially, which declutters your workflow connections.

Views

This node has no views
