
JSONL

Each line is a JSON object representing one EvalCase.
from multivon_eval import load_jsonl

cases = load_jsonl("cases.jsonl")
cases.jsonl
{"input": "What is the capital of France?", "expected_output": "Paris", "tags": ["factual"]}
{"input": "Summarize this article.", "context": "The article discusses...", "tags": ["summarization"]}
{"input": "Is this review positive?", "expected_output": "yes", "metadata": {"source": "amazon"}}
Supported fields:

| Field | Type | Description |
| --- | --- | --- |
| input | string | Required. The prompt sent to the model |
| expected_output | string | For ExactMatch, Contains, BLEU, ROUGE |
| context | string | For Faithfulness, Hallucination, Summarization |
| tags | list[string] | For filtering and grouping in reports |
| metadata | object | Arbitrary key-value data attached to results |
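To check a file before loading it, the format can be sketched with the standard library alone. The snippet below is not how `load_jsonl` is implemented; `parse_jsonl_text` is a hypothetical helper that only illustrates the one-object-per-line layout and the required `input` field described above.

```python
import json


def parse_jsonl_text(text):
    """Parse JSONL text into a list of dicts, one per non-blank line.

    Hypothetical helper for illustration; not part of multivon_eval.
    """
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        # `input` is the only field documented as required.
        if not isinstance(record.get("input"), str):
            raise ValueError(f"line {line_number}: 'input' must be a string")
        records.append(record)
    return records


sample = "\n".join([
    '{"input": "What is the capital of France?", "expected_output": "Paris", "tags": ["factual"]}',
    '{"input": "Summarize this article.", "context": "The article discusses...", "tags": ["summarization"]}',
])

records = parse_jsonl_text(sample)
```

Optional fields are simply absent from the parsed dict when a line omits them, so `records[1].get("expected_output")` returns `None` here.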

CSV

from multivon_eval import load_csv

cases = load_csv("cases.csv")
cases.csv
input,expected_output,context,tags
What is 2+2?,4,,math
Summarize this.,,Long text here...,summarization
Is Paris in France?,yes,,factual geography
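The sample file leaves cells empty for fields that do not apply, and its last row lists two tags in one cell. A rough sketch of how such rows could be read with Python's `csv` module follows. It rests on two assumptions that the page does not confirm: empty cells mean "field not set", and multiple tags are separated by whitespace. `load_csv` itself may behave differently.

```python
import csv
import io

sample = """input,expected_output,context,tags
What is 2+2?,4,,math
Summarize this.,,Long text here...,summarization
Is Paris in France?,yes,,factual geography
"""

rows = []
for row in csv.DictReader(io.StringIO(sample)):
    rows.append({
        "input": row["input"],
        # DictReader returns "" for empty cells; treat that as "not set" (assumption).
        "expected_output": row["expected_output"] or None,
        "context": row["context"] or None,
        # Assumption: several tags in one cell are separated by whitespace.
        "tags": row["tags"].split(),
    })
```

Note that every CSV value arrives as a string, so the `4` in the first row is the string `"4"`, not a number.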

Auto-detect format

from multivon_eval import load

cases = load("cases.jsonl")   # detects from extension
cases = load("cases.csv")
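Extension-based detection usually amounts to a lookup on the file suffix. The sketch below shows that idea with `pathlib`. `detect_format` is a hypothetical name, and the case-insensitive match and the error on unknown extensions are assumptions, not documented behavior of `load`.

```python
from pathlib import Path

# Extensions taken from the examples above; the mapping itself is illustrative.
FORMATS = {".jsonl": "jsonl", ".csv": "csv"}


def detect_format(path):
    """Return "jsonl" or "csv" based on the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"unsupported file extension: {suffix!r}")
    return FORMATS[suffix]
```

If a file has a nonstandard extension, calling `load_jsonl` or `load_csv` directly avoids relying on detection.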

Filtering by tag

from multivon_eval import load

cases = load("cases.jsonl")
factual = [c for c in cases if "factual" in c.tags]

# `suite` is an existing evaluation suite, created elsewhere.
suite.add_cases(factual)
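The same list-comprehension approach extends to matching several tags at once. The sketch below uses a small stand-in dataclass, `Case`, so that it runs without the library; only the `tags` attribute is taken from the example above, and the `or []` guard is a defensive assumption in case a case has no tags.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """Hypothetical stand-in for EvalCase, with only the fields used here."""
    input: str
    tags: list = field(default_factory=list)


def with_any_tag(cases, wanted):
    """Keep cases that carry at least one of the wanted tags."""
    wanted = set(wanted)
    return [c for c in cases if wanted & set(c.tags or [])]


def with_all_tags(cases, wanted):
    """Keep cases that carry every one of the wanted tags."""
    wanted = set(wanted)
    return [c for c in cases if wanted <= set(c.tags or [])]


cases = [
    Case("What is the capital of France?", ["factual", "geography"]),
    Case("What is 2+2?", ["math"]),
    Case("Summarize this.", ["summarization"]),
]

either = with_any_tag(cases, ["factual", "math"])       # first two cases
both = with_all_tags(cases, ["factual", "geography"])   # first case only
```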

Building cases in code

from multivon_eval import EvalCase

cases = [
    EvalCase(
        input="What is the capital of France?",
        expected_output="Paris",
        tags=["factual", "geography"],
        metadata={"difficulty": "easy"},
    ),
    EvalCase(
        input="Explain how transformers work.",
        context="Transformers use self-attention mechanisms...",
        tags=["technical"],
    ),
]
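Cases defined in code can also be kept as a JSONL file in the format described earlier. The sketch below writes plain dicts with the same field names, one JSON object per line, using only the standard library. `to_jsonl` is a hypothetical helper, and whether `EvalCase` offers its own serialization is not covered here.

```python
import json

# Plain dicts mirroring the two cases above.
case_dicts = [
    {
        "input": "What is the capital of France?",
        "expected_output": "Paris",
        "tags": ["factual", "geography"],
        "metadata": {"difficulty": "easy"},
    },
    {
        "input": "Explain how transformers work.",
        "context": "Transformers use self-attention mechanisms...",
        "tags": ["technical"],
    },
]


def to_jsonl(records):
    """Serialize records as JSONL text: one compact JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


jsonl_text = to_jsonl(case_dicts)
```

Writing `jsonl_text` to `cases.jsonl` should produce a file in the one-object-per-line layout that `load_jsonl` expects.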