Overview
The Vision module provides two complementary metrics for evaluating Vision Language Models (VLMs):
- VisionSimilarity: How accurately the VLM describes scenes compared to human ground truth
- VisionHallucination: How often the VLM describes content not present in the scene
Both metrics use a SimilarityScorer (defaulting to cosine similarity with all-mpnet-base-v2).
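The default scoring strategy is described as cosine similarity over all-mpnet-base-v2 embeddings. The following is a minimal sketch of the cosine step only, using small hand-written vectors in place of real sentence embeddings. The function name `cosine_similarity` and the toy vectors are illustrative assumptions, not part of the Gaussia API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (assumption: real embeddings come from the model).
vlm_embedding = [1.0, 2.0, 0.0]
human_embedding = [2.0, 4.0, 0.0]      # same direction as the VLM vector
unrelated_embedding = [0.0, 0.0, 3.0]  # orthogonal to the VLM vector

same_direction = cosine_similarity(vlm_embedding, human_embedding)
orthogonal = cosine_similarity(vlm_embedding, unrelated_embedding)
```

Vectors that point in the same direction score close to 1 regardless of length, while orthogonal vectors score 0, which is why cosine similarity suits comparing descriptions of different lengths.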
VisionSimilarity
Measures semantic similarity between VLM descriptions and human annotations.
Output
| Field | Type | Description |
|---|---|---|
| mean_similarity | float | Average similarity across all frames |
| min_similarity | float | Minimum similarity score |
| max_similarity | float | Maximum similarity score |
| summary | str | Human-readable summary |
| interactions | list[VisionSimilarityInteraction] | Per-frame scores |
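To make the aggregate fields concrete, here is a rough sketch of how mean, minimum and maximum similarity could be derived from a list of per-frame scores. The helper `summarize_similarity` and the sample scores are hypothetical; the library's own aggregation may differ in detail.

```python
from statistics import mean


def summarize_similarity(scores):
    """Aggregate per-frame similarity scores into summary statistics."""
    if not scores:
        raise ValueError("at least one frame score is required")
    return {
        "mean_similarity": mean(scores),
        "min_similarity": min(scores),
        "max_similarity": max(scores),
    }


# Hypothetical per-frame scores for a three-frame clip.
frame_scores = [0.9, 0.6, 0.75]
similarity_stats = summarize_similarity(frame_scores)
```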
VisionHallucination
Flags frames where similarity falls below a threshold as hallucinations.
Output
| Field | Type | Description |
|---|---|---|
| hallucination_rate | float | Fraction of hallucinated frames |
| n_hallucinations | int | Number of hallucinated frames |
| n_frames | int | Total frames evaluated |
| threshold | float | Threshold used |
| summary | str | Human-readable summary |
| interactions | list[VisionHallucinationInteraction] | Per-frame results |
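The thresholding rule can be sketched in a few lines. This example assumes a frame counts as hallucinated when its score is strictly below the threshold, and it uses the documented default of 0.75. The helper `summarize_hallucinations` and the sample scores are illustrative and not taken from the library.

```python
def summarize_hallucinations(scores, threshold=0.75):
    """Flag frames scoring strictly below the threshold and compute the rate."""
    flags = [score < threshold for score in scores]
    n_frames = len(scores)
    n_hallucinations = sum(flags)
    # Assumption for this sketch: an empty input yields a rate of 0.0.
    rate = n_hallucinations / n_frames if n_frames else 0.0
    return {
        "hallucination_rate": rate,
        "n_hallucinations": n_hallucinations,
        "n_frames": n_frames,
        "threshold": threshold,
    }


# Hypothetical per-frame scores; 0.6 and 0.4 fall below 0.75.
hallucination_stats = summarize_hallucinations([0.9, 0.6, 0.75, 0.4])
```

Note that a score exactly equal to the threshold is not flagged under this assumption; whether the library treats the boundary the same way is not stated here.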
Parameters (both metrics)
| Parameter | Type | Default | Description |
|---|---|---|---|
| retriever | type[Retriever] | required | Retriever class |
| scorer | SimilarityScorer | Cosine + mpnet | Similarity scoring strategy |
| threshold | float | 0.75 | Hallucination threshold |
Custom scorer
Expected batch format
Requires the vision extra: pip install "gaussia[vision]".