Welcome to Gaussia
Gaussia is a performance-measurement library developed by Gaussia Labs for evaluating AI models and assistants. It provides comprehensive metrics for fairness, toxicity, bias, conversational quality, and more.
Why Gaussia?
As AI systems become increasingly integrated into our daily lives, ensuring they behave fairly, safely, and effectively is paramount. Gaussia provides:
- Fairness Evaluation: Detect and measure bias across protected attributes
- Toxicity Analysis: Identify toxic language patterns with demographic profiling
- Conversational Quality: Evaluate dialogue using Grice’s Maxims
- Context Awareness: Measure how well responses align with provided context
- Emotional Intelligence: Analyze emotional depth and human-likeness
- Model Comparison: Run tournament-style evaluations between multiple assistants
- Agent Evaluation: Measure agent correctness with pass@K metrics
- Vision Evaluation: Detect VLM hallucinations and measure similarity
- Regulatory Compliance: Evaluate responses against a regulatory corpus
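The pass@K metric mentioned under Agent Evaluation can be made concrete with a short sketch. This page does not say how Gaussia computes it, so the code below is an assumption: it uses the unbiased estimator common in code-generation benchmarks, where n attempts are sampled, c of them are correct, and pass@K is the probability that at least one of K attempts drawn from those n is correct. The function name `pass_at_k` is hypothetical and not part of Gaussia's API.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@K from n sampled attempts, c of which are correct.

    Hypothetical helper, not Gaussia's API. Computes
    1 - C(n - c, k) / C(n, k): one minus the probability that
    all k attempts drawn without replacement are incorrect.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0
    # whenever every possible draw of k must contain a correct attempt.
    return 1.0 - comb(n - c, k) / comb(n, k)


# 10 attempts, 3 correct: chance that a single draw is correct.
single_draw = pass_at_k(n=10, c=3, k=1)
# Same attempts, but success if any of 5 draws is correct.
five_draws = pass_at_k(n=10, c=3, k=5)
```

Drawing more attempts can only raise the estimate, so `five_draws` is larger than `single_draw` for the same n and c.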
Key Features
Multiple Metrics
Nine specialized metrics for comprehensive AI evaluation
Statistical Modes
Choose between Frequentist and Bayesian statistical approaches
Test Generation
Generate synthetic test datasets from your documentation
Streaming Support
Process datasets in full, by session, or by individual QA batch
Quick Example
Architecture Overview
Gaussia follows a simple yet powerful architecture:
All metrics inherit from the Gaussia base class and implement the batch() method to process conversation batches. Users provide data through custom Retriever implementations.
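The pattern described above can be sketched in plain Python without importing Gaussia. Only the `batch()` method name and the Retriever concept come from this page. Everything else is an assumption made for illustration: the `Turn` record, the `MetricBase` and `RetrieverBase` stand-ins, the `load_dataset()` and `run()` methods, and the toy metric. The real classes may have different names and signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One question/answer pair (hypothetical record shape)."""
    question: str
    answer: str


class RetrieverBase(ABC):
    """Hypothetical stand-in for a user-supplied Retriever."""

    @abstractmethod
    def load_dataset(self) -> List[List[Turn]]:
        """Return the data as a list of conversation batches."""


class MetricBase(ABC):
    """Hypothetical stand-in for the base class that metrics inherit from."""

    def __init__(self, retriever: RetrieverBase) -> None:
        self.retriever = retriever

    @abstractmethod
    def batch(self, turns: List[Turn]) -> float:
        """Score a single conversation batch."""

    def run(self) -> List[float]:
        # Pull every batch from the retriever and score it with batch().
        return [self.batch(turns) for turns in self.retriever.load_dataset()]


class InMemoryRetriever(RetrieverBase):
    """Custom retriever that serves batches held in memory."""

    def __init__(self, batches: List[List[Turn]]) -> None:
        self._batches = batches

    def load_dataset(self) -> List[List[Turn]]:
        return self._batches


class MeanAnswerLength(MetricBase):
    """Toy metric: average number of words per answer in a batch."""

    def batch(self, turns: List[Turn]) -> float:
        return sum(len(t.answer.split()) for t in turns) / len(turns)


data = [
    [Turn("Hi?", "Hello there"), Turn("All good?", "Yes it is fine")],
    [Turn("Bye?", "Goodbye")],
]
scores = MeanAnswerLength(InMemoryRetriever(data)).run()
```

Keeping data access in the retriever and scoring in `batch()` means a new data source or a new metric can be swapped in without touching the other side.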
Next Steps
Quickstart
Get started with Gaussia in minutes
Installation
Install Gaussia and dependencies
Core Concepts
Learn the fundamental concepts
Metrics
Explore available metrics