# Gaussia

## Docs

- [Contributing](https://docs.gaussia.ai/development.md): How to contribute to Gaussia, from proposing metrics to building SDKs.
- [Introduction](https://docs.gaussia.ai/index.md): Scientific metrics for intelligent behaviors. If you can't trace it to a paper, it's not a metric; it's an opinion.
- [Explainability](https://docs.gaussia.ai/sdks/python/advanced/explainability.md): Analyze token-level attributions to understand which input tokens drive model outputs
- [Generators](https://docs.gaussia.ai/sdks/python/advanced/generators.md): Generate synthetic evaluation datasets from context documents using LLM-powered generation
- [Prompt optimizer](https://docs.gaussia.ai/sdks/python/advanced/prompt-optimizer.md): Optimize LLM prompts using GEPA and MIPROv2 algorithms
- [Architecture](https://docs.gaussia.ai/sdks/python/concepts/architecture.md): Understanding Gaussia's core architecture and design patterns
- [Datasets and batches](https://docs.gaussia.ai/sdks/python/concepts/datasets.md): Understand the Dataset and Batch data models that structure conversation data for evaluation
- [LLM judge](https://docs.gaussia.ai/sdks/python/concepts/llm-judge.md): Use any LangChain-compatible model as an evaluation judge for metric scoring
- [Retriever](https://docs.gaussia.ai/sdks/python/concepts/retriever.md): Implement custom data retrievers to load conversation data from any source
- [Statistical modes](https://docs.gaussia.ai/sdks/python/concepts/statistical-modes.md): Choose between frequentist point estimates and Bayesian credible intervals for metric aggregation
- [Introduction](https://docs.gaussia.ai/sdks/python/index.md): Gaussia is a comprehensive performance-measurement library for evaluating AI models and assistants
- [Installation](https://docs.gaussia.ai/sdks/python/installation.md): Install Gaussia and its dependencies
- [Agentic](https://docs.gaussia.ai/sdks/python/metrics/agentic.md): Evaluate AI agent responses with pass@K metrics, tool correctness, and pluggable statistical modes
- [BestOf](https://docs.gaussia.ai/sdks/python/metrics/best-of.md): Tournament-style comparison of multiple AI assistants using king-of-the-hill evaluation
- [Bias](https://docs.gaussia.ai/sdks/python/metrics/bias.md): Detect bias across protected attributes using guardian-based analysis
- [Context](https://docs.gaussia.ai/sdks/python/metrics/context.md): Evaluate how well AI responses align with provided context, with session-level aggregation and pluggable statistical modes
- [Conversational](https://docs.gaussia.ai/sdks/python/metrics/conversational.md): Evaluate dialogue quality using Grice's Maxims with session-level aggregation and pluggable statistical modes
- [Humanity](https://docs.gaussia.ai/sdks/python/metrics/humanity.md): Measure emotional profiling and entropy of AI assistant responses using NRC emotion lexicons
- [Metrics Overview](https://docs.gaussia.ai/sdks/python/metrics/overview.md): Overview of all available metrics in Gaussia
- [Regulatory](https://docs.gaussia.ai/sdks/python/metrics/regulatory.md): Evaluate AI response compliance against a regulatory document corpus
- [Toxicity](https://docs.gaussia.ai/sdks/python/metrics/toxicity.md): Measure toxic language with clustering and demographic group profiling
- [Vision](https://docs.gaussia.ai/sdks/python/metrics/vision.md): Evaluate vision-language model descriptions for similarity and hallucination detection
- [Quickstart](https://docs.gaussia.ai/sdks/python/quickstart.md): Get started with Gaussia in minutes

## OpenAPI Specs

- [openapi](https://docs.gaussia.ai/api-reference/openapi.json)

## Optional

- [GitHub](https://github.com/gaussia-labs)
- [Papers](https://github.com/gaussia-labs/papers)