
Overview

The Regulatory metric evaluates whether AI responses comply with a regulatory corpus (laws, policies, guidelines). It uses a RAG-based pipeline to:
  1. Retrieve relevant regulatory chunks for each interaction
  2. Check for contradictions between the response and the retrieved chunks
  3. Score compliance per interaction and aggregate per session
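
The three steps can be sketched in plain Python. This is an illustration under stated assumptions, not Gaussia's implementation: the `Chunk` type, the `score_interaction` and `score_session` helpers, and the scoring rule (share of relevant chunks that support the response, with sessions aggregated by a simple mean) are all hypothetical. The default values mirror the documented defaults of `similarity_threshold`, `contradiction_threshold`, and `top_k`.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Chunk:
    """Hypothetical container: one regulatory chunk scored against a response."""
    text: str
    similarity: float      # cosine similarity to the interaction (retrieval step)
    reranker_score: float  # agreement score from the reranker (contradiction step)


def score_interaction(
    chunks: List[Chunk],
    similarity_threshold: float = 0.3,
    contradiction_threshold: float = 0.6,
    top_k: int = 10,
) -> Optional[float]:
    """Return the share of relevant chunks that support the response.

    Returns None when no chunk passes the similarity threshold.
    """
    # Step 1: keep at most top_k chunks, most similar first, above the threshold.
    relevant = sorted(
        (c for c in chunks if c.similarity >= similarity_threshold),
        key=lambda c: c.similarity,
        reverse=True,
    )[:top_k]
    if not relevant:
        return None

    # Step 2: a chunk contradicts when its reranker score falls below the threshold.
    contradicting = sum(
        1 for c in relevant if c.reranker_score < contradiction_threshold
    )

    # Step 3 (assumed rule): fraction of relevant chunks that do not contradict.
    return (len(relevant) - contradicting) / len(relevant)


def score_session(interaction_scores: List[Optional[float]]) -> Optional[float]:
    """Aggregate per-interaction scores with a plain mean (assumed), skipping None."""
    scored = [s for s in interaction_scores if s is not None]
    return mean(scored) if scored else None
```

In this sketch, an interaction with one supporting and one contradicting chunk scores 0.5, and interactions without relevant chunks are left out of the session mean rather than counted as zero.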

Verdicts

| Verdict | Meaning |
| --- | --- |
| `COMPLIANT` | Response supports regulatory requirements |
| `NON_COMPLIANT` | Response contradicts regulatory requirements |
| `IRRELEVANT` | No relevant regulatory content found |
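
A minimal sketch of how a score could map to these verdicts, assuming that "no relevant content" is represented as `None` and that the cut-off is the `compliance_threshold` parameter (default 0.5). The `verdict_for` helper is hypothetical and is not part of the library.

```python
from typing import Optional


def verdict_for(score: Optional[float], compliance_threshold: float = 0.5) -> str:
    """Map a compliance score to a verdict label (illustrative only)."""
    # Assumption: None stands for "no relevant regulatory content found".
    if score is None:
        return "IRRELEVANT"
    # compliance_threshold is the minimum score for a COMPLIANT verdict.
    if score >= compliance_threshold:
        return "COMPLIANT"
    return "NON_COMPLIANT"
```

Because the threshold is documented as a minimum, a score exactly equal to it is treated as `COMPLIANT` here.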

Usage

from gaussia.metrics.regulatory import Regulatory
from gaussia.connectors import MyCorpusConnector
from gaussia.embedders import SentenceTransformerEmbedder
from gaussia.rerankers import MyReranker

embedder = SentenceTransformerEmbedder(model="all-mpnet-base-v2")
reranker = MyReranker()
corpus = MyCorpusConnector(path="./regulations/")

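# MyRetriever is a placeholder for your own Retriever subclass;
# define or import it before this call. Pass the class, not an instance.
results = Regulatory.run(
    MyRetriever,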
    corpus_connector=corpus,
    embedder=embedder,
    reranker=reranker,
)

for r in results:
    print(f"Compliance: {r.compliance_score:.2f} ({r.verdict})")
    print(f"Supporting: {r.total_supporting_chunks}, Contradicting: {r.total_contradicting_chunks}")

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `retriever` | `type[Retriever]` | required | Retriever class |
| `corpus_connector` | `CorpusConnector` | required | Loader for regulatory documents |
| `embedder` | `Embedder` | required | Text embedder for retrieval |
| `reranker` | `Reranker` | required | Reranker for contradiction detection |
| `statistical_mode` | `StatisticalMode` | `FrequentistMode()` | Statistical computation mode |
| `chunk_size` | `int` | `1000` | Characters per chunk |
| `chunk_overlap` | `int` | `100` | Overlap between chunks |
| `top_k` | `int` | `10` | Max chunks to retrieve |
| `similarity_threshold` | `float` | `0.3` | Minimum cosine similarity for retrieval |
| `contradiction_threshold` | `float` | `0.6` | Score below which a chunk contradicts |
| `compliance_threshold` | `float` | `0.5` | Minimum score for COMPLIANT verdict |
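
To make `chunk_size` and `chunk_overlap` concrete, here is a sketch of character-based chunking with overlap. It assumes a simple sliding window and that the overlap is measured in characters. The library's actual splitter may differ (for example, by respecting sentence boundaries), and `chunk_text` is a hypothetical helper.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, a 2,500-character document yields three chunks of 1,000, 1,000, and 700 characters, and each chunk repeats the last 100 characters of the previous one.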

Output schema

RegulatoryMetric

| Field | Type | Description |
| --- | --- | --- |
| `session_id` | `str` | Session identifier |
| `assistant_id` | `str` | Assistant identifier |
| `n_interactions` | `int` | Interactions evaluated |
| `compliance_score` | `float` | Aggregated compliance score |
| `compliance_score_ci_low` | `float \| None` | Lower CI (Bayesian only) |
| `compliance_score_ci_high` | `float \| None` | Upper CI (Bayesian only) |
| `verdict` | `str` | Session-level verdict |
| `total_supporting_chunks` | `int` | Total supporting evidence |
| `total_contradicting_chunks` | `int` | Total contradicting evidence |
| `interactions` | `list[RegulatoryInteraction]` | Per-interaction breakdown |

Requires the regulatory extra: pip install "gaussia[regulatory]".
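
The session-level fields might be consumed along the following lines. `SessionResult` is a hypothetical stand-in that mirrors only a subset of the documented `RegulatoryMetric` fields so the example runs without the library. `describe` and `non_compliant` are illustrative helpers, not Gaussia APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SessionResult:
    """Hypothetical stand-in mirroring a subset of the RegulatoryMetric fields."""
    session_id: str
    compliance_score: float
    verdict: str
    compliance_score_ci_low: Optional[float] = None   # None outside Bayesian mode
    compliance_score_ci_high: Optional[float] = None  # None outside Bayesian mode


def describe(result: SessionResult) -> str:
    """Format one result, appending the interval only when both bounds are set."""
    line = f"{result.session_id}: {result.compliance_score:.2f} ({result.verdict})"
    low, high = result.compliance_score_ci_low, result.compliance_score_ci_high
    if low is not None and high is not None:
        line += f" [{low:.2f}, {high:.2f}]"
    return line


def non_compliant(results: List[SessionResult]) -> List[SessionResult]:
    """Return NON_COMPLIANT sessions, lowest compliance score first."""
    flagged = [r for r in results if r.verdict == "NON_COMPLIANT"]
    return sorted(flagged, key=lambda r: r.compliance_score)
```

Checking the CI fields for `None` keeps the same reporting code usable whether or not a Bayesian statistical mode was selected.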