The Gaussia base class is the entry point for evaluation. It defines the processing flow and delegates the per-batch work to your subclass. This is the Template Method pattern: the base class owns the lifecycle, and you implement one method.
The batch method
Extend Gaussia and implement the abstract batch method. The base class calls it once per unit of work and collects whatever you push to this.metrics.
The type argument in Gaussia<number> (here, number) is the metric type collected in this.metrics and returned by run.
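As a rough illustration of the pattern, the sketch below uses a stand-in base class instead of the real Gaussia class, whose exact signatures are not reproduced here. EvaluatorBase, BatchInput and its field names, the process method, and the synchronous batch signature are all assumptions made for the example.

```typescript
// Hypothetical stand-ins: none of these names come from the library.
interface BatchInput {
  sessionId: string;
  assistantId: string;
  batches: { query: string; response: string }[];
}

abstract class EvaluatorBase<M> {
  // Subclasses push their results here.
  protected metrics: M[] = [];

  // The one method a subclass must implement.
  protected abstract batch(input: BatchInput): void;

  // The template method: the base class owns the loop and the result.
  process(units: BatchInput[]): M[] {
    for (const unit of units) {
      this.batch(unit);
    }
    return this.metrics;
  }
}

// The metric type is number, mirroring Gaussia<number>.
class AverageResponseLength extends EvaluatorBase<number> {
  protected batch(input: BatchInput): void {
    const total = input.batches.reduce((sum, b) => sum + b.response.length, 0);
    const count = input.batches.length;
    this.metrics.push(count === 0 ? 0 : total / count);
  }
}

const lengths = new AverageResponseLength().process([
  {
    sessionId: "s1",
    assistantId: "a1",
    batches: [
      { query: "hi", response: "hello" },
      { query: "bye", response: "see you" },
    ],
  },
]);
// lengths is [6]: the responses are 5 and 7 characters long.
```

The subclass never drives the loop itself. It only decides what to compute for one unit of work and what to push.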
Batch input
Each call to batch receives:
Identifier of the session being evaluated.
The session context (for example, the source document or system context).
Identifier of the assistant under evaluation.
The conversation batches for this unit of work.
The session language, or null when unspecified.
Running an evaluation
Call the static run method with your retriever class and its config. The base class constructs the retriever, loads the dataset, processes it, and returns the collected metrics.
run accepts an optional third argument for options such as a Logger.
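The flow that run performs might look roughly like the following. This is a sketch under stated assumptions, not the library's implementation: Evaluator, Retriever, RunOptions, and the Logger shape are hypothetical stand-ins, and the method is kept synchronous for brevity even though the real one may return a promise.

```typescript
// Hypothetical stand-ins for illustration; not the library's real types.
interface Logger {
  info(message: string): void;
}
interface RunOptions {
  logger?: Logger;
}
interface Retriever {
  loadDataset(): string[];
}

abstract class Evaluator {
  metrics: number[] = [];
  abstract batch(item: string): void;

  // Static entry point: construct the retriever, load, process, return metrics.
  static run<C>(
    this: new () => Evaluator,
    retrieverClass: new (config: C) => Retriever,
    config: C,
    options: RunOptions = {},
  ): number[] {
    const retriever = new retrieverClass(config);
    const dataset = retriever.loadDataset();
    options.logger?.info(`loaded ${dataset.length} items`);
    const evaluator = new this();
    for (const item of dataset) {
      evaluator.batch(item);
    }
    return evaluator.metrics;
  }
}

class InMemoryRetriever implements Retriever {
  private readonly items: string[];
  constructor(config: { items: string[] }) {
    this.items = config.items;
  }
  loadDataset(): string[] {
    return this.items;
  }
}

class WordCount extends Evaluator {
  batch(item: string): void {
    this.metrics.push(item.split(/\s+/).filter(Boolean).length);
  }
}

const messages: string[] = [];
const wordCounts = WordCount.run(
  InMemoryRetriever,
  { items: ["a b", "c d e"] },
  { logger: { info: (m) => messages.push(m) } },
);
// wordCounts is [2, 3]; messages is ["loaded 2 items"].
```

Passing the retriever class rather than an instance lets the base class control construction, which is where configuration errors can surface before any data is loaded.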
Iteration levels
The retriever’s iterationLevel controls how the dataset is consumed:
full_dataset (default): loadDataset returns Dataset[], processed in order.
stream_sessions: loadDataset returns an async iterable of Dataset, processed as they arrive.
stream_batches: loadDataset returns an async iterable of StreamedBatch, processed one batch at a time.
If a retriever streams its data but leaves iterationLevel at full_dataset, the constructor throws a RetrieverError. Set a streaming level when you stream.
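To make the three levels concrete, here is a minimal sketch of how a consumer could branch on them. The Dataset and StreamedBatch shapes, the Source union, and the consume and collect helpers are hypothetical; only the three level names are taken from the text above.

```typescript
// Hypothetical shapes; the real Dataset and StreamedBatch types are richer.
interface Dataset {
  sessionId: string;
  batches: string[];
}
interface StreamedBatch {
  sessionId: string;
  batch: string;
}

// Each level pairs with the return type its loader is expected to have.
type Source =
  | { level: "full_dataset"; load: () => Dataset[] }
  | { level: "stream_sessions"; load: () => AsyncIterable<Dataset> }
  | { level: "stream_batches"; load: () => AsyncIterable<StreamedBatch> };

async function consume(
  source: Source,
  onBatch: (sessionId: string, batch: string) => void,
): Promise<void> {
  switch (source.level) {
    case "full_dataset":
      // Everything is in memory: iterate sessions in order.
      for (const dataset of source.load()) {
        for (const batch of dataset.batches) onBatch(dataset.sessionId, batch);
      }
      break;
    case "stream_sessions":
      // Sessions arrive one at a time; each still carries all its batches.
      for await (const dataset of source.load()) {
        for (const batch of dataset.batches) onBatch(dataset.sessionId, batch);
      }
      break;
    case "stream_batches":
      // The finest granularity: one batch per yielded item.
      for await (const item of source.load()) onBatch(item.sessionId, item.batch);
      break;
  }
}

async function* streamBatches(): AsyncGenerator<StreamedBatch> {
  yield { sessionId: "s1", batch: "first" };
  yield { sessionId: "s1", batch: "second" };
}

// Helper that records the order in which batches are seen.
async function collect(source: Source): Promise<string[]> {
  const seen: string[] = [];
  await consume(source, (sessionId, batch) => {
    seen.push(`${sessionId}:${batch}`);
  });
  return seen;
}
```

Calling collect({ level: "stream_batches", load: streamBatches }) resolves to the batches in the order they were yielded. Tying the level to the loader's return type in one union is a design choice of this sketch, which lets the compiler reject a streaming loader paired with full_dataset.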
Hooks
onProcessComplete runs after all batches are processed. Override it to finalize aggregate metrics. The default is a no-op.
resolveWeights normalizes per-batch weights to sum to 1, falling back to uniform weights (with a logged warning) when explicit weights are missing or inconsistent.
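The normalization with a uniform fallback can be sketched as follows. normalizeWeights is a hypothetical helper, not the library's resolveWeights, and its reading of "inconsistent" (a missing, negative, or non-finite weight, or a total that is not positive) is an assumption.

```typescript
// Hypothetical helper illustrating normalize-or-fall-back-to-uniform.
function normalizeWeights(
  weights: Array<number | undefined>,
  warn: (message: string) => void = () => {},
): number[] {
  const count = weights.length;
  if (count === 0) return [];

  // Keep only weights that are usable as-is.
  const explicit = weights.filter(
    (w): w is number => typeof w === "number" && Number.isFinite(w) && w >= 0,
  );
  const total = explicit.reduce((sum, w) => sum + w, 0);

  // Any unusable weight, or a zero total, triggers the uniform fallback.
  if (explicit.length !== count || total <= 0) {
    warn("weights missing or inconsistent; using uniform weights");
    return weights.map(() => 1 / count);
  }
  return explicit.map((w) => w / total);
}

const warnings: string[] = [];
const weighted = normalizeWeights([1, 3]); // [0.25, 0.75]
const fallback = normalizeWeights([2, undefined], (m) => {
  warnings.push(m);
}); // [0.5, 0.5], with one warning recorded
```

Falling back to uniform weights instead of throwing keeps the evaluation running, and the warning makes the substitution visible.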
Related concepts
Schemas
Dataset, Batch, and the rest of the data model.
Language models
Call a model from inside your batch implementation.