What is ground truth?
Ground truth is the reference label or reference output that a model's predictions are compared against during evaluation. It is treated as correct by definition, and all accuracy metrics are measured relative to it.
What is an example of ground truth?
When evaluating an ASR system, the ground truth is the human-verified transcript of the audio. WER is computed by comparing the model's transcript against that reference. For VAD, the ground truth is a frame-by-frame annotation of which frames actually contain speech.
How does ground truth work?
Ground truth is typically produced by human annotation, careful measurement, or (in synthetic datasets) by construction. Metrics like WER, F1, and balanced accuracy then quantify how closely a model's output matches this reference. The quality of the ground truth sets a hard ceiling on how meaningful those metrics are.
How does ai-coustics use ground truth?
Our Dawn Chorus evaluation dataset is built around carefully curated ground truth transcripts, which let us report precise WER uplift when Quail is run in front of ASR providers like Deepgram, Gladia, ElevenLabs, and Cartesia. Reliable ground truth is what makes those benchmarks credible.
