What is an utterance?
An utterance is a continuous segment of speech from a single speaker, typically bounded by silences or turn transitions. It is the basic unit that automatic speech recognition (ASR), endpointing, and voice agents operate on.
What is an example of an utterance?
When a caller says "I need to reschedule my appointment for next Tuesday," pauses, and then says "In the afternoon if possible," those are two utterances. The voice agent endpoints each one, transcribes it, and decides when to respond.
How does ai-coustics handle utterances?
Quail VAD provides reliable speech detection at the frame level, which is what endpointing logic depends on to decide where one utterance ends and the next begins. Combined with Quail Voice Focus, which ensures only the main speaker's utterances are detected in the first place, this keeps voice agents from over-transcribing background voices or triggering on the wrong turns.
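
As a rough illustration of how endpointing logic can turn frame-level speech decisions into utterances, here is a minimal sketch. It is not the Quail VAD API: the function name segment_utterances, the 20 ms frame size, and the 500 ms silence threshold are all assumptions chosen for the example, and the input is simply a list of per-frame True/False speech flags such as any frame-level VAD might produce.

```python
from typing import Iterable, List, Optional, Tuple


def segment_utterances(
    speech_flags: Iterable[bool],
    frame_ms: int = 20,         # assumed frame duration, not a Quail default
    min_silence_ms: int = 500,  # assumed pause length that ends an utterance
) -> List[Tuple[int, int]]:
    """Group per-frame speech decisions into (start_ms, end_ms) utterances."""
    utterances: List[Tuple[int, int]] = []
    start: Optional[int] = None  # frame index where the current utterance began
    last_speech = 0              # index of the most recent speech frame
    silence_frames_needed = min_silence_ms // frame_ms

    for i, is_speech in enumerate(speech_flags):
        if is_speech:
            if start is None:
                start = i
            last_speech = i
        elif start is not None and i - last_speech >= silence_frames_needed:
            # Enough trailing silence: close the utterance at its last speech frame.
            utterances.append((start * frame_ms, (last_speech + 1) * frame_ms))
            start = None

    # Close an utterance that was still open when the stream ended.
    if start is not None:
        utterances.append((start * frame_ms, (last_speech + 1) * frame_ms))
    return utterances


# 200 ms of speech, a 600 ms pause, then 100 ms of speech.
flags = [True] * 10 + [False] * 30 + [True] * 5 + [False] * 3
print(segment_utterances(flags))  # [(0, 200), (800, 900)]
```

A pause shorter than the threshold does not split the utterance, which is why the threshold matters: set it too low and a speaker's mid-sentence hesitation becomes a false turn boundary, set it too high and the agent feels slow to respond.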
