What is F1 Score?
F1 score is the harmonic mean of precision and recall, producing a single number between 0 and 1 that summarizes how well a binary classifier performs. It penalizes models that are strong on only one of the two (catching every positive but raising many false alarms, or being very precise but missing real positives) and rewards models that balance both.
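For reference, the standard definition can be written out as follows. The symbols P (precision), R (recall), TP (true positives), FP (false positives), and FN (false negatives) are the usual textbook notation, introduced here for illustration rather than taken from this page.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\,P\,R}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a low precision or a low recall drags F1 down even when the other value is close to 1.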
What is an example of F1 Score in use?
A voice activity detector (VAD) is evaluated by what share of the actual speech frames it flags as speech (recall) and what share of the frames it flags are actually speech (precision). F1 score combines the two. A VAD with 0.95 precision and 0.90 recall lands at an F1 of about 0.92, stronger than a model at 0.99 precision but 0.70 recall, which would only score around 0.82.
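The arithmetic in this example can be checked with a minimal Python sketch. The function names `f1_score` and `precision_recall` are hypothetical helpers written for illustration, and the frame counts in the usage lines are made-up numbers, not measurements from any real VAD.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from frame-level counts.

    tp: speech frames flagged as speech
    fp: non-speech frames flagged as speech (false alarms)
    fn: speech frames that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# The two detectors from the example above.
balanced = f1_score(0.95, 0.90)
lopsided = f1_score(0.99, 0.70)
print(f"balanced: {balanced:.2f}")  # balanced: 0.92
print(f"lopsided: {lopsided:.2f}")  # lopsided: 0.82

# Illustrative (invented) frame counts: 90 hits, 10 false alarms, 30 misses.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
```

The second detector has the higher precision, yet its weaker recall pulls its F1 well below the first, which is the balancing behavior described above.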
How does ai-coustics use F1 Score?
F1 score is one of the metrics we use to evaluate Quail VAD and related detection components, where correctly identifying speech onset and offset frames matters for downstream endpointing, turn-taking, and barge-in. It sits alongside word error rate (WER) as part of how we benchmark the reliability of the audio intelligence layer.
