What is Dawn Chorus?
Dawn Chorus (dawn_chorus_en) is ai-coustics' open-source evaluation dataset for benchmarking accurate foreground speaker transcription in real-world conditions. Released on Hugging Face, it measures how well speech enhancement and target-speaker extraction systems preserve a primary speaker while suppressing competing background voices.
How does Dawn Chorus work?
The dataset contains 450 samples totalling around 90 minutes (roughly 12 seconds per sample on average). Each sample pairs a foreground–background speech mixture (mix) with the isolated foreground audio (speech) and a human-verified transcript. Recordings are 16 kHz, 16-bit mono and are captured through realistic transmission channels: foreground speech is played through an artificial mouth into real mobile devices, while competing background speech is played simultaneously through an immersive loudspeaker setup. Foreground speech is 65% recorded and 35% synthesized to reflect modern conversational contexts. The format is inspired by DAPS and built around time-aligned references, so teams can measure both suppression strength and foreground speech distortion.
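Because each mix comes with a time-aligned speech reference, an enhanced output can be scored sample by sample against the isolated foreground. The sketch below shows one common signal-level metric, scale-invariant signal-to-distortion ratio (SI-SDR). It is only an illustration: the source does not say which metrics ai-coustics reports, the function name si_sdr is hypothetical, and the signals in the usage example are synthetic tones rather than dataset audio.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB between two time-aligned sample sequences.

    `reference` stands in for the isolated foreground (speech) and
    `estimate` for a system's output on the mixture (mix). Both are
    plain sequences of floats here; this is an assumption for illustration.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must be time-aligned and of equal length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal is silent")

    # Project the estimate onto the reference to find the target component.
    scale = sum(r * e for r, e in zip(reference, estimate)) / ref_energy
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    if residual_energy == 0:
        return math.inf  # estimate is a scaled copy of the reference
    if target_energy == 0:
        return -math.inf  # estimate contains none of the reference
    return 10 * math.log10(target_energy / residual_energy)


# Synthetic stand-ins: a 200 Hz "foreground" and a 310 Hz "background" at 16 kHz.
SAMPLE_RATE = 16_000
N = 1_600  # 0.1 s
foreground = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(N)]
background = [math.sin(2 * math.pi * 310 * n / SAMPLE_RATE) for n in range(N)]

weak_suppression = [f + 0.5 * b for f, b in zip(foreground, background)]
strong_suppression = [f + 0.1 * b for f, b in zip(foreground, background)]

score_weak = si_sdr(foreground, weak_suppression)
score_strong = si_sdr(foreground, strong_suppression)
```

A higher score means less residual background and less distortion relative to the reference, so score_strong comes out above score_weak. SI-SDR folds both effects into one number; separating suppression strength from foreground distortion would need additional metrics.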
How does ai-coustics use Dawn Chorus?
Dawn Chorus addresses a gap in voice AI evaluation: no other open dataset stresses primary-speaker isolation under realistic telephony conditions. We use it internally to benchmark Quail Voice Focus and the wider Quail family against real-world background-speech interference, and we've released it publicly under a CC BY-NC 4.0 license so anyone building voice agents, contact centers, or STT pipelines can measure their own stack the same way.
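For an STT pipeline, measuring a stack against the human-verified transcripts usually comes down to word error rate (WER): transcribe the mix (optionally after enhancement) and compare the result with the reference transcript. The following is a minimal sketch under stated assumptions, not ai-coustics' evaluation code. The function name word_error_rate, the lowercase-and-split normalization, and the example sentences are all hypothetical; real evaluations typically apply more careful text normalization.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Normalization here is only lowercasing and whitespace splitting,
    which is an assumption made to keep the sketch short.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)


# Hypothetical example: the reference is what the foreground speaker said.
reference = "please confirm my booking"

# Words leaking in from a background speaker show up as insertions.
with_leakage = "please confirm my booking thanks bye"
# Over-aggressive suppression can delete foreground words instead.
with_deletion = "please confirm booking"

wer_leakage = word_error_rate(reference, with_leakage)
wer_deletion = word_error_rate(reference, with_deletion)
```

The two failure cases illustrate why a foreground-focused benchmark is useful: background words that leak through raise WER via insertions, while foreground words removed along with the background raise it via deletions.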
See more on Hugging Face.
