Here at ai-coustics, our mission is to empower developers to build Voice AI that actually works. Our real-time speech enhancement SDKs fix audio input for voice agents, conferencing solutions, and much more. Today, we’re introducing and explaining our new naming conventions, breaking down the different models behind our successful SDK, and making it easy for you to find the solution your product needs.
Developers typically turn to speech enhancement to do one of two things:
- Improve stacks for voice agents and other machine-learning tools, so that poor-quality audio or challenging acoustic conditions don’t cause issues with VAD, STT, or other downstream tasks.
- Enhance perceptual performance for human ears, so that audio quality remains high for users across conferencing, media, human-to-human calls, and other use cases.
The ai-coustics real-time speech enhancement SDK serves both of these goals, but in different ways and with different models. Quail is exclusively machine-targeted, optimizing audio for downstream systems such as ASR and VAD. Sparrow improves the human listening experience.
The Quail family: Real-time audio enhancement to boost Voice AI performance
You’re probably already familiar with Quail, our flagship SDK model family. Quail is available in several variants to suit specific use cases and voice agent needs, including:
- Quail: Designed to boost your Automatic Speech Recognition (ASR) systems, Quail provides a safeguard against real-world audio issues like background noise, reverb, accents, and low-quality microphones. It reduces Word Error Rate (WER) by up to 30%.
Read more about Quail.
- Quail Voice Focus: Background voices confuse voice agents, resulting in missed cues, interruptions, or silence. Quail Voice Focus isolates your user’s voice, suppresses competing voices, and keeps the acoustic cues required for reliable transcription.
Read more about Quail Voice Focus.
- Quail VAD: Traditional VAD solutions, like Silero VAD, struggle with dynamic or sudden noise types and require extra preprocessing steps, adding complexity and expense to your voice agent stack. Quail VAD is tailored for voice agent stacks to solve both problems in one lightweight solution.
Read more about Quail VAD.
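To check a WER figure like the one quoted for Quail against your own recordings, it helps to compute the metric yourself. The sketch below is a minimal, illustrative Python implementation of the standard WER definition (word-level edit distance divided by the number of reference words), plus a helper for the relative reduction between two WER values. It does not use the ai-coustics SDK, the function names are our own, the transcripts are made-up examples rather than measured results, and we assume here that a "drop in WER" is read as a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Fraction by which WER dropped relative to the baseline (0.3 means 30%)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - enhanced_wer) / baseline_wer


# Hypothetical transcripts, for illustration only.
reference = "turn on the kitchen lights please"
raw_transcript = "turn on the chicken light please"       # STT on unprocessed audio
enhanced_transcript = "turn on the kitchen light please"  # STT on enhanced audio

baseline = word_error_rate(reference, raw_transcript)
enhanced = word_error_rate(reference, enhanced_transcript)
print(f"baseline WER: {baseline:.3f}")
print(f"enhanced WER: {enhanced:.3f}")
print(f"relative reduction: {relative_wer_reduction(baseline, enhanced):.0%}")
```

In practice you would average over many utterances and normalize casing and punctuation before comparing, since both can shift WER noticeably.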
Sparrow: Improve audio quality for human ears
Available in a range of sizes and sample rates to fit your product best, Sparrow is designed for real-time speech enhancement on audio devices and for streaming applications. Suitable for a massive range of use cases, it makes an immediate difference to audio quality and ensures that your users can enjoy clear, natural speech on either end of a connection. It is ideal for live conferencing, voice AI agents, communication, audio devices, streaming, broadcast technology, and privacy-sensitive environments.
Sparrow isolates a user’s voice from dynamic and noisy environments, removing reverberation, background noise, and other audio quality problems. At the same time, it preserves the natural quality and timbre of a speaker’s voice, so that listeners don’t hear anything artificial or ‘machine-like’.
What does this mean for my ai-coustics product?
If you’re already a user of our SDK, nothing changes in terms of product behavior. This is primarily a shift in naming to reflect the spectrum of our audio enhancement solutions, making it easier to see which product (or product family) best suits your needs.
That said, with last week’s SDK update and the upcoming mandatory upgrade, you’ll need to switch to the new model IDs and update to the latest SDK, including generating a new SDK key. The upgrade also brings major improvements like weight separation (up to ~90% smaller binaries), separate model downloads, more efficient multi-stream sharing, thread-safe control APIs, and flexible model loading.
You can find the migration instructions here.
Where can I try both Quail and Sparrow?
You can test Quail and Sparrow for free in our developer portal or reach out if you’d like a personalized demo.


