About
The fastest voice agent in the world. Time to first audio (TTFA) as low as 150ms. The 150ms figure is the median time to first audio, measured from end of user turn to first audio byte at the client.
- Runs on the edge using Cloudflare Workers
- Speech-to-text, text generation, and text-to-speech are pipelined and streamed, so the agent begins responding before you notice the silence
- Processing starts speculatively: text generation begins as soon as the user pauses, and the result is discarded if they continue speaking
- Tool calls are orchestrated in parallel by a separate agent so the user never waits for a reply
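
The speculative step described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: `SpeculativeResponder`, `Generate`, `onPause`, and `onSpeechResumed` are hypothetical names, and the generator function stands in for whatever text-generation client is really used. The only platform API assumed is the standard `AbortController`, which is available in Cloudflare Workers and modern Node.js.

```typescript
// Hypothetical signature for a text-generation call that honours cancellation.
type Generate = (transcript: string, signal: AbortSignal) => Promise<string>;

// Starts generating as soon as the user pauses and throws the work away
// if the user turns out not to be finished.
class SpeculativeResponder {
  private readonly generate: Generate;
  private controller: AbortController | null = null;

  constructor(generate: Generate) {
    this.generate = generate;
  }

  // Call when a pause is detected. Resolves to the generated text,
  // or to null if this speculation was discarded.
  onPause(transcript: string): Promise<string | null> {
    this.cancel(); // keep at most one speculation in flight
    const controller = new AbortController();
    this.controller = controller;
    return this.generate(transcript, controller.signal).then(
      // A result that arrives after cancellation is stale, so drop it.
      (text) => (controller.signal.aborted ? null : text),
      (err) => {
        // Errors caused by our own cancellation are expected.
        if (controller.signal.aborted) return null;
        throw err;
      },
    );
  }

  // Call when the user starts speaking again before the reply is committed.
  onSpeechResumed(): void {
    this.cancel();
  }

  private cancel(): void {
    this.controller?.abort();
    this.controller = null;
  }
}
```

Passing the `AbortSignal` into the generator lets the underlying request stop early rather than only ignoring its output, which matters when speculation is triggered on every pause.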