GPT-Realtime-2 Turns Voice Agents Into Tool-Using Reasoning Systems
OpenAI’s Build Hour on GPT-Realtime-2 presented the new realtime voice release as a shift from conversational voice interfaces toward tool-using, stateful agents. Teri Yu and Erika Kettleson argued that GPT-Realtime-2’s larger context window, stronger instruction following, parallel tool calling and controllable speech behavior let developers build voice systems that can operate apps, reason across workflows and know when not to speak. Sierra’s Ken Murphy and Soham Ray added that production voice agents still depend on the surrounding system: guardrails, tuned turn-taking, tracing, redaction, evaluations and customer-specific workflows.
OpenAI · May 13, 2026 · 14 min read