OpenAI Launches GPT-Realtime-2 to Power Smarter Voice Apps

Posted May 7, 2026 at 7:49pm by Shalom Levytam

OpenAI is rolling out three new audio models to its API, giving developers new tools to build voice apps that can reason through requests, translate conversations in real time, and generate live transcriptions with lower latency.

The centerpiece of the release is GPT-Realtime-2, OpenAI's first voice model built with GPT-5-class reasoning. The model is designed for live conversations, allowing applications to handle interruptions, corrections, and changing context in real time. OpenAI says developers can also enable short verbal preambles like "let me check that" while the model performs actions in the background using parallel tool calls.

GPT-Realtime-2 expands the context window from 32K to 128K tokens, enabling longer and more coherent conversations. Developers can also adjust reasoning levels from minimal to xhigh depending on whether they want faster responses or deeper reasoning for more complex tasks.

OpenAI is also introducing GPT-Realtime-Translate, a realtime translation model that converts speech from more than 70 input languages into 13 output languages while keeping pace with the speaker. The company says the model is designed to preserve meaning and context even in fast-moving conversations or when users shift context mid-conversation.

For transcription tasks, GPT-Realtime-Whisper delivers streaming speech-to-text for live captions, meeting notes, and voice-driven workflows. OpenAI says the low-latency model is intended for applications that need continuous speech recognition while conversations are still happening.

The announcement is particularly relevant for Apple developers as voice interfaces continue expanding across iPhone, CarPlay, and Mac experiences. Developers are already leveraging new entitlements to run conversational AI apps directly on the CarPlay dashboard, while Apple is preparing an extension system to support third-party assistants in iOS 27.

OpenAI has also tied the release into its Codex app for Mac, which the company says can help developers integrate the new Realtime API into their applications. Codex recently gained the ability to operate the Mac desktop in the background, fitting into broader AI-assisted development workflows alongside tools like Xcode 26.3.

All three models are available now. GPT-Realtime-2 costs $32 per million audio input tokens and $64 per million output tokens. GPT-Realtime-Translate is priced at $0.034 per minute, while GPT-Realtime-Whisper costs $0.017 per minute.
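
For a rough sense of what those prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope estimate only: the rates are the ones quoted above, while the function names and the sample token and minute counts are hypothetical, and actual billing may include other factors not covered in this article.

```python
# Rates as quoted in the article (USD). Everything else here is illustrative.
REALTIME2_INPUT_PER_MILLION = 32.0    # GPT-Realtime-2, per million audio input tokens
REALTIME2_OUTPUT_PER_MILLION = 64.0   # GPT-Realtime-2, per million output tokens
TRANSLATE_PER_MINUTE = 0.034          # GPT-Realtime-Translate
WHISPER_PER_MINUTE = 0.017            # GPT-Realtime-Whisper


def realtime2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 cost from token counts (hypothetical helper)."""
    total = (input_tokens * REALTIME2_INPUT_PER_MILLION
             + output_tokens * REALTIME2_OUTPUT_PER_MILLION)
    return total / 1_000_000


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for the per-minute models (hypothetical helper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Sample figures are made up purely to show the calculation.
    print(f"Realtime-2, 50K in / 20K out: ${realtime2_cost(50_000, 20_000):.2f}")
    print(f"Translate, 60 minutes: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 minutes: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

At these rates, output tokens on GPT-Realtime-2 cost twice as much as input tokens, and an hour of transcription costs half as much as an hour of translation.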
