Realtime Voice Intelligence API Launch
Key Questions
When were the new realtime voice models launched?
The new realtime voice models launched on May 7: GPT-Realtime-2, Realtime-Translate, and Realtime-Whisper, all built for low-latency voice applications.
What is GPT-Realtime-2?
GPT-Realtime-2 brings GPT-5-level reasoning, a 128K-token context window, and tool use to complex real-time conversations. It builds on GPT-5.5's multimodal capabilities for voice agents.
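For a sense of what integration might look like, here is a minimal sketch of opening a realtime session. The endpoint URL, the event names, and the "gpt-realtime-2" model string are all assumptions for illustration; none of these wire details are confirmed by the announcement.

```python
# Minimal sketch of a realtime voice session over WebSocket.
# Assumptions (not confirmed by the announcement): the endpoint URL,
# the event names, and the "gpt-realtime-2" model string.
import asyncio
import json
import websockets  # pip install websockets

API_KEY = "sk-..."  # your API key

async def run_session() -> None:
    url = "wss://api.example.com/v1/realtime?model=gpt-realtime-2"  # assumed
    # "additional_headers" on websockets>=14; older versions use "extra_headers"
    async with websockets.connect(
        url, additional_headers={"Authorization": f"Bearer {API_KEY}"}
    ) as ws:
        # Assumed session-configuration event enabling audio and text output.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "modalities": ["audio", "text"],
                "instructions": "You are a concise voice assistant.",
            },
        }))
        # Drain streamed events until the server signals the turn is done.
        async for raw in ws:
            event = json.loads(raw)
            print(event.get("type"))
            if event.get("type") == "response.done":  # assumed event name
                break

asyncio.run(run_session())
```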
What does Realtime-Translate support?
Realtime-Translate translates live speech from more than 70 input languages into 13 output languages, enabling real-time multilingual voice apps.
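As a rough illustration, language selection would likely happen at session-configuration time. The field names and the "realtime-translate" model string below are assumptions, not a documented schema.

```python
# Hypothetical session configuration for live translation.
# Field names and the "realtime-translate" model string are assumptions.
import json

session_config = {
    "type": "session.update",
    "session": {
        "model": "realtime-translate",
        "source_language": "auto",  # detect among the 70+ supported inputs
        "target_language": "es",    # one of the 13 supported output languages
        "modalities": ["audio"],
    },
}
print(json.dumps(session_config, indent=2))
```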
What is Realtime-Whisper?
Realtime-Whisper provides streaming speech-to-text (STT), transcribing audio as it arrives rather than after an utterance completes. It supports real-time voice tasks across enterprise and consumer apps.
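Below is a minimal sketch of a streaming transcription loop, assuming the same WebSocket transport plus audio-buffer and transcript event names. All of those names, and the base64 audio framing, are assumptions for illustration.

```python
# Hypothetical streaming STT loop: send audio chunks, print partial transcripts.
# The endpoint, event names, and base64 audio framing are assumptions.
import asyncio
import base64
import json
import websockets  # pip install websockets

async def transcribe(pcm_chunks) -> None:
    url = "wss://api.example.com/v1/realtime?model=realtime-whisper"  # assumed
    async with websockets.connect(url) as ws:
        for chunk in pcm_chunks:
            await ws.send(json.dumps({
                "type": "input_audio_buffer.append",  # assumed event name
                "audio": base64.b64encode(chunk).decode(),
            }))
        await ws.send(json.dumps({"type": "input_audio_buffer.commit"}))  # assumed
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "transcript.delta":  # assumed event name
                print(event.get("text", ""), end="", flush=True)
            elif event.get("type") == "transcript.done":  # assumed event name
                break

# Example: feed 100 ms chunks of 16-bit mono PCM from a file or mic capture.
# asyncio.run(transcribe(chunks))
```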
Who are the new voice models for?
They target developers building voice agents, enterprise solutions, and consumer apps, improving on previous audio models with lower latency.
How do these models integrate with existing tech?
They slot into the standard voice pipeline: end-of-utterance detection, audio-to-text conversion, an LLM call, and response generation, as sketched below. All three models are available now in the API.
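To make that flow concrete, here is an illustrative cascaded pipeline. Every function name and the energy-threshold end-of-utterance check are placeholders for this sketch, not part of any published SDK.

```python
# Illustrative cascaded voice pipeline: end-of-utterance detection,
# audio-to-text, an LLM call, then response generation.
# All names and thresholds are placeholders, not a published SDK.
import array
import math
from typing import Callable, Iterable

def is_silence(chunk: bytes, threshold: float = 300.0) -> bool:
    """Toy end-of-utterance signal: low RMS energy on 16-bit mono PCM."""
    samples = array.array("h", chunk)
    if not samples:
        return True
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms < threshold

def voice_turn(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],    # audio-to-text, e.g. Realtime-Whisper
    generate_reply: Callable[[str], str],  # LLM call, e.g. GPT-Realtime-2
    synthesize: Callable[[str], bytes],    # text-to-speech response
) -> bytes:
    buffered = bytearray()
    for chunk in audio_chunks:
        buffered.extend(chunk)
        if is_silence(chunk):              # end-of-utterance detection
            break
    text = transcribe(bytes(buffered))
    reply = generate_reply(text)
    return synthesize(reply)
```

The appeal of realtime-native models is cutting the latency that accumulates across these separate stages.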
What makes these models GPT-5-class?
GPT-Realtime-2 brings GPT-5-level reasoning to voice, and together the trio supports advanced real-time tasks such as multi-step reasoning, translation, and transcription.
Are the new audio models available in the API?
Yes. The API exposes them for speech-to-text and text-to-speech, so developers can build more powerful, customizable voice experiences.
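As one last sketch, the text-to-speech side might look like a plain HTTPS request. The endpoint path, payload fields, voice name, and model string are assumptions for illustration, not a documented API.

```python
# Hypothetical REST sketch for the text-to-speech side.
# Endpoint path, payload fields, voice name, and model string are assumptions.
import requests  # pip install requests

resp = requests.post(
    "https://api.example.com/v1/audio/speech",  # assumed endpoint
    headers={"Authorization": "Bearer sk-..."},
    json={"model": "gpt-realtime-2", "input": "Hello!", "voice": "alloy"},
    timeout=30,
)
resp.raise_for_status()
with open("reply.wav", "wb") as f:
    f.write(resp.content)  # synthesized audio bytes
```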