**xAI Grok Speech APIs STT/TTS SOTA pricing & accuracy + Grok 4.3 beta** [developing]
Key Questions
What are xAI's Grok Speech APIs?
xAI's Grok Speech APIs provide STT (speech-to-text) and TTS (text-to-speech) with claimed SOTA pricing and accuracy, powered by Colossus and Tesla infra. They support diarization and prosody, and handle phone entities, video, and podcasts.
How does Grok Speech APIs pricing compare?
Pricing reportedly undercuts competitors by 60%: $0.10/hour for batch STT and $4.20 per million characters for TTS. The launch also includes dev perks for building voice agents.
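To put the quoted rates in concrete terms, here is a minimal cost-estimation sketch in Python. The two rates come from the brief. The function names and the sample workloads are illustrative assumptions, not part of any xAI SDK, and actual billing rules (rounding, minimum charges, tiers) are not stated in the source.

```python
# Rates as quoted in the brief; treat them as reported figures, not confirmed list prices.
STT_BATCH_USD_PER_HOUR = 0.10
TTS_USD_PER_MILLION_CHARS = 4.20


def stt_batch_cost(audio_seconds: float) -> float:
    """Estimated batch STT cost in USD for a given audio duration in seconds."""
    return audio_seconds / 3600 * STT_BATCH_USD_PER_HOUR


def tts_cost(num_chars: int) -> float:
    """Estimated TTS cost in USD for a given number of input characters."""
    return num_chars / 1_000_000 * TTS_USD_PER_MILLION_CHARS


if __name__ == "__main__":
    # Hypothetical workloads: 100 hours of podcast audio and 500,000 characters of script.
    print(f"STT batch, 100 h: ${stt_batch_cost(100 * 3600):.2f}")
    print(f"TTS, 500k chars:  ${tts_cost(500_000):.2f}")
```

Under these assumptions, 100 hours of batch transcription comes to about $10 and 500,000 characters of synthesis to about $2.10.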
What is the performance of Grok Speech APIs?
The APIs reportedly achieve 5% WER (word error rate) on phone entities and 2.4% on video/podcasts. The launch ties into speech trends that followed Gemini Flash TTS. The story is still developing.
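For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the system output, divided by the number of reference words. The sketch below shows that standard definition in Python. It is not xAI's evaluation code, the sample phrases are made up, and real benchmarks usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substituted word out of six reference words.
    print(word_error_rate("call me at five five five", "call me at five five nine"))
```

On this definition, a 5% WER corresponds to roughly 5 word errors per 100 reference words, and 2.4% to roughly 24 per 1,000.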
The Grok Speech APIs launch undercuts competitors by 60% ($0.10/hr batch STT, $4.20 per million characters TTS), with 5% WER on phone entities and 2.4% on video/podcasts, plus diarization and prosody support, on Colossus and Tesla infra. Grok 4.3 beta is in Early Access on the web (reasoning, coding, math, real-time data, witty edge), which signals an agentic push amid the Claude/GPT races. The launch ties into speech trends following Gemini Flash TTS and includes dev perks for voice agents.