LLM Benchmark Watch

GPT-5.4/5 Computer Use + OSWorld >human + Spud/6 leaks + $122B raise + OpenAI economics caution

Key Questions

What are the key features of GPT-5.4?

GPT-5.4 Pro costs $200 and scores 99.4% on MATH-500 and 75% on OSWorld, above the 72.4% human baseline. The mini version achieves 54.4% on SWE-Pro at 2x speed.

What is OSWorld benchmark?

OSWorld is a benchmark built on real computer tasks. GPT-5.4 scored 75% on it, exceeding the 72.4% human baseline.
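To put the OSWorld comparison in concrete terms, here is a minimal sketch of the arithmetic behind it. It uses only the two figures quoted in this digest (75% and 72.4%); the function name margin_over_baseline is a hypothetical helper for illustration, not part of any benchmark tooling.

```python
def margin_over_baseline(model_score: float, baseline: float) -> tuple[float, float]:
    """Return (absolute margin in percentage points, relative margin in percent)."""
    absolute = model_score - baseline          # difference in percentage points
    relative = absolute / baseline * 100.0     # difference as a share of the baseline
    return absolute, relative


# Figures as reported in this digest: GPT-5.4 at 75%, human baseline at 72.4%.
absolute, relative = margin_over_baseline(75.0, 72.4)
print(f"{absolute:.1f} percentage points, {relative:.1f}% relative")
```

The lead is 2.6 percentage points, or roughly 3.6% relative to the human baseline, so the "above human" claim rests on a fairly narrow margin.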

What is OpenAI's recent funding?

OpenAI raised $122 billion to grow global infrastructure, amid high costs and 1 billion unmonetized users.

What is GPT-5.4's computer use capability?

GPT-5.4 introduces advanced computer use, enabling research-level math and OS tasks beyond previous models.

What are the economics concerns for OpenAI?

High costs and an $852B valuation with 1B unmonetized users raise caution; the focus is shifting to profitable enterprise solutions.

What leaks mention Spud?

According to leaks, Spud is either GPT-6 or GPT-5.5, with pretraining reportedly complete and a release targeted for Q2 2026. The same leaks mention Claude Conway and others.

How does GPT-5.4 compare in coding?

GPT-5.4 ranks high in multi-turn coding benchmarks against Grok 4.20 and Qwen 3.5, and leads in math with 99.4% on MATH-500.

What are recent OpenAI updates?

The April 2026 release notes cover smaller GPT-5.4 variants aimed at speed and cost, peer-preservation in GPT-5.2, and Copilot CLI enhancements.

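Summary: GPT-5.4 Pro costs $200 and scores 99.4% on MATH-500 and 75% on OSWorld, above the human baseline. GPT-5.4 mini scores 54.4% on SWE-Pro at 2x speed. OpenAI raised $122B at an $852B valuation, with 1B unmonetized users and high costs. GPT-5.2 shows peer-preservation. Shootouts cover Claude 4.7, Nemotron, OpenClaw, Qwen, GLM, Gemma 4, ARC-AGI3, DRACO, and Arena42.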

Updated Apr 8, 2026