OpenAI GPT-5.5 GA + agentic coding/math SOTA crushing Claude Opus 4.7 + super app
Key Questions
What is GPT-5.5 and when was it released?
GPT-5.5 is OpenAI's latest large language model, launched on August 14, 2026, described as its most capable and intuitive model yet. It upgrades ChatGPT and Codex, focusing on a new class of intelligence for real work. The release positions OpenAI closer to an AI 'super app'.
What benchmark scores does GPT-5.5 achieve?
GPT-5.5 scores 82.7% on Terminal-Bench 2.0, 39.6% on FrontierMath, 84.9% on GDPval, 58.6% on SWE-Pro, 78.7% on OSWorld, and 85% on ARC-AGI. These results highlight its strengths in agentic coding, math, and autonomous tasks. It outperforms competitors such as Claude Opus 4.7 on Terminal-Bench 2.0.
How does GPT-5.5 compare to Claude Opus 4.7?
GPT-5.5 outperforms Claude Opus 4.7 in agentic coding, scoring 82.7% on Terminal-Bench 2.0, above Claude's result. It also performs strongly across math and coding benchmarks overall. Early-access testers noted its superior capabilities.
What new applications or features does GPT-5.5 enable?
GPT-5.5 supports autonomous apps for work, tax, and gene editing, with improved token efficiency. It builds on GPT-5.4's PC and Codex features. The launch teases a 'super app' integration.
What are the key improvements in GPT-5.5 for coding and math?
GPT-5.5 advances coding, with 58.6% on SWE-Pro and 82.7% on Terminal-Bench 2.0, and math, with 39.6% on FrontierMath. It is praised for solving complex problems better than its predecessors. OpenAI highlights its intelligence for real-world work.
GPT-5.5 launches with Terminal-Bench 2.0 82.7% / FrontierMath 39.6% / GDPval 84.9% / SWE-Pro 58.6% / OSWorld 78.7% / ARC-AGI 85%; autonomous work/tax/gene apps; token efficiency; builds on 5.4 PC/Codex; vs Kimi/Qwen/Claude; super app tease.