Update (March 2026): OpenAI has since released GPT-5.3 and GPT-5.4 with significant improvements. This article covers the original GPT-5 launch.
GPT-5 introduces a fundamentally different approach to inference. Instead of forcing developers to manually configure reasoning parameters, the model operates as a unified system with real-time routing based on query complexity.
Tell it to "think hard" about a difficult problem, and it allocates additional compute. Ask a simple conversational question, and it responds immediately without burning tokens on unnecessary test-time compute. This dynamic routing eliminates the guesswork of selecting between fixed reasoning modes while keeping costs predictable.
OpenAI optimized GPT-5 for practical utility, not just leaderboard scores. The focus areas (writing, coding, and health) represent ChatGPT's most common use cases.
Hallucination rates are down. Instruction following is tighter. But the real difference shows up in qualitative output.
The model demonstrates measurable improvements in front-end development. During demonstrations, GPT-5 generated complete interactive applications: a physics-based ball-rolling game, a pixel art canvas, a typing trainer, a drum simulator, and a lofi music environment. One standout example was a Three.js-style castle defense game with interactive balloon targeting, built entirely from a text prompt within Cursor.
When asked about cancer risk factors, previous models like o3 responded with dry tables and bullet-point citations. GPT-5 leads with empathy: "I'm sorry you're dealing with this worry. Many people have the same question." The information is equally accurate, but the delivery respects the emotional weight of the query.

Artificial Analysis' aggregate Intelligence Index, which combines benchmarks such as MMLU-Pro, GPQA Diamond, Humanity's Last Exam, and LiveCodeBench, places GPT-5 (high mode) at state of the art. Even GPT-5 medium outperforms the best competing models.
The efficiency curve is where it gets interesting. GPT-5 low ranks above Claude 4 Sonnet Thinking and approaches Qwen 3 235B, while using significantly fewer tokens. When plotting intelligence against output tokens consumed, GPT-5 dominates the curve, delivering superior results at lower cost and latency than Grok 4.
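The high, medium, and low labels above refer to reasoning-effort settings. As a minimal sketch, assuming the API accepts a `reasoning` object with an `effort` field (check OpenAI's current documentation before relying on this), the arguments for each mode could be assembled as follows. The helper name `build_gpt5_request` is hypothetical, the model identifier `gpt-5` is an assumption, and nothing here sends a request.

```python
from typing import Any

# Effort levels named in this article; the live API may accept others.
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_gpt5_request(prompt: str, effort: str = "medium") -> dict[str, Any]:
    """Build keyword arguments for a request (hypothetical helper, no network call)."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "gpt-5",                 # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},  # assumed parameter shape
    }


# A cheap, fast request for a simple question...
quick = build_gpt5_request("What is the capital of France?", effort="low")
# ...and a higher-effort request for a harder problem.
deep = build_gpt5_request("Find the race condition in this scheduler.", effort="high")
```

If the assumed parameter shape holds, a dictionary like `deep` could be unpacked into a client call, which keeps the choice of effort level in one place and easy to change per request.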

GPT-5 takes best-in-class status on MMLU-Pro, Humanity's Last Exam, AIME math evaluations, long-context tasks, and instruction following. GPQA Diamond still belongs to Grok 4. On LiveCodeBench, GPT-5 trails o4-mini (high) and Grok 4.
LM Arena human preference data shows GPT-5 beating Gemini 2.5 Pro on text responses and dominating WebDev Arena against Gemini 2.5 Pro, DeepSeek R1, and Claude 4 Opus.
ARC-AGI scores put GPT-5 high at 65.7 versus Grok 4's 66.7, but GPT-5 achieves this at roughly half the cost per task.
The GPT-5 family launches with four variants:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Use Case |
|---|---|---|---|
| GPT-5 | $1.25/M | $10/M | Flagship performance |
| GPT-5 Mini | $0.25/M | $2/M | Balanced speed and capability |
| GPT-5 Nano | Lower cost | Lower cost | Latency-sensitive applications |
| GPT-5 Chat | Optimized | Optimized | Conversational interfaces |
All four support multimodal inputs (text and image), function calling, structured outputs, and streaming. The flagship model adds predicted outputs for efficient code refactoring and text editing workflows.
Context window is 400,000 tokens across the board, with 128,000 max output tokens. Pricing undercuts Grok 4 and Claude 4 Sonnet Thinking ($3/$15 per million) while matching Gemini 2.5 Pro's rates with superior performance.
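To make the pricing concrete, here is a rough sketch of a per-request cost estimate using only the figures quoted above. The function names and the dictionary keys are illustrative, and the context check assumes that input and output tokens share the 400,000-token window, which the article does not state explicitly.

```python
# USD per 1M tokens, copied from the pricing table above.
# Nano and Chat are omitted because the table gives no numeric prices for them.
PRICES = {
    "gpt-5": {"input": 1.25, "output": 10.00},
    "gpt-5-mini": {"input": 0.25, "output": 2.00},
}

CONTEXT_WINDOW = 400_000     # tokens, as quoted above
MAX_OUTPUT_TOKENS = 128_000  # tokens, as quoted above


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check the request against the quoted limits (shared window is an assumption)."""
    return (
        output_tokens <= MAX_OUTPUT_TOKENS
        and input_tokens + output_tokens <= CONTEXT_WINDOW
    )


# Example: a 100,000-token prompt with a 5,000-token answer.
flagship_cost = estimate_cost("gpt-5", 100_000, 5_000)   # 0.125 + 0.05 = 0.175 USD
mini_cost = estimate_cost("gpt-5-mini", 100_000, 5_000)  # 0.025 + 0.01 = 0.035 USD
```

For this example the flagship model costs about five times as much as Mini, so the choice of tier matters most for workloads with large prompts.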
Cognition's Junior Dev Eval, the benchmark behind the Devin coding agent, shows GPT-5 outperforming Sonnet and GPT-4.1 on exploration, planning, and code execution.
Cursor's CEO publicly called it the best coding model the team has used to date. During OpenAI's livestream, the model resolved a GitHub issue in real time. Both Windsurf and Cursor are offering GPT-5 access to users immediately.

GPT-5 is rolling out to all ChatGPT users today. Plus subscribers receive expanded usage limits. Pro subscribers unlock GPT-5 Pro, the equivalent of API high mode, for extended reasoning on complex problems.