Update (March 2026): OpenAI has since released GPT-5.3 and GPT-5.4 with significant improvements. This article covers the original GPT-5 launch.
GPT-5 introduces a fundamentally different approach to inference. Instead of forcing developers to manually configure reasoning parameters, the model operates as a unified system with real-time routing based on query complexity.
For model-selection context, compare this with "OpenAI Codex: Cloud AI Coding With GPT-5.3" and "OpenAI vs Anthropic in 2026 - Models, Tools, and Developer Experience". The useful question is not only benchmark quality, but where the model fits in a real developer workflow.
Tell it to "think hard" about a difficult problem, and it allocates additional compute. Ask a simple conversational question, and it responds immediately without burning tokens on unnecessary test-time compute. This dynamic routing eliminates the guesswork of selecting between fixed reasoning modes while keeping costs predictable.
OpenAI optimized GPT-5 for practical utility, not just leaderboard scores. The focus areas (writing, coding, and health) are ChatGPT's most common use cases.
Hallucination rates are down. Instruction following is tighter. But the real difference shows up in qualitative output.
The model demonstrates measurable improvements in front-end development. During demonstrations, GPT-5 generated complete interactive applications: a physics-based ball-rolling game, a pixel art canvas, a typing trainer, a drum simulator, and a lofi music environment. One standout example was a Three.js-style castle defense game with interactive balloon targeting, built entirely from a text prompt within Cursor.
When asked about cancer risk factors, previous models like o3 responded with dry tables and bullet-point citations. GPT-5 leads with empathy: "I'm sorry you're dealing with this worry. Many people have the same question." The information is equally accurate, but the delivery respects the emotional weight of the query.

Artificial Analysis' aggregate Intelligence Index, which combines MMLU, GPQA Diamond, Humanity's Last Exam, and LiveCodeBench, places GPT-5 (high mode) at state-of-the-art. Even GPT-5 medium outperforms the best competing models.
The efficiency curve is where it gets interesting. GPT-5 low ranks above Claude 4 Sonnet Thinking and approaches Qwen 3 235B, while using significantly fewer tokens. When plotting intelligence against output tokens consumed, GPT-5 dominates the curve, delivering superior results at lower cost and latency than Grok 4.

GPT-5 takes best-in-class status on MMLU-Pro, Humanity's Last Exam, AMIE medical evaluations, long-context tasks, and instruction following. GPQA Diamond still belongs to Grok 4. On LiveCodeBench, GPT-5 trails o4-mini (high) and Grok.
LM Arena human preference data shows GPT-5 beating Gemini 2.5 Pro on text responses and dominating WebDev Arena against Gemini 2.5 Pro, DeepSeek R1, and Claude 4 Opus.
ARC-AGI scores put GPT-5 high at 65.7 versus Grok 4's 66.7, but GPT-5 achieves this at roughly half the cost per task.
The GPT-5 family launches with four variants:
| Model | Input | Output | Use Case |
|---|---|---|---|
| GPT-5 | $1.25/M | $10/M | Flagship performance |
| GPT-5 Mini | $0.25/M | $2/M | Balanced speed and capability |
| GPT-5 Nano | Lower cost | Lower cost | Latency-sensitive applications |
| GPT-5 Chat | Optimized | Optimized | Conversational interfaces |
All four support multimodal inputs (text and image), function calling, structured outputs, and streaming. The flagship model adds predicted outputs for efficient code refactoring and text editing workflows.
Context window is 400,000 tokens across the board, with 128,000 max output tokens. Pricing undercuts Grok 4 and Claude 4 Sonnet Thinking ($3/$15 per million) while matching Gemini 2.5 Pro's rates with superior performance.
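To make the pricing concrete, here is a minimal sketch of the per-request arithmetic using the two rows of the table that list actual figures. The `Price` class, the `estimate_cost` helper, and the sample token counts are illustrative assumptions, not part of any OpenAI SDK. GPT-5 Nano and GPT-5 Chat are left out because the table gives no numbers for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Figures copied from the pricing table in this article.
PRICES = {
    "GPT-5": Price(1.25, 10.00),
    "GPT-5 Mini": Price(0.25, 2.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    price = PRICES[model]
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload (an assumption): 200k tokens in, 20k tokens out.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 20_000):.2f}")
```

Because output tokens cost eight times as much as input tokens on both tiers, long generations tend to drive the bill more than large prompts do.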
Cognition's Junior Dev Eval, the benchmark behind the Devin coding agent, shows GPT-5 outperforming Sonnet and GPT-4.1 on exploration, planning, and code execution.
The Cursor CEO publicly called it the best coding model they've used to date. During OpenAI's livestream, the model resolved a GitHub issue in real-time. Both Windsurf and Cursor are offering GPT-5 access to users immediately.

GPT-5 is rolling out to all ChatGPT users today. Plus subscribers receive expanded usage limits. Pro subscribers unlock GPT-5 Pro, the equivalent of API high mode, for extended reasoning on complex problems.
GPT-5 and Claude 4 (Opus, Sonnet) represent different design philosophies. GPT-5 leads on coding benchmarks, front-end development, and multimodal tasks. Claude 4 Opus excels at long-form writing, nuanced reasoning, and tasks requiring extended context. For pure coding performance in tools like Cursor, GPT-5 edges ahead. For agentic workflows with complex instructions, Claude often follows directions more reliably.
GPT-5 flagship costs $1.25 per million input tokens and $10 per million output tokens. GPT-5 Mini runs at $0.25/$2 per million. This undercuts Grok 4 and Claude 4 Sonnet Thinking ($3/$15) while delivering competitive or superior performance. ChatGPT Plus subscribers get GPT-5 access included; Pro subscribers unlock GPT-5 Pro with extended reasoning.
GPT-5 supports a 400,000-token context window with up to 128,000 output tokens. That is enough for complex codebases, long documents, and multi-file analysis without chunking.
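A quick way to reason about those limits is a pre-flight budget check. The sketch below assumes the 400,000-token window is shared between the prompt and the response; the article does not spell that out, so verify it against the API documentation. `fits_in_context` is a hypothetical helper, and real token counts should come from a tokenizer, not a guess.

```python
CONTEXT_WINDOW = 400_000  # total tokens, as stated in the article
MAX_OUTPUT = 128_000      # max output tokens, as stated in the article


def fits_in_context(prompt_tokens: int, requested_output: int) -> bool:
    """Check a request against both limits.

    Assumption: prompt and output share the same context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW


if __name__ == "__main__":
    print(fits_in_context(250_000, 100_000))  # within both limits
    print(fits_in_context(300_000, 128_000))  # exceeds the shared window
```

Under that assumption, a request that asks for the full 128,000 output tokens leaves 272,000 tokens for the prompt.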
GPT-5, GPT-5 Mini, GPT-5 Nano, and GPT-5 Chat are all available via the OpenAI API. All variants support multimodal inputs (text and image), function calling, structured outputs, and streaming. The flagship model adds predicted outputs for efficient code refactoring.
Cursor integrated GPT-5 on launch day, and the Cursor CEO called it the best coding model the team has used to date. GPT-5 is available as a model option in Cursor settings, and Windsurf also offers GPT-5 access.
The GPT-5 name signals more than an incremental update. Compared with GPT-4 Turbo and GPT-4o, the unified inference architecture with dynamic reasoning routing is a larger leap than a typical point release.
GPT-5 matches Gemini 2.5 Pro's pricing ($1.25/$10 per million tokens for flagship) while outperforming it on most benchmarks. LM Arena human preference data shows GPT-5 beating Gemini 2.5 Pro on both text responses and WebDev tasks. Gemini retains advantages in certain multimodal scenarios and Google ecosystem integration.
GPT-5 Pro is the extended reasoning mode available to ChatGPT Pro subscribers. It allocates additional compute for complex problems, equivalent to the API's "high" reasoning mode. Standard GPT-5 dynamically routes between reasoning modes based on query complexity, while GPT-5 Pro forces maximum reasoning allocation.
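For API users, pinning the reasoning level instead of leaving it to the router might look like the sketch below. It only builds a request payload and makes no network call. The field names (`model`, `input`, `reasoning.effort`) are modeled on OpenAI's Responses API, and the `gpt-5` model ID is an assumption, so check both against the current API reference. `build_request` is a hypothetical helper.

```python
import json

# Effort levels named in this article (low, medium, high).
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_request(prompt: str, effort: str = "medium",
                  model: str = "gpt-5") -> dict:
    """Build a request body with an explicit reasoning effort.

    Assumption: the payload shape follows the Responses API; nothing is sent.
    """
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},
    }


if __name__ == "__main__":
    payload = build_request("Find the race condition in this queue.", "high")
    print(json.dumps(payload, indent=2))
```

With the official Python SDK, a payload like this would typically be unpacked into `client.responses.create(**payload)`, though that call is not exercised here.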