Frontier Model API Pricing, June 2026: Claude vs OpenAI vs Gemini vs DeepSeek

Last updated: June 11, 2026

Sticker prices for model APIs have never been this spread out. At the top of the June 2026 table, GPT-5.5-pro charges $180 per million output tokens. At the bottom, DeepSeek V4 Flash charges $0.28. That is a 643x gap between two models you can call with nearly identical OpenAI-style request bodies.

This post is the raw API rate card: every current frontier and workhorse model from Anthropic, OpenAI, Google, and DeepSeek, with every number pulled from the live first-party pricing page and stamped with a verification date. If you are pricing coding tool subscriptions like Claude Code or Cursor plans instead, that is a different market with different math - see the companion post on AI coding tool pricing.

Three structural observations matter more than any single row: the DeepSeek output floor, Anthropic dropping its long-context premium while Gemini kept tiering, and the Claude tokenizer change that quietly skews per-MTok comparisons.

The Cross-Provider Price Table

All prices in USD per million tokens (MTok), standard synchronous API, verified June 11, 2026 against each provider's live pricing page (linked per section below).

Model	Input	Cached input read	Output
Claude Fable 5	$10.00	$1.00	$50.00
Claude Opus 4.8	$5.00	$0.50	$25.00
Claude Sonnet 4.6	$3.00	$0.30	$15.00
Claude Haiku 4.5	$1.00	$0.10	$5.00
GPT-5.5-pro	$30.00	n/a	$180.00
GPT-5.5	$5.00	$0.50	$30.00
GPT-5.4	$2.50	$0.25	$15.00
GPT-5.4-mini	$0.75	$0.075	$4.50
GPT-5.4-nano	$0.20	$0.02	$1.25
Gemini 3.1 Pro Preview	$2.00 (over 200K: $4.00)	$0.20 (over 200K: $0.40) plus storage	$12.00 (over 200K: $18.00)
Gemini 3.5 Flash	$1.50	$0.15 plus storage	$9.00
Gemini 3 Flash Preview	$0.50	$0.05 plus storage	$3.00
DeepSeek V4 Pro	$0.435	$0.003625	$0.87
DeepSeek V4 Flash	$0.14	$0.0028	$0.28

Source pages, all verified June 11, 2026:

Claude: platform.claude.com pricing
OpenAI: developers.openai.com API pricing
Gemini: ai.google.dev Gemini API pricing
DeepSeek: api-docs.deepseek.com pricing

A few quick reads off the table. DeepSeek V4 Flash output at $0.28 is roughly 178x cheaper than Claude Fable 5 output at $50, and roughly 107x cheaper than GPT-5.5 at $30 (all verified June 11, 2026 on the pages above). Claude Opus 4.8 and GPT-5.5 are at parity on input at $5, with Claude cheaper on output ($25 vs $30). And Gemini caching is the odd one out: it bills a per-token cache rate plus an hourly storage fee ($1.00 per MTok per hour on the Flash models, $4.50 on 3.1 Pro Preview), so a cache you hold but rarely hit can cost more than no cache.

Three Things the Table Does Not Tell You

The Claude tokenizer makes per-MTok prices understate cost

Anthropic's pricing page carries a note that is easy to skim past: Opus 4.7 and later models, including Fable 5, use a new tokenizer that "may use up to 35% more tokens for the same fixed text" (Anthropic pricing docs, verified June 11, 2026). The models overview puts the typical figure at roughly 30% (verified June 11, 2026).

This matters for cross-provider comparisons. Fable 5 versus Opus 4.8 is apples to apples, since both use the new tokenizer. But if you compare either against Sonnet 4.5-era baselines, or against another provider using your old token counts, the same prompt now consumes up to a third more billable tokens. The honest comparison requires re-counting your actual prompts with Anthropic's token counting endpoint, not multiplying old counts by new rates. For workload-level math, see Fable 5 production cost modeling.

Anthropic dropped the long-context premium, Gemini did not

Claude Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 now include the full 1M token context window at standard pricing. Anthropic's pricing page states it plainly: a 900K-token request bills at the same per-token rate as a 9K-token request, and caching and batch discounts apply at standard rates across the full window (verified June 11, 2026).

Gemini still tiers. Gemini 3.1 Pro Preview doubles input from $2 to $4 and lifts output from $12 to $18 once a prompt crosses 200K tokens, and the cache rate doubles too (Gemini API pricing, verified June 11, 2026). Gemini 2.5 Pro tiers the same way at the same boundary. So the headline "Gemini Pro is cheaper than Sonnet" claim flips for genuinely long-context work: above 200K tokens, Gemini 3.1 Pro input ($4) exceeds Sonnet 4.6 input ($3), which does not tier at all. OpenAI tiers as well, framed as separate long-context rates: GPT-5.5 rises from $5 input / $30 output to $10 / $45 in the long-context band, and GPT-5.5-pro goes from $30 / $180 to $60 / $270 (OpenAI pricing, verified June 11, 2026). Of the four providers, only Anthropic and DeepSeek charge one flat rate across the full window.

Caching mechanics differ more than the discounts do

All four providers discount aggressively, but the shapes are different:

Anthropic: cache reads at 0.1x base input, cache writes at 1.25x (5-minute TTL) or 2x (1-hour TTL). Batch API is a flat 50% off input and output (pricing docs, verified June 11, 2026).
OpenAI: cached input at 0.1x on most models ($0.50 on GPT-5.5's $5 input), no cached rate listed for the pro models, batch at 50% off (OpenAI pricing, verified June 11, 2026).
Google: per-token cache rate plus an hourly storage fee, batch at 50% off (Gemini pricing, verified June 11, 2026).
DeepSeek: the most aggressive cache discount on the table. V4 Flash cache hits bill at $0.0028 versus $0.14 for misses, a 0.02x multiplier (DeepSeek pricing, verified June 11, 2026).

If your workload is a chat or agent loop that resends a large stable prefix every turn, effective cost is dominated by the cache read rate, not the headline input rate. That reshuffles the table substantially in DeepSeek's and Anthropic's favor.

Get the weekly deep dive

Tutorials on Claude Code, AI agents, and dev tools - delivered free every week.

From the archive

How to Use Claude Fable 5: Every Access Path Explained

Jun 11, 2026 • 8 min read

Is Claude Fable 5 Slow? Latency in Practice, and When It Matters

Jun 11, 2026 • 8 min read

Managing a Fleet of Claude Agents: A Practical Guide

Jun 11, 2026 • 10 min read

Migrating Off Retired GPT Models in 2026: A Working Checklist

Jun 11, 2026 • 10 min read

Head-to-Head by Tier

Matching models by intended tier rather than by name, verified June 11, 2026:

Tier	Cheapest	Mid	Premium
Max capability	-	Claude Fable 5 ($10 / $50)	GPT-5.5-pro and GPT-5.4-pro ($30 / $180)
Frontier workhorse	Gemini 3.1 Pro Preview ($2 / $12 under 200K)	Claude Opus 4.8 ($5 / $25)	GPT-5.5 ($5 / $30)
Production mid-tier	DeepSeek V4 Pro ($0.435 / $0.87)	GPT-5.4 ($2.50 / $15)	Claude Sonnet 4.6 ($3 / $15)
Fast and cheap	DeepSeek V4 Flash ($0.14 / $0.28)	GPT-5.4-nano ($0.20 / $1.25)	Claude Haiku 4.5 ($1 / $5)

The striking row is max capability. Fable 5 at $10/$50 is the cheapest model in its claimed tier by a wide margin: GPT-5.5-pro costs 3x more on input and 3.6x more on output. Whether Fable 5 and GPT-5.5-pro actually belong in the same tier is a benchmark question, not a pricing one, but the price positioning is unambiguous. There is a deeper dive in the Fable 5 cost-per-task analysis.

Decision Guide by Persona

Side projects and prototypes. DeepSeek V4 Flash ($0.14 / $0.28) or Gemini 3 Flash Preview ($0.50 / $3.00, with a free tier) are hard to argue with. GPT-5.4-nano is the cheapest input rate among the closed providers at $0.20. At these prices, model choice is a quality decision, not a budget one. The open-weights angle adds another dimension - see notes on DeepSeek's open-weights economics.

Production apps with steady traffic. The mid-tier triangle is GPT-5.4 ($2.50 / $15), Sonnet 4.6 ($3 / $15), and Gemini 3.1 Pro Preview ($2 / $12 under 200K). Identical or near-identical output rates mean the decision comes down to input volume, caching shape, and evals. If your prompts regularly exceed 200K tokens, Gemini's tier flip moves it from cheapest to most expensive of the three.

Agentic and long-horizon workloads. Output tokens dominate agent spend, and output is where the providers diverge most. Opus 4.8 ($25 output) undercuts GPT-5.5 ($30) at the frontier workhorse tier. Fable 5 at $50 output only pays off when its quality reduces retries and total tokens. Batch APIs (50% off at Anthropic, OpenAI, and Google) are the easiest structural saving for any agent pipeline that is not latency-sensitive.

Teams routing across providers. With four providers publishing OpenAI-compatible or near-compatible APIs and a 643x output price spread, routing cheap-by-default with escalation on failure is increasingly the rational architecture. The LLM router comparison covers the tooling.

Compliance-constrained teams. Residency costs extra everywhere: Anthropic applies a 1.1x multiplier for US-only inference on Opus 4.6, Sonnet 4.6, and later models, and OpenAI charges a 10% uplift for regional data-residency processing on models released on or after March 5, 2026 (both verified June 11, 2026 on the pricing pages above).

When to Skip the Switch

Cheaper sticker prices do not automatically mean cheaper bills, and there are real reasons to stay put:

Your cache is tuned. Prompt caches are provider-scoped and model-scoped. Switching providers means cold caches, re-tuned breakpoints, and a transition period at full input rates that can eat months of theoretical savings.
Your spend is output-dominated and quality-sensitive. A cheaper model that needs two attempts costs more than an expensive one that needs one. Measure completion rates per task, not dollars per token.
You depend on provider-specific features. Anthropic's 1-hour cache TTL, Gemini's explicit cache storage model, and DeepSeek's dual OpenAI- and Anthropic-format endpoints are not interchangeable. Migration cost is real engineering time.
DeepSeek's legacy endpoints are mid-deprecation. The deepseek-chat and deepseek-reasoner names retire on July 24, 2026 in favor of the V4 model IDs (verified June 11, 2026), so budget a small migration even if you stay.

If none of those apply and your evals show parity, the spread in this table is too large to ignore.

FAQ

What is the cheapest frontier LLM API in June 2026?

DeepSeek V4 Flash, at $0.14 per million input tokens (cache miss) and $0.28 per million output tokens, verified June 11, 2026 on DeepSeek's official pricing page. Cache hits drop input to $0.0028. Among US closed providers, GPT-5.4-nano has the lowest input rate at $0.20 and Gemini 3 Flash Preview has a free tier.

Does Claude charge extra for long context in 2026?

No. Claude Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 include the full 1M token context window at standard per-token rates, with no premium above 200K tokens (verified June 11, 2026). Gemini 3.1 Pro Preview and Gemini 2.5 Pro still charge higher rates for prompts above 200K tokens.

How much more expensive is Claude Fable 5 than GPT-5.5?

Fable 5 costs $10 input / $50 output per MTok versus GPT-5.5 at $5 / $30, so 2x on input and about 1.67x on output. Against GPT-5.5-pro ($30 / $180), Fable 5 is actually the cheaper max-capability option by 3x on input and 3.6x on output. All figures verified June 11, 2026.

Why do per-MTok prices understate Claude costs?

Claude models from Opus 4.7 onward, including Fable 5, use a new tokenizer that can produce up to 35% more tokens for the same text compared with older Claude models, per Anthropic's pricing documentation (verified June 11, 2026). Token counts measured on older models do not transfer, so per-MTok rate comparisons against pre-4.7 baselines undercount real spend.

Do all providers offer batch discounts?

Anthropic, OpenAI, and Google all offer a 50% batch discount on input and output tokens (verified June 11, 2026). DeepSeek's pricing page lists no batch tier, but its standard rates sit below the other providers' batch rates in most tiers anyway.

Sources

https://platform.claude.com/docs/en/about-claude/pricing (accessed June 11, 2026)
https://platform.claude.com/docs/en/about-claude/models/overview.md (accessed June 11, 2026)
https://developers.openai.com/api/docs/pricing (accessed June 11, 2026)
https://ai.google.dev/gemini-api/docs/pricing (accessed June 11, 2026)
https://api-docs.deepseek.com/quick_start/pricing (accessed June 11, 2026)