Prompt Caching - Claude Code
Automatic reuse of cached context for substantial cost reduction.
Prompt caching reuses large, stable parts of your prompt across turns so you don't pay the full input price to reprocess them every time.
What it does
Claude Code marks static context - system prompts, CLAUDE.md, loaded files - as cacheable. Subsequent turns that reuse the same prefix pay a fraction of the normal input-token price for the cached portion. This is why per-turn cost in a long session rises much more slowly than the raw context size would suggest.
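To make the cost effect concrete, here is a minimal arithmetic sketch. The function name, the $3 per million input tokens base price, and the 0.1x cached-read multiplier are assumptions chosen for illustration, not figures from this guide; check current pricing for your model. The sketch also ignores any premium charged when the cache is first written.

```python
def turn_input_cost(prefix_tokens, new_tokens, base_price_per_mtok,
                    cached, read_multiplier=0.1):
    """Estimate the input cost in dollars for a single turn.

    prefix_tokens: stable context (system prompt, CLAUDE.md, loaded files)
    new_tokens: tokens added this turn, always billed at the base price
    cached: whether the stable prefix was served from the cache
    read_multiplier: assumed discount factor for cached reads (hypothetical default)
    """
    prefix_rate = base_price_per_mtok * (read_multiplier if cached else 1.0)
    total = prefix_tokens * prefix_rate + new_tokens * base_price_per_mtok
    return total / 1_000_000


# A 100k-token stable prefix plus 1k new tokens, at an assumed $3 per million tokens.
uncached = turn_input_cost(100_000, 1_000, 3.0, cached=False)
cached = turn_input_cost(100_000, 1_000, 3.0, cached=True)
print(f"uncached: ${uncached:.3f}  cached: ${cached:.3f}")
```

Under these assumed numbers the cached turn costs roughly a tenth of the uncached one, because almost all of the input is the reused prefix.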
When to use it
- Any session with meaningful CLAUDE.md or rule files - caching is already on by default.
- Heavy repos where large file reads recur turn after turn.
- Long debugging sessions where you want predictable costs.
- API-integrated workflows where per-turn cost matters.
Gotchas
- Cache hits require the prefix to be byte-identical. Even a one-character CLAUDE.md edit invalidates the cached prefix.
- Cached entries expire - after a long enough gap between turns, the next turn pays full price again.
- Caching is configured per model. Check the model config doc if your numbers look off.
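The first two gotchas can be illustrated with a toy model. This is not how Claude Code or the API implements caching; it is a sketch that assumes a cache keyed by a hash of the exact prefix bytes, with an expiry window that refreshes on each use. The class name `ToyPrefixCache` and the 300-second window are hypothetical.

```python
import hashlib


class ToyPrefixCache:
    """Toy model: a prefix hits only if its exact bytes were seen recently."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl_seconds = ttl_seconds  # assumed expiry window, not an official value
        self._last_used = {}            # prefix hash -> time of last use

    @staticmethod
    def key(prefix: str) -> str:
        # Hashing the raw bytes means any edit, even whitespace, yields a new key.
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def lookup(self, prefix: str, now: float) -> bool:
        """Return True on a cache hit, then record this use (refreshing the entry)."""
        k = self.key(prefix)
        last = self._last_used.get(k)
        hit = last is not None and (now - last) <= self.ttl_seconds
        self._last_used[k] = now
        return hit


cache = ToyPrefixCache(ttl_seconds=300)
print(cache.lookup("rules v1", now=0))     # first use: nothing cached yet
print(cache.lookup("rules v1", now=60))    # same bytes, within the window
print(cache.lookup("rules v1 ", now=120))  # trailing space: different bytes
print(cache.lookup("rules v1", now=500))   # same bytes, but the gap exceeded the window
```

In this model only the second lookup is a hit. The practical takeaway matches the list above: avoid editing CLAUDE.md mid-session when cost matters, and expect a full-price turn after stepping away.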
Official docs: https://code.claude.com/docs/en/model-config.md#prompt-caching-configuration