model-routing Tutorials, Tools, and Guides

New

Envoy AI Gateway 1.0 Makes LLM Routing an Infrastructure Decision

8 min read

Envoy AI Gateway 1.0 is production-ready. The useful question for builders is when an Envoy-based LLM gateway beats direct SDK calls, LiteLLM, OpenRouter, or a hosted AI gateway.

AI Gateway LLM Infrastructure Model Routing Envoy Developer Tools

New

Fugu Ultra's Frontier Performance Claim, Explained Without the Hype

11 min read

Sakana says Fugu Ultra stands with Fable, Mythos, GPT-5.5, Gemini, and Opus by orchestrating models instead of being one giant model. Here is what the benchmarks show, what is novel, and what still needs proof.

ai-benchmarks ai-models model-routing ai-agents

New

Sakana Fugu and the Case for Not Betting Everything on One Proprietary Model

9 min read

Sakana Fugu makes a timely argument for model routing: frontier performance should come from swappable systems, not a hard dependency on one proprietary API.

model-routing ai-infrastructure ai-models vendor-lock-in

New

Sakana Fugu Ultra: The Model Router Making the Frontier Look Less Proprietary

10 min read

Sakana Fugu Ultra is not just another giant model. It is a learned orchestration layer that routes work across expert models, matches frontier benchmark claims, and makes a serious case for multi-model AI systems.

ai-models model-routing ai-agents open-models

New

The Router Era: Why Not Owning a Frontier Model Became an Advantage

11 min read

No single model wins every task anymore, and the companies that never trained one - Factory, Devin, Perplexity, Cursor, OpenCode - are turning that into a moat. This is how model routing works, why open weights and neoclouds make it cheap, and the honest counter-argument.

ai-models model-routing ai-coding-tools open-weights agents

New

DeepSeek V4 Economics: The Cost-Quality Frontier for Agentic Coding in 2026

9 min read

DeepSeek V4 Pro lands a 63.5 on SWE-bench Verified at $0.435/$0.87 per million tokens, and Flash runs agent inner loops for cents. Here is the worked cost math, the Flash-vs-Pro split, and a clear guide on when to route to DeepSeek instead of a frontier model.

deepseek cost-analysis agentic-coding llm-pricing model-routing open-weights

New

Factory Router, Explained: How Automatic Model Routing Cuts Coding-Agent Spend 20-25%

10 min read

Factory.ai shipped a router that auto-picks the model for each Droid session and fails over across providers. The vendor claims 20-25% lower token spend and 99.9%+ request reliability. Here is what the product actually does, which claims are vendor claims, and whether a router beats DIY routing for your team.

factory-ai model-routing orchestration coding-agents cost-optimization ai-infrastructure

New

'The Orchestration Is the Product': What Perplexity's Aravind Srinivas Sees That the Model Labs Don't

11 min read

Perplexity launched a $200-a-month agent that coordinates 19 models and calls orchestration, not the model, the product. Here is the strategic case for why the durable, defensible layer in AI sits next to the labs, not inside them - and what 'token value per watt per user' actually means for builders.

Perplexity Model Orchestration AI Agents AI Strategy Model Routing

New

OpenRouter Fusion Makes Model Panels Real. Use Them Like Escalation, Not Autopilot

8 min read

OpenRouter Fusion turns multi-model panels into an API feature. The useful lesson is not to run every prompt through more models. It is to define when a task deserves an expensive second opinion.

OpenRouter AI Models Model Routing Developer Tools AI Infrastructure

Factory AI and the Model Routing Era: How Coding Agents Are Learning to Spend Your Tokens Wisely

8 min read

Factory AI's Droid agent surfaces a new competitive front in coding tools: cost-per-completed-task. Here's what their architecture reveals about where the whole industry is heading.

factory-ai coding-agents model-routing droid developer-tools cost-optimization ai-infrastructure

What the 'Notes on DeepSeek' Essay Gets Right About Open-Weights Economics

7 min read

A first-hand visit to DeepSeek HQ reveals something more interesting than benchmark scores: a 300-person company that treats AI as infrastructure, not eschatology - and what that means for API pricing everywhere.

deepseek open-weights ai-economics model-routing developer-tools china-ai

OpenRouter in 2026: Review, Setup, and When Model Routing Pays

8 min read

OpenRouter gives you one API key for 300+ models, automatic fallbacks, and intelligent provider routing. Here is what it actually costs, how to set it up in five minutes, and when you should skip it entirely.

ai-tools api model-routing developer-tools llm

MODEL-ROUTING

AI's Affordability Crisis Is Really an Agent Cost Accounting Problem

Envoy AI Gateway 1.0 Makes LLM Routing an Infrastructure Decision

Fugu Ultra's Frontier Performance Claim, Explained Without the Hype

Sakana Fugu and the Case for Not Betting Everything on One Proprietary Model

Sakana Fugu Ultra: The Model Router Making the Frontier Look Less Proprietary

The Router Era: Why Not Owning a Frontier Model Became an Advantage

DeepSeek V4 Economics: The Cost-Quality Frontier for Agentic Coding in 2026

Factory Router, Explained: How Automatic Model Routing Cuts Coding-Agent Spend 20-25%

'The Orchestration Is the Product': What Perplexity's Aravind Srinivas Sees That the Model Labs Don't

OpenRouter Fusion Makes Model Panels Real. Use Them Like Escalation, Not Autopilot

Factory AI and the Model Routing Era: How Coding Agents Are Learning to Spend Your Tokens Wisely

What the 'Notes on DeepSeek' Essay Gets Right About Open-Weights Economics

OpenRouter in 2026: Review, Setup, and When Model Routing Pays

