Jun 15 - Jun 22, 2026
50 new pieces of content published this week.
As coding agents get easier to delegate to, the scarce resource shifts from code generation to review capacity, CI minutes, environment reliability, and merge discipline.
Codex can point at OpenAI-compatible model providers, local Ollama servers, and internal model proxies. Here is the practical config pattern, the sharp edges, and when to use it.
Hex's data-agent lab shows the practical eval pattern AI teams should copy: compare candidates against stable baselines, keep receipts, and judge changes by task behavior.
Dan Abramov's explainer on ATProto architecture is making the rounds. The core insight: Bluesky's protocol separates hosting from applications in a way that Mastodon-style federation fundamentally cannot. Here's what that means for developers.
Cloudflare shipped wrangler deploy --temporary on June 19, 2026. AI agents can now deploy Workers, D1 databases, and KV stores without browser auth flows. Here is how it works.
The new wrangler deploy --temporary flag creates ephemeral Cloudflare accounts for AI agents. 60-minute deployments, no OAuth, no browser - just deploy and claim later.
GLM-5.2 ships under an MIT license, so it is hosted everywhere - and a few places run it for free or nearly free right now. Here is every way to access Z.ai's open-weights coding model, from OpenCode Go referral credits and Devin to the cheapest per-token routes on OpenRouter, Fireworks, and DeepInfra, plus local Ollama.
New benchmark data shows GPT-5.5 hallucinates 86% of the time when it does not know the answer - versus 28% for the open-weights GLM-5.2. The numbers challenge the assumption that bigger models equal more reliable output.
Modern LLMs now use MoE routing, mixed attention variants, and fused vision encoders. The simple transformer stack is gone - here's what replaced it and why it matters for developers.
Goal, loop, routine. Three verbs, two tools, one hard part. A complete field guide to running agentic loops in Claude Code and Codex: the real commands, the patterns people actually run, and the two failure modes that burn money.
No single model wins every task anymore, and the companies that never trained one - Factory, Devin, Perplexity, Cursor, OpenCode - are turning that into a moat. This is how model routing works, why open weights and neoclouds make it cheap, and the honest counter-argument.
A deep dive into DuckDB's architecture - the columnar storage, vectorized execution, and zero-copy design that let it compete with million-dollar clusters on a laptop.
Most developers only know .gitignore, but Git offers two other ignore mechanisms for local workflows and machine-wide patterns. Here's when to use each.
GitHub's Agent Finder discovers and invokes Claude, Codex, MCP servers, and skills automatically. Here is how the new ARD specification changes AI coding tool integration.
The MCP 2026-07-28 release candidate drops sessions entirely. Here is what changes, what breaks, and how to migrate your MCP servers before the July 28 deadline.
MCP's new enterprise-managed authorization flow is not just less login friction. It moves agent tool access into identity, policy, and audit systems enterprises already understand.
Java's most anticipated performance feature is finally landing. Value classes eliminate object identity overhead and enable dense memory layouts - here's what changes.
MCP's new Enterprise-Managed Authorization removes per-user OAuth friction. Anthropic, Okta, Figma, and Linear ship centralized auth for AI agent tooling.
A YC W25 startup open-sources CADAM, a browser-based tool that converts natural language to parametric OpenSCAD models. HN debate: is text-to-CAD genuinely useful or just another demo?
Auto-installing tree-sitter grammars, built-in markdown mode, window layout commands, and more - the upcoming Emacs release absorbs features that used to require external packages.
Alex Ellis shares real production experience running local LLMs: $12k hardware investment, 2-3 month ROI, and why treating local models as Opus substitutes misses the point entirely.
JetBrains released Mellum2 on June 2, 2026 - a 12B MoE model with only 2.5B active parameters per token. Here is how to run it locally, when to use it, and where it fits in your AI coding stack.
Midjourney, the company that makes AI pictures, just announced a full-body ultrasonic scanner and a spa chain to put it in. It sounds like a non sequitur. It is not. Here is what was actually announced, why a generative-image lab is suddenly building medical hardware, and the sharpest skeptic and believer takes from Hacker News on whether any of it survives contact with the FDA.
The Transformer co-creator leaves Google DeepMind for OpenAI just two years after Google paid $2.7 billion to bring him back from Character.AI.
A $500M accidental Claude bill and an open-weights model beating GPT-5.5 at one-sixth the cost point to the same conclusion: the margin is moving to the layer that decides when to use which model for what. Here is how routing and orchestration differ, and how to cut your model spend.
A hands-on, beginner-friendly walkthrough of building an AI agent with Vercel eve: scaffold the project, define an agent and a typed tool with defineTool, run it locally, call it through the durable session and stream API, and deploy to Vercel Functions.
Stop the approval-fatigue prompts without going full YOLO mode. A hands-on guide to Claude Code's permission system - settings.json scopes, allow/deny/ask rules, tool specifiers, and the headless flags that actually matter.
A company accidentally spent $500M on Claude in one month. Uber torched its whole 2026 AI budget by April. The fix is not less AI - it is guardrails. Here is the playbook: caps, alerts, gateway spend limits, model routing, prompt caching, and approval workflows.
Cohere shipped its first developer-facing model on June 9, 2026. North Mini Code is a 30B mixture-of-experts coding model with 3B active parameters, Apache 2.0 weights, and a deployment footprint of a single H100. Here is what it actually offers and where the open questions are.
At its Compile conference, Cursor announced Origin: a Git-compatible code hosting platform designed around AI agents as first-class users. Built on its Graphite acquisition, it promises agent-driven merge conflict resolution, stacked PRs, and MCP-extensible automation. Here is what was actually announced, what is still a waitlist promise, and why it matters for developers.
DeepSeek V4 Pro lands a 63.5 on SWE-bench Verified at $0.435/$0.87 per million tokens, and Flash runs agent inner loops for cents. Here is the worked cost math, the Flash-vs-Pro split, and a clear guide on when to route to DeepSeek instead of a frontier model.
Epic Games open-sourced Lore, a centralized version control system designed for binary-heavy game projects. It uses Merkle trees, on-demand file hydration, and native chunked storage to handle terabyte-scale repos that Git struggles with.
At Vercel Ship 26 in London on June 17, 2026, Vercel shipped a wave of agent-era tooling: the open-source eve agent framework, Vercel Drop for drag-and-drop deploys with no Git or CLI, spend caps for AI Gateway API keys, and the HarnessAgent API in AI SDK 7 that unifies Claude Code, Codex, and Pi behind one interface.
Factory.ai shipped a router that auto-picks the model for each Droid session and fails over across providers. The vendor claims 20-25% lower token spend and 99.9%+ request reliability. Here is what the product actually does, which claims are vendor claims, and whether a router beats DIY routing for your team.
Gemini CLI stops working June 18, 2026. Here is exactly what to do: install Antigravity CLI, migrate your config, update your scripts, and avoid the silent MCP failure that breaks tool calls.
On June 2, 2026, GitHub made the Copilot SDK generally available. It exposes the same agent runtime behind Copilot - planning, tool calls, file edits, streaming, MCP - across TypeScript, Python, Go, .NET, Rust, and Java. Here is what changed at GA and what it means for builders.
Z.ai's GLM-5.2 lands as a 753B open-weights coding model that beats GPT-5.5 on SWE-bench Pro for roughly one-sixth the per-token cost. Here is the real cost math, a worked cost-per-task example, and a when-to-use-which decision guide.
A data-rich, source-cited comparison of the three open-weights coding models that matter in 2026: GLM-5.2, DeepSeek V4, and Qwen3. Benchmark table, per-token pricing, context windows, self-host footprint, and a clear pick-X-if decision matrix.
On June 17, 2026, attackers hijacked a dormant Mastra contributor account and pushed malicious versions of 140+ packages. The payload steals crypto wallets, browser data, and cloud credentials. Here is what happened, how to check your lockfile, and what to do if you installed an affected version.
On June 16, 2026, Microsoft's Work IQ APIs reached general availability - a workplace intelligence layer that hands agents pre-assembled, permission-trimmed Microsoft 365 context instead of raw Graph calls. Here is what the four domains, three protocols, and consumption pricing mean for developers building enterprise agents.
A code-heavy field guide to model routing. Real, runnable-style configs for tiering tasks by complexity, routing simple work to open-weights, reserving frontier models for hard reasoning, building failover chains, and keeping prompt caches warm with OpenRouter, LiteLLM, and Factory Router.
Databricks open-sourced Omnigent, a meta-harness that sits above individual agent CLIs so your sessions, policies, and skills are not locked inside any single tool. Here is what it does, how to install it, and where it fits if you already run Claude Code and Codex.
OpenAI's mid-June 2026 Codex drop brings Computer Use to the EEA, UK, and Switzerland and adds selective Claude Code imports plus managed Bedrock auth to the CLI. Here is what actually shipped, verified against the changelog.
Perplexity launched a $200-a-month agent that coordinates 19 models and treats orchestration, not the model, as the product. Here is the strategic case for why the durable, defensible layer in AI sits next to the labs, not inside them - and what 'token value per watt per user' actually means for builders.
The IETF published RFC 10008 defining a new HTTP QUERY method - GET with a request body. It is safe, idempotent, cacheable, and solves the longstanding problem of complex queries hitting URL length limits.
Open weights are free to download, but inference is not free to run. Here is the honest break-even math on when self-hosting GLM-5.2, DeepSeek V4, or Llama beats paying per-token API prices - GPU rental and ownership costs, real throughput, utilization, the crossover in tokens per month, and the hidden ops bill nobody budgets for.
Vercel launched eve at Ship 26, an open-source agent framework it calls Next.js for agents. You define each agent as files under an agent/ directory, and eve compiles it into a production app on Vercel Functions with durable execution, sandboxes, approvals, subagents, and evals built in.
Cursor Automations lets AI agents run in the background based on triggers, not prompts. Here is how to set them up, configure triggers, and integrate them into your workflow.
Z.ai shipped GLM-5.2 in mid-June with a usable 1M-token context window, two thinking-effort levels, and open weights now released under an MIT license. Here is the setup guide for Claude Code, the pricing breakdown, and what to test before the benchmarks arrive.
OpenRouter Fusion turns multi-model panels into an API feature. The useful lesson is not to run every prompt through more models. It is to define when a task deserves an expensive second opinion.