10 items
10 posts
AI agents are getting their own computers. Here is how to choose a sandbox architecture: filesystem isolation, network policy, secrets boundaries, snapshots, and when shell access is overkill.
Aharness, LangChain's custom harness pattern, and OpenAI's code-first migration all point to the same next step: agent processes need typed gates, validated evidence, and controlled transitions.
Oak is an early bet that AI coding agents need version control shaped around sessions, virtual workspaces, and token budgets. The idea is risky, but the pressure on Git workflows is real.
The Bayer and Thoughtworks PRINCE case study is a useful reminder that reliable agentic AI comes from context routing, traces, evals, monitoring, and human review, not from a better prompt alone.
As coding agents get easier to delegate to, the scarce resource shifts from code generation to review capacity, CI minutes, environment reliability, and merge discipline.
Hex's data-agent lab shows the practical eval pattern AI teams should copy: compare candidates against stable baselines, keep receipts, and judge changes by task behavior.
GitHub's latest agent workspace trend points at a boring but important primitive: agents need explicit filesystem contracts before they get more tools.
Anthropic's June 15 Agent SDK credit split is not just a pricing tweak. It is a signal that autonomous coding workflows need separate budgets, lanes, and receipts.
DD shipped six paid products in a single day. The thesis is simple: agent infra for small teams. $20 a month each, $50 for the bundle. Here's what we shipped, what's alpha, and what's still being wired.
Ten private tools shipped overnight - observability, skills, hooks, prompts, and evals - aimed at the agent infrastructure gap small teams keep falling into.

New tutorials, open-source projects, and deep dives on coding agents - delivered weekly.