Blog

Insights on production AI.

Practical knowledge from the field. No hype, no fluff, just lessons learned from shipping enterprise AI systems.

Case studies
Automation · 8 min read

Better Reasoning, Worse Tool Use: The Hidden Tradeoff in Capable Agents

Reinforcement learning that sharpens an LLM's reasoning also raises tool hallucination proportionally. That's a causal finding, peer-reviewed at ACL 2026. Here's the architectural fix and what to add to your eval harness today.

June 9, 2026
Governance · 9 min read

Audit logs for AI: the contract that survives a compliance review

Six months after a bad inference, a regulator will ask what model decided, on which input, on whose behalf, with what downstream effect. Most teams cannot answer. Here is the schema we ship on every regulated build.

May 13, 2026
AI Ops · 10 min read

Trust is a UX layer, not a model property

The cleanest data point on AI trust failure is from a study where the model never failed. 758 consultants, GPT-4, a poisoned task: 84% right without it, 60 to 70% right with it. The surface owns trust. Here are the four design moves we ship on every build.

May 13, 2026
AI Ops · 10 min read

The production AI checklist for 2026: from demo to deployment

Most AI projects die between the POC demo and the on-call rotation. Not because the model is wrong, but because the operating discipline around it was never built. Here is the checklist DAD applies before we call a build shippable.

May 13, 2026
Automation · 10 min read

Five RAG failure modes we still find in 2026 audits

RAG is a mature pattern. The systems we audit are not. Five failure modes keep showing up at companies that already shipped a v1, and each has a symptom, a fix, and a cheap measurement.

May 13, 2026
AI Ops · 9 min read

Your agent passes the benchmark. It will fail in production.

Pass@1 is a single coin flip dressed up as an SLA. Early-2026 research puts numbers on the gap, and on what to instrument instead before you ship.

May 12, 2026

