Agents Can Act. That Doesn't Make Them Governable.

Protocols help agents connect. Nothing yet makes them accountable for the work.

Most agent systems can call tools and produce plausible logs. Far fewer can tell you what they were allowed to do, whether they actually cleared the gate, or whether they graded themselves.

The first time our agent wrote code, picked the tests, ran them, and drafted the review summary for the same change, the problem was hard to miss.

Nothing had failed in the usual sense. The diff applied. The tests passed. The summary sounded confident. But the system had created the appearance of review without any actual independence. The same loop did the work, picked the evidence, and explained the result.

That's the part a lot of current agent writing skips. MCP gives you a clean way to expose capabilities. A2A gives you a clean way to talk to opaque remote agents. Both matter. Neither answers the real question:

What is this thing allowed to do right now, and why?

That sounds abstract until the system does something expensive, visible, or irreversible. Then you want all of it: scope, environment, what has to pass before the work counts, who approved the next step, what can be undone.

The industry is building pieces. GitHub added agent session activity and provenance. Microsoft shipped an Agent Governance Toolkit with policy enforcement and inter-agent receipts. A2A just hit v1.0 with 150+ organizations. Useful moves. They still leave out the question above them: is the system still inside the task you approved?

Protocols tell systems how to connect. Runtime governance tells them whether a call may execute. The missing layer says what the work is, what evidence counts, and when the system has to stop and ask again.

Where The Gap Still Is

Protocols

MCP, A2A, tool calling, transport. Useful. They tell systems how to connect.

Action governance

Policy checks, approvals, audit, environment controls. Useful. They decide whether a call may run.

Task contract

Scope, evidence, expiry, delegation, rollback, completion. This is the missing layer.
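
To make the task contract concrete, here is a minimal sketch in Python of what such a record might hold. Every name in it (TaskContract, its fields, the example task and paths) is hypothetical and chosen for illustration. It is not Cascade's schema and not part of MCP or A2A.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class TaskContract:
    """Hypothetical task contract: what the work is and when it stops counting."""
    task_id: str
    allowed_paths: tuple[str, ...]      # scope: glob patterns the agent may touch
    required_evidence: tuple[str, ...]  # evidence: checks that must pass first
    expires_at: datetime                # expiry: the contract is void from this instant
    may_delegate: bool = False          # delegation: may sub-agents inherit it?
    rollback_plan: str = ""             # rollback: how the change gets undone

    def in_scope(self, path: str) -> bool:
        return any(fnmatchcase(path, pattern) for pattern in self.allowed_paths)

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires_at

    def is_complete(self, passed_evidence: set[str]) -> bool:
        # Completion means every required piece of evidence is present.
        return set(self.required_evidence) <= passed_evidence


# Illustrative values only.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
contract = TaskContract(
    task_id="fix-billing-rounding",
    allowed_paths=("src/billing/*",),
    required_evidence=("unit_tests", "human_review"),
    expires_at=now + timedelta(hours=1),
)
```

The point of the shape is that scope, evidence, and expiry are data the runtime can check, not prose in a prompt.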

Most Agent Governance Is Theater

Once you've seen it a few times, the pattern is easy to spot.

Eval wallpaper

Scores, dashboards, green checks everywhere. The question is simple: what did the eval actually block? If the answer is nothing, it isn't a gate.

Approval laundering

A human approved one thing, then the system kept moving after the environment changed or the scope drifted, and everyone kept acting like the original approval still covered it.

Judge collapse

The same system writes the code, picks the tests, runs them, interprets the output, and drafts the summary the reviewer sees. That's not review. That's self-certification.

Decorative governance makes you feel safer. Mechanical governance changes what the system can do.

Looks governed  | Actually governed
Risk label      | Risk class changes what the system can do
Eval dashboard  | Eval gate blocks completion
Human review    | Scoped approval expires on drift
Chat transcript | Structured trace can reconstruct the run
Receipt log     | Receipt explains why the action was allowed

Don't ask "do you have governance?" Ask what it blocked.

What Breaks When You Build the Engine

We built a native Apple agent command center called Cascade. It dispatches work to coding agents, gates their output, and keeps a human in the loop at real boundaries. Building it made the gap between labels and mechanisms painfully clear.

The first problem we ran into was judge collapse. Cascade can produce work and evaluate work, but when the same system writes the artifact, interprets the output, and summarizes the result, it grades itself too generously. The fix isn't better prompting. It's keeping at least one evidence path the acting loop can't quietly rewrite.

We saw runs where the system patched code, picked a narrow test target that happened to pass, and wrote a review summary that read like an independent second set of eyes. It wasn't.

Cascade already has real mechanics here. Authored evaluate and human_review stages can pause work mechanically. What's still missing is a first-class contract for scope drift, approval expiry, and why a consequential transition was allowed.

We're also testing this outside coding loops. Cascade runs a mountain property intelligence service where agents research, synthesize, and draft client deliverables. On the first run, artifacts collapsed across stages, approval gates weren't mandatory enough, and the system confused validation mode with live operations. Different domain, same failures.

Agent writes code
Agent picks tests
Agent runs tests
Agent interprets results
Agent claims success
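
The five steps above can be checked mechanically if each one records who performed it. The sketch below is one possible test for judge collapse, assuming a simple mapping from step to actor. The role names and actor labels are hypothetical.

```python
# Hypothetical role names, mirroring the five steps above.
PRODUCER_ROLES = {"writes_code"}
EVIDENCE_ROLES = {"picks_tests", "runs_tests", "interprets_results", "claims_success"}


def independent_evidence_roles(assignments: dict[str, str]) -> set[str]:
    """Evidence roles held by an actor that is not also a producer."""
    producers = {actor for role, actor in assignments.items() if role in PRODUCER_ROLES}
    return {
        role
        for role, actor in assignments.items()
        if role in EVIDENCE_ROLES and actor not in producers
    }


def judge_collapsed(assignments: dict[str, str]) -> bool:
    # Collapse: no evidence step sits outside the producing loop.
    return not independent_evidence_roles(assignments)


# One agent does everything, then the same run with tests moved to an external harness.
solo = {role: "agent-a" for role in PRODUCER_ROLES | EVIDENCE_ROLES}
split = {**solo, "runs_tests": "ci-harness"}
```

Here judge_collapsed(solo) is true and judge_collapsed(split) is false. A distinct actor label is only as good as the isolation behind it, so treat this as a floor, not proof of independence.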

Silent approval expansion came next. You approve one bounded thing. The system keeps moving, the environment changes, and nobody forces a fresh decision. What looked like scoped approval at grant time quietly becomes open-ended permission.

T0 Human approves deploy to staging
T1 Agent adds a dependency
T2 Scope expands to new service
T3 Still running on original approval
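
One way to stop the slide from T0 to T3 is to bind an approval to a fingerprint of what was actually approved and recheck it before each consequential step. This is a minimal sketch under that assumption. The Approval class, the scope fields, and the approver label are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


def fingerprint(scope: dict) -> str:
    """Stable hash of the approved scope (keys are sorted, so key order is irrelevant)."""
    canonical = json.dumps(scope, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str
    scope_hash: str

    def covers(self, current_scope: dict) -> bool:
        # Any change to the scope changes the hash, which voids the approval.
        return self.scope_hash == fingerprint(current_scope)


# T0: a human approves a deploy to staging.
approved = {
    "action": "deploy",
    "env": "staging",
    "services": ["api"],
    "dependencies": ["requests"],
}
approval = Approval(approver="human:alice", scope_hash=fingerprint(approved))

# T1: the agent adds a dependency. T2: scope expands to a new service.
t1 = {**approved, "dependencies": ["requests", "leftpad"]}
t2 = {**t1, "services": ["api", "billing"]}
```

With this in place, approval.covers(approved) holds, while approval.covers(t1) and approval.covers(t2) do not, so T3 would need a fresh decision. What belongs in the fingerprint is the real design question. Too little and drift slips through, too much and every approval expires immediately.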

Missing receipts. Cascade has traces and artifacts. What it doesn't have yet is a clear record of why each significant action was allowed. We can reconstruct what happened. We can't answer cleanly why it was permitted.
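
As a sketch of what such a record could look like, the snippet below defines a receipt that refuses to exist without a stated reason. The field names and the issue_receipt helper are assumptions for illustration, not something Cascade has today.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Receipt:
    """Hypothetical record of why an action was permitted, not just that it ran."""
    action: str                 # what the agent did
    actor: str                  # which agent or loop did it
    allowed_by: str             # the rule, contract clause, or approval relied on
    approval_id: Optional[str]  # the human approval relied on, if any
    evidence: tuple[str, ...]   # checks that had passed at decision time
    decided_at: str             # ISO 8601 timestamp of the decision


def issue_receipt(action, actor, allowed_by, evidence, approval_id=None, now=None):
    if not allowed_by:
        # No stated reason means no receipt; callers should treat that as "not allowed".
        raise ValueError("refusing to issue a receipt without a reason")
    now = now or datetime.now(timezone.utc)
    return Receipt(action, actor, allowed_by, approval_id, tuple(evidence), now.isoformat())
```

The trace answers what happened. The allowed_by field is the part that answers why it was permitted.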

Missing transition guards. Without enforcement, labels drift. "Completed" starts meaning "the run stopped talking." "Approved" starts meaning "someone said yes to something adjacent." "In scope" starts meaning "close enough." Industrial systems solved this decades ago with interlocks: blocked transitions, not warning lights. Software agent systems need the same discipline.
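
A software interlock can be as small as a lookup table of guarded transitions. The sketch below assumes a run is described by a state string and a dictionary of facts. The state names, fact keys, and guards are hypothetical.

```python
class BlockedTransition(Exception):
    """Raised when a state change is attempted without its guard passing."""


# Hypothetical guards: each (from, to) transition maps to a predicate over run facts.
GUARDS = {
    ("running", "completed"): lambda facts: bool(
        facts.get("gate_passed") and facts.get("human_signed_off")
    ),
    ("pending", "approved"): lambda facts: bool(facts.get("approval_matches_scope")),
}


def transition(state: str, target: str, facts: dict) -> str:
    guard = GUARDS.get((state, target))
    if guard is None:
        # Interlock behaviour: an undeclared transition is blocked, not waved through.
        raise BlockedTransition(f"{state} -> {target} has no guard")
    if not guard(facts):
        raise BlockedTransition(f"{state} -> {target} blocked by guard")
    return target
```

The design choice that matters is the default. A missing guard raises instead of passing, which is the difference between a blocked transition and a warning light.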

The Practical Minimum

Most builders don't need enterprise ceremony. They need controls sized to blast radius.

Trace your work. Not "save the chat." A chat transcript is cockpit chatter. A trace is the flight data recorder. One tells you what was said. The other tells you what happened.
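
To show the difference in code, here is a minimal sketch of a structured trace. It uses a hash chain, which is our choice for the illustration and not something the argument requires, so that an edited event is detectable. The event kinds and detail fields are hypothetical.

```python
import hashlib
import json


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(trace: list, kind: str, detail: dict) -> dict:
    """Append a structured event linked to the previous one by hash."""
    prev = trace[-1]["hash"] if trace else ""
    body = {"seq": len(trace), "kind": kind, "detail": detail, "prev": prev}
    event = {**body, "hash": _digest(body)}
    trace.append(event)
    return event


def verify(trace: list) -> bool:
    """Recompute every hash; an edited or reordered event breaks the chain."""
    prev = ""
    for seq, event in enumerate(trace):
        body = {key: event[key] for key in ("seq", "kind", "detail", "prev")}
        if event["seq"] != seq or event["prev"] != prev or event["hash"] != _digest(body):
            return False
        prev = event["hash"]
    return True
```

A transcript would say the tests were run. A trace like this records which target was run and with what exit code, and a later edit to either fails verification.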

Gate your completions. The run isn't done because the model stopped talking. Execution is not settlement. A completed run is not a completed decision. "Done" means the work cleared policy, survived challenge, and got a human sign-off.

Don't let the agent grade itself. Keep at least one evidence path the acting loop can't rewrite. External test harness, raw source material, protected baseline. The form varies. The independence can't.
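
For the protected-baseline variant, a rough sketch: hash the files the agent must not rewrite before the run, then check them afterwards. The function names are hypothetical, and the sketch assumes the snapshot itself is stored somewhere the acting loop cannot write.

```python
import hashlib
from pathlib import Path


def snapshot(paths: list[Path]) -> dict[str, str]:
    """Record a content hash per protected file before the agent starts."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest() for p in paths}


def tampered(baseline: dict[str, str]) -> list[str]:
    """Return protected files that changed or vanished since the snapshot."""
    changed = []
    for name, digest in baseline.items():
        path = Path(name)
        if not path.exists() or hashlib.sha256(path.read_bytes()).hexdigest() != digest:
            changed.append(name)
    return sorted(changed)
```

If tampered() returns anything for the test suite, a passing result from that suite is no longer independent evidence, whatever the summary says.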

Those three controls block a surprising amount of theater without building a fake governance stack. The farther out the blast radius reaches, the tighter the controls should get.

Blast radius rises faster than most governance does.
Local sandbox
CI
Protected branch
Production write
External API
Money movement
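
One way to size controls to blast radius is to order the tiers above and declare the tier at which each control becomes mandatory. The tiers come from the list above. The assignment of controls to tiers is purely our assumption for the sketch, and a real policy would set its own thresholds.

```python
from enum import IntEnum


class BlastRadius(IntEnum):
    LOCAL_SANDBOX = 0
    CI = 1
    PROTECTED_BRANCH = 2
    PRODUCTION_WRITE = 3
    EXTERNAL_API = 4
    MONEY_MOVEMENT = 5


# Hypothetical policy: the tier at which each control becomes mandatory.
CONTROL_STARTS_AT = {
    "trace": BlastRadius.LOCAL_SANDBOX,
    "completion_gate": BlastRadius.CI,
    "independent_evidence": BlastRadius.PROTECTED_BRANCH,
    "scoped_human_approval": BlastRadius.PRODUCTION_WRITE,
    "rollback_plan": BlastRadius.PRODUCTION_WRITE,
    "receipt": BlastRadius.EXTERNAL_API,
}


def required_controls(radius: BlastRadius) -> list[str]:
    """Controls accumulate: a higher tier keeps everything required below it."""
    return sorted(name for name, start in CONTROL_STARTS_AT.items() if radius >= start)
```

Under these assumed thresholds a local sandbox run needs only a trace, while money movement needs all six controls.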

That's the floor. Everything above it is architecture we haven't built yet.