Agents Can Act. That Doesn't Make Them Governable.

Protocols help agents connect. Nothing yet makes them accountable for the work.

Most agent systems can call tools and produce plausible logs. Far fewer can tell you what they were allowed to do, whether they actually cleared the gate, or whether they graded themselves.

The first time our agent wrote code, picked the tests, ran them, and drafted the review summary for the same change, the problem was hard to miss.

Nothing had failed in the usual sense. The diff applied. The tests passed. The summary sounded confident. But the system had created the appearance of review without any actual independence. The same loop did the work, picked the evidence, and explained the result.

That's the part a lot of current agent writing skips. MCP gives you a clean way to expose capabilities. A2A gives you a clean way to talk to opaque remote agents. Both matter. Neither answers the real question:

What is this thing allowed to do right now, and why?

That sounds abstract until the system does something expensive, visible, or irreversible. Then you want all of it: What scope? Which environment? What has to pass before the work counts? Who approved the next step? What can be undone?

The industry is already building pieces of it. GitHub is adding agent session activity, live status, and better provenance. Microsoft shipped an Agent Governance Toolkit with policy enforcement, identity, approvals, and inter-agent receipts. Those are useful moves. They solve real problems.

They still leave out the question above them: is the system still inside the task you approved? Did the approval expire when scope changed? Does "done" still mean what it meant two steps ago?

Protocols tell systems how to connect. Runtime governance tells them whether a call may execute. The missing layer says what the work is, what evidence counts, what can change, and when the system has to stop and ask again.

Where The Gap Still Is

Protocols: MCP, A2A, tool calling, transport. Useful. They tell systems how to connect.
Action governance: Policy checks, approvals, audit, environment controls. Useful. They decide whether a call may run.
Task contract: Scope, evidence, expiry, delegation, rollback, completion. This is the missing layer.
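
One way to picture the task-contract layer is as a plain record that travels with the work. The sketch below is hypothetical: the `TaskContract` type and every field name are assumptions made for illustration, not a schema from MCP, A2A, or Cascade.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TaskContract:
    """Hypothetical task contract; all field names are illustrative."""
    scope: frozenset[str]              # resources the task may touch
    required_evidence: frozenset[str]  # checks that must pass before "done"
    expires_at: datetime               # the approval is void after this time
    may_delegate: bool                 # whether sub-agents may inherit the contract
    rollback: str                      # how the work gets undone

    def covers(self, resource: str, now: datetime) -> bool:
        """True only while the contract is unexpired and the resource is in scope."""
        return now < self.expires_at and resource in self.scope

    def is_complete(self, passed: set[str]) -> bool:
        """Completion needs every piece of required evidence, not just a finished run."""
        return self.required_evidence <= passed


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
contract = TaskContract(
    scope=frozenset({"services/billing"}),
    required_evidence=frozenset({"unit_tests", "human_review"}),
    expires_at=now + timedelta(hours=4),
    may_delegate=False,
    rollback="git revert",
)
```

The point of the shape is that scope, expiry, and the definition of "done" live in one place that a runtime could check, instead of in a prompt or a reviewer's memory.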

Most Agent Governance Is Theater

Once you've seen it a few times, the pattern is easy to spot.

Eval wallpaper

Scores, dashboards, green checks everywhere. The question is simple: what did the eval actually block? If the answer is nothing, it isn't a gate.

Approval laundering

A human approved one thing, then the system kept moving after the environment changed or the scope drifted, and everyone kept acting like the original approval still covered it.

Judge collapse

The same system writes the code, picks the tests, runs them, interprets the output, and drafts the summary the reviewer sees. That's not review. That's self-certification.

Everyone already knows why this is bad in other contexts. Public companies can't audit their own books. The person who writes the check doesn't sign the check. Agent systems keep rediscovering the exact failure those controls were built to prevent.

The distinction is simple: decorative governance makes you feel safer. Mechanical governance changes what the system can do.

Looks governed | Actually governed
Risk label | Risk class changes what the system can do
Eval dashboard | Eval gate blocks completion
Human review | Scoped approval expires on drift
Chat transcript | Structured trace can reconstruct the run
Receipt log | Receipt explains why the action was allowed

You see the same pattern all over the place. In coding loops, "the tests passed" when the final artifact isn't the one that got tested. In research loops, "it has citations" when the strongest claim outruns the evidence. In approval workflows, "a human approved it" when the human only saw a model-shaped summary.

Don't ask "do you have governance?" Ask what changed behavior.

Did the risk class change what the system could do?
Did the eval block unsafe completion?
Did the approval expire when the scope changed?
Did the trace let you reconstruct the consequential path?
Did the receipt explain why the transition was allowed?

If the answer is no, you don't have governance. You have the language of governance.

What Breaks When You Build the Engine

We built a native Apple agent command center called Cascade. It dispatches work to coding agents, gates their output, and keeps a human in the loop at real boundaries. Building it made the gap between labels and mechanisms painfully clear.

The first problem we ran into was judge collapse. Cascade can produce work and evaluate work, but when the same system writes the artifact, interprets the output, and summarizes the result, it grades itself too generously. The fix isn't better prompting. It's keeping at least one evidence path the acting loop can't quietly rewrite.

We saw runs where the system patched code, picked a narrow test target that happened to pass, and then wrote a review summary that read like an independent second set of eyes. It wasn't.

Cascade already has real mechanics here. Authored evaluate and human_review stages can pause work mechanically. What's still missing is a first-class contract for scope drift, approval expiry, and why a consequential transition was allowed.

The same gaps show up outside coding loops. We're testing real business workflows in Cascade, including a mountain property intelligence service where agents research, synthesize, and draft client deliverables. The first run surfaced artifacts collapsing across stages, approval gates that weren't mandatory enough, and mode confusion between validation runs and live operations. Different domain, same governance failures.

Agent writes code
Agent picks tests
Agent runs tests
Agent interprets results
Agent claims success

An external check breaks the loop. Without one, the agent is grading its own homework.
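
A minimal sketch of what an external check could look like, under stated assumptions: the `Evidence` record, the `evidence_counts` function, and the identity strings are all hypothetical, and this is not how Cascade implements it. The idea is that evidence only counts if it passed, was run against the exact final artifact, and was issued by something other than the author.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical test result, bound to the exact artifact it was run against."""
    artifact_sha256: str  # digest of the artifact that was tested
    produced_by: str      # identity of whatever ran the check
    passed: bool


def digest(artifact: bytes) -> str:
    """SHA-256 hex digest of the artifact bytes."""
    return hashlib.sha256(artifact).hexdigest()


def evidence_counts(final_artifact: bytes, author: str, ev: Evidence) -> bool:
    """Accept evidence only if it passed, targets the final artifact, and is not self-issued."""
    return (
        ev.passed
        and ev.artifact_sha256 == digest(final_artifact)
        and ev.produced_by != author
    )


patch = b"def total(xs): return sum(xs)\n"
external = Evidence(digest(patch), "ci-harness", True)
self_issued = Evidence(digest(patch), "agent-7", True)
```

Here `evidence_counts(patch, "agent-7", external)` holds, while the self-issued result is rejected, and so is any result whose digest no longer matches the artifact that ships.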

Silent approval expansion came next. You approve one bounded thing. The system keeps moving, the environment changes, and nobody forces a fresh decision. What looked like scoped approval at grant time quietly becomes open-ended permission.

T0 Human approves deploy to staging
T1 Agent adds a dependency
T2 Scope expands to new service
T3 Still running on original approval

No guard forced a fresh decision. The approval stretched silently.
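
One way to make an approval expire on drift is to bind it to a fingerprint of the scope the human actually saw. This is a sketch, not Cascade's mechanism: `scope_fingerprint`, `Approval`, and the scope fields are hypothetical names chosen to mirror the timeline above.

```python
import hashlib
import json
from dataclasses import dataclass


def scope_fingerprint(scope: dict) -> str:
    """Stable hash of the scope; any change to it yields a different fingerprint."""
    canonical = json.dumps(scope, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str
    fingerprint: str  # fingerprint of the scope that was approved

    def still_covers(self, current_scope: dict) -> bool:
        """The approval holds only while the scope is exactly what was approved."""
        return self.fingerprint == scope_fingerprint(current_scope)


# T0: a human approves a deploy to staging.
scope = {"env": "staging", "services": ["api"], "dependencies": ["requests"]}
approval = Approval(approver="alice", fingerprint=scope_fingerprint(scope))

# T1 and T2: the agent adds a dependency and the scope expands to a new service.
drifted = {"env": "staging", "services": ["api", "billing"],
           "dependencies": ["requests", "httpx"]}
```

With this shape, `approval.still_covers(drifted)` is false at T3, so the run has to stop and ask again instead of coasting on the T0 approval. Note that list order is part of the fingerprint here; a real system would need to decide which differences count as drift.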

Missing receipts. Cascade has traces and artifacts. What it doesn't have yet is a clear record of why each significant action was allowed. We can usually reconstruct what happened. We still can't answer cleanly why it was permitted.
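
A receipt, in the sense used here, could be as small as the decision plus the named checks that justified it. The sketch below is an assumption about shape, not something Cascade has: `Receipt`, `authorize`, and the check names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """Hypothetical record of why an action was or was not allowed."""
    action: str
    allowed: bool
    reasons: tuple[str, ...]


def authorize(action: str, checks: dict[str, bool]) -> Receipt:
    """Evaluate named checks and record which ones justified or blocked the action."""
    if not checks:
        # An action with no evaluated checks has no stated reason to proceed.
        return Receipt(action, False, ("denied: no checks were evaluated",))
    failed = tuple(name for name, ok in checks.items() if not ok)
    if failed:
        return Receipt(action, False, tuple(f"failed: {name}" for name in failed))
    return Receipt(action, True, tuple(f"passed: {name}" for name in checks))


receipt = authorize("deploy:staging", {"scope_unchanged": True, "evals_passed": True})
```

The design choice worth noticing is that the receipt is produced by the same call that makes the decision, so "why was this permitted?" is answered at the moment of permission instead of being reconstructed later from traces.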

Missing transition guards. Systems drift into bad states because nothing enforces what the labels actually mean. "Completed" starts meaning "the run stopped talking." "Approved" starts meaning "someone said yes to something adjacent." "In scope" starts meaning "close enough."

Industrial systems solved this with interlocks. A microwave won't run with the door open. A press won't cycle if the guard isn't down. The interlock doesn't care if the operator is competent or careless. It blocks the unsafe state. Software needs the same discipline. Not a warning light. Not a policy document. A blocked transition.
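
In software, an interlock can be sketched as a state transition that raises instead of warning. The state names, guard conditions, and context keys below are hypothetical, chosen only to illustrate the pattern.

```python
from typing import Callable

Guard = Callable[[dict], bool]


class BlockedTransition(Exception):
    """Raised when a transition is undefined or one of its guards fails."""


# Hypothetical guards: "completed" is reachable only with passing evals and a sign-off.
GUARDS: dict[tuple[str, str], list[Guard]] = {
    ("running", "completed"): [
        lambda ctx: ctx.get("evals_passed") is True,
        lambda ctx: ctx.get("human_signed_off") is True,
    ],
}


def transition(current: str, target: str, ctx: dict) -> str:
    """Return the new state, or raise if the move is not allowed."""
    guards = GUARDS.get((current, target))
    if guards is None:
        raise BlockedTransition(f"{current} -> {target} is not a defined transition")
    if not all(guard(ctx) for guard in guards):
        raise BlockedTransition(f"{current} -> {target} blocked by a guard")
    return target
```

Calling `transition("running", "completed", {"evals_passed": True})` raises `BlockedTransition`, because the sign-off is missing. The run cannot reach "completed" just by going quiet.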

The Practical Minimum

Most founders don't need enterprise ceremony. They need controls sized to blast radius.

If you're running a local coding loop, the question isn't "can I prove this to an auditor?" It's "did the system stay in bounds, did it pass the checks, and can I undo the damage?"

Trace your work. Not "save the chat." A chat transcript is cockpit chatter. A trace is the flight data recorder. One tells you what was said. The other tells you what happened.
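
To make the difference concrete, here is a minimal sketch of a structured trace. Everything in it is an assumption for illustration: the `TraceEvent` fields, the `Trace` class, and the set of actions treated as consequential.

```python
import json
from dataclasses import asdict, dataclass

# Assumption: which actions count as consequential is a per-system decision.
CONSEQUENTIAL = frozenset({"write", "deploy"})


@dataclass(frozen=True)
class TraceEvent:
    seq: int      # position in the run
    actor: str    # who acted
    action: str   # what kind of action it was
    target: str   # what it acted on
    outcome: str  # what happened


class Trace:
    """Append-only list of structured events for one run."""

    def __init__(self) -> None:
        self._events: list[TraceEvent] = []

    def record(self, actor: str, action: str, target: str, outcome: str) -> TraceEvent:
        event = TraceEvent(len(self._events), actor, action, target, outcome)
        self._events.append(event)
        return event

    def consequential_path(self) -> list[TraceEvent]:
        """Only the events that changed something, in order."""
        return [e for e in self._events if e.action in CONSEQUENTIAL]

    def to_jsonl(self) -> str:
        """One JSON object per line, suitable for storing next to the artifacts."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._events)


trace = Trace()
trace.record("agent", "read", "src/app.py", "ok")
trace.record("agent", "write", "src/app.py", "ok")
trace.record("ci-harness", "test", "tests/test_app.py", "passed")
```

A transcript could tell you the agent said it edited `src/app.py`. This kind of record lets you query for what was actually written, by whom, and in what order.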

Gate your completions. The run isn't done because the model stopped talking. Execution is not settlement. A completed run is not a completed decision. "Done" means the work cleared policy, survived challenge, and got a human sign-off.

Don't let the agent grade itself. Keep at least one evidence path the acting loop can't rewrite. External test harness, raw source material, protected baseline. The form varies. The independence can't.

For most builders, that's enough to block a surprising amount of agent theater without building a fake governance stack.

Local sandbox, CI, protected branch, production write, external API, money movement. The farther out you go, the tighter the controls get. Inner rings don't need outer-ring ceremony. They do need legibility, real completion gates, and an undo path. Move into customer-visible or irreversible territory and the bar goes up fast.

Blast radius rises faster than most governance does.
Local sandbox -> CI -> Protected branch -> Production write -> External API -> Money movement
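
Sizing controls to blast radius can be sketched as an ordered set of rings, where each ring inherits the controls of the rings inside it. The ring order comes from the list above, and the baseline is the floor described in this section. The thresholds at which extra controls kick in are assumptions for illustration only.

```python
from enum import IntEnum


class Ring(IntEnum):
    LOCAL_SANDBOX = 0
    CI = 1
    PROTECTED_BRANCH = 2
    PRODUCTION_WRITE = 3
    EXTERNAL_API = 4
    MONEY_MOVEMENT = 5


# The floor every ring needs: legibility, a real completion gate, an undo path.
BASELINE = frozenset({"trace", "completion_gate", "undo_path"})

# Assumption: these thresholds and control names are illustrative, not prescriptive.
EXTRA: dict[Ring, frozenset[str]] = {
    Ring.PROTECTED_BRANCH: frozenset({"independent_evidence"}),
    Ring.PRODUCTION_WRITE: frozenset({"scoped_human_approval"}),
    Ring.MONEY_MOVEMENT: frozenset({"receipt_per_action"}),
}


def required_controls(ring: Ring) -> frozenset[str]:
    """Baseline plus every extra control whose threshold the ring has reached."""
    controls = set(BASELINE)
    for threshold, extra in EXTRA.items():
        if ring >= threshold:
            controls |= extra
    return frozenset(controls)
```

Because controls only accumulate as the ring number rises, an inner ring never pays for outer-ring ceremony, and an outer ring can never have less than the floor.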

Trace the work. Gate the completions. Keep at least one evidence path the acting loop can't rewrite.

That's the floor. Everything above it is architecture we haven't built yet.