danielhuber.dev@proton.me Sunday, April 5, 2026

Governed Autonomy: Dual-Layer Architecture for Safe, Composable Agent Systems

How a supervisor-worker hierarchy combined with stateful skill graphs and human-in-the-loop checkpoints produces agents that are both flexible and trustworthy in production.


Building production AI agents often forces a tradeoff: give the agent broad autonomy and it becomes unpredictable; constrain it heavily and it loses the flexibility that makes LLMs useful in the first place. A dual-layer architecture resolves this tension by separating governance from execution into two layers that can be reasoned about independently — one that controls who can do what, and one that controls how multi-step work is composed and checkpointed.

The Core Problem: Autonomy Without Accountability

When a single LLM agent is handed a large tool catalog and a complex goal, several failure modes emerge in practice. The agent may invoke tools in sequences that are individually valid but collectively unsafe. It may escalate privileges by chaining tool calls that circumvent intended access boundaries. And because the work is one long undifferentiated trace, it is difficult to inspect, pause, or recover from midway through.

These aren’t model quality problems — they’re architectural ones. The agent has no structural concept of roles, no checkpoints at meaningful task boundaries, and no mechanism for a human to inspect state before a consequential action is taken. Adding more prompt instructions helps at the margins but doesn’t provide enforceable boundaries.

Layer A: The Governed Supervisor-Worker Hierarchy

The first layer addresses who can do what. Rather than a flat agent with access to all tools, the architecture splits responsibilities across a supervisor agent and a set of specialized worker agents, each with a strictly scoped tool set.

The supervisor receives the high-level goal and decomposes it into sub-tasks. It does not execute tools directly — it delegates. Each worker agent owns a narrow capability domain: one handles retrieval, another handles computation, a third handles external API calls. Workers cannot access tools outside their assigned scope, and they cannot communicate laterally with each other except through the supervisor.

This role-based tool isolation produces two engineering benefits. First, it makes blast radius predictable: a misbehaving retrieval worker cannot inadvertently trigger a write operation that lives in a different worker’s scope. Second, it makes traces interpretable — you can read a supervisor’s delegation decisions as a structured audit log of intent.

┌─────────────────────────────────────────┐
│              Supervisor Agent           │
│   (goal decomposition, delegation,      │
│    audit log, no direct tool access)    │
└──────────┬──────────┬──────────┬────────┘
           │          │          │
    ┌──────▼──┐ ┌─────▼───┐ ┌────▼──────┐
    │ Worker A│ │ Worker B│ │ Worker C  │
    │Retrieval│ │Compute  │ │External   │
    │ Tools   │ │ Tools   │ │API Tools  │
    └─────────┘ └─────────┘ └───────────┘
         Role-based tool isolation
         Workers cannot cross tool boundaries
         No lateral worker-to-worker calls
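The isolation shown above can be sketched as a small enforcement layer. This is an illustrative sketch, not the paper's implementation; the class names, tool names, and `ScopeViolation` error are all assumptions made for the example.

```python
# Minimal sketch of role-based tool isolation (names are illustrative).
class ScopeViolation(Exception):
    pass

class Worker:
    def __init__(self, name, tools):
        self.name = name
        self._tools = dict(tools)  # tool name -> callable; the worker's entire scope

    def invoke(self, tool_name, *args, **kwargs):
        # Enforcement point: a worker can only reach tools in its own scope.
        if tool_name not in self._tools:
            raise ScopeViolation(f"{self.name} may not call {tool_name}")
        return self._tools[tool_name](*args, **kwargs)

class Supervisor:
    def __init__(self, workers):
        self._workers = {w.name: w for w in workers}
        self.audit_log = []  # delegation decisions, readable as a trace of intent

    def delegate(self, worker_name, tool_name, *args, **kwargs):
        # All routing goes through the supervisor; no lateral worker-to-worker calls.
        self.audit_log.append((worker_name, tool_name))
        return self._workers[worker_name].invoke(tool_name, *args, **kwargs)

# Usage: a retrieval worker cannot trigger a write that lives in another scope.
retrieval = Worker("retrieval", {"search_docs": lambda q: [f"doc about {q}"]})
external = Worker("external", {"submit_ticket": lambda r: "TICKET-1"})
sup = Supervisor([retrieval, external])

docs = sup.delegate("retrieval", "search_docs", "governed autonomy")
try:
    sup.delegate("retrieval", "submit_ticket", "report")  # out of scope
except ScopeViolation as err:
    blocked = str(err)
```

The point of the sketch is that the boundary is structural, not prompt-based: the scope check lives in code the model cannot talk its way around.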

Layer B: Stateful Skill Graphs with Checkpoints

The second layer addresses how work is sequenced. Complex tasks are modeled not as open-ended agent loops but as directed skill graphs: nodes are discrete, composable skills; edges are valid transitions between them; and certain edges carry human-in-the-loop checkpoints before execution continues.

A skill in this context is a reusable, tested unit of capability — similar to a function with a contract. Skills declare their inputs, outputs, and side effects. The graph structure means that the full execution plan is visible before work begins, partial progress is recoverable, and the system knows unambiguously which skill is executing at any moment.

Checkpoints are placed at edges that precede high-consequence or irreversible actions. When execution reaches a checkpoint, the agent pauses and surfaces its current state — what it has done, what it plans to do next, and what artifacts it has produced — for human review. The human can approve, modify the plan, or halt. This is fundamentally different from an end-of-task review: the human intervenes within the workflow, before commitment, not after.

# Conceptual skill graph definition
skill_graph = SkillGraph(
    nodes=[
        Skill("retrieve_context", inputs=["query"], outputs=["documents"]),
        Skill("score_candidates", inputs=["documents"], outputs=["ranked_list"]),
        Skill("generate_report", inputs=["ranked_list"], outputs=["report"]),
        Skill("submit_for_review", inputs=["report"], outputs=["ticket_id"]),
    ],
    edges=[
        Edge("retrieve_context", "score_candidates"),
        Edge("score_candidates", "generate_report"),
        # Checkpoint before any external submission
        Edge("generate_report", "submit_for_review", checkpoint=HumanApproval(
            prompt="Review the generated report before external submission."
        )),
    ]
)
Note

Checkpoints compound with the supervisor-worker split: the human reviewer sees not just the artifact but the full delegation trace — which worker produced what, which tools were called, and in what order. This makes review actionable rather than opaque.
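The pause-and-review semantics can be made concrete with a minimal graph runner. This is a sketch under assumptions, not the paper's API: the checkpoint is modeled as a callback that returns whether the human approved, and the graph is walked linearly for brevity.

```python
# Minimal sketch of a checkpointed skill-graph runner (illustrative, linear walk).
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Skill:
    name: str
    run: Callable[[dict], dict]  # consumes and produces named artifacts

@dataclass
class Edge:
    src: str
    dst: str
    checkpoint: Optional[Callable[[dict], bool]] = None  # human-approval hook

def execute(skills, edges, state, start):
    """Walk the graph from `start`, pausing at checkpointed edges."""
    by_name = {s.name: s for s in skills}
    current = start
    while current is not None:
        state = {**state, **by_name[current].run(state)}
        nxt = next((e for e in edges if e.src == current), None)
        if nxt is None:
            return state, "done"
        if nxt.checkpoint and not nxt.checkpoint(state):
            # Human halted before the consequential step; partial state survives.
            return state, f"halted before {nxt.dst}"
        current = nxt.dst
    return state, "done"

# Usage: the reviewer halts before external submission; the draft is preserved.
skills = [
    Skill("generate_report", lambda s: {"report": "draft v1"}),
    Skill("submit_for_review", lambda s: {"ticket_id": "T-42"}),
]
edges = [Edge("generate_report", "submit_for_review",
              checkpoint=lambda s: False)]  # stand-in for a human saying "halt"
state, status = execute(skills, edges, {}, "generate_report")
```

Because the checkpoint fires on the edge rather than after the task, the halted run still holds every artifact produced so far, which is what makes recovery and plan modification possible.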

Composability and Reuse Across Pipelines

Skill graphs become significantly more valuable when skills are reusable across different pipelines. A retrieval skill defined once can appear as a node in a research workflow, a validation workflow, and a reporting workflow. This mirrors how software engineers think about library functions — write once, test once, reuse everywhere — and it brings the same benefits: regression tests can target individual skills, skill-level metrics can be tracked independently, and failures can be localized without re-running the full pipeline.

Composability also makes the governance layer easier to maintain. If a new tool needs to be added to the system, you add it to the appropriate worker’s scope and define the skill that wraps it. The supervisor’s delegation logic doesn’t change; the graph simply has a new node available as a potential step.

Tip

Design skills around outputs, not tools. A skill called retrieve_context that can be backed by vector search today and a hybrid reranker tomorrow is far more composable than a skill called call_pinecone. The tool binding is an implementation detail; the skill contract is the interface.
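The Tip can be illustrated with a small sketch: the skill contract is the stable interface, and the backend is injected. The factory function and the toy keyword backend below are assumptions for illustration, not APIs from the paper.

```python
# Sketch: a skill contract decoupled from its tool binding (names illustrative).
from typing import Callable, List

def make_retrieve_context(backend: Callable[[str], List[str]]):
    """The contract: query in, documents out. The backend is a detail."""
    def retrieve_context(query: str) -> List[str]:
        return backend(query)
    return retrieve_context

# Today a simple keyword search; tomorrow a hybrid reranker. Same contract.
def keyword_search(query):
    corpus = ["skill graphs compose", "supervisors delegate", "workers execute"]
    return [doc for doc in corpus if any(w in doc for w in query.split())]

retrieve_context = make_retrieve_context(keyword_search)
docs = retrieve_context("skill graphs")
```

Swapping `keyword_search` for a better backend changes nothing downstream, because every pipeline depends only on the `retrieve_context` contract.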

Engineering Implications for Production Systems

The dual-layer pattern imposes upfront design cost — you need to define worker roles, tool scopes, skill contracts, and checkpoint placement before running anything. For exploratory prototypes this overhead isn’t worth it. But for any agent workflow that will handle sensitive data, trigger external side effects, or need to be audited, this structure pays for itself quickly.

A few practical notes for implementation:

  • Start with checkpoints, add automation later. It is much easier to remove a checkpoint as confidence grows than to add one after a workflow has already caused a problem.
  • Log supervisor delegation decisions as structured events. The delegation trace is your primary debugging artifact; treat it with the same care as application logs.
  • Version skill graphs explicitly. Because skills are composable, a change to one skill can affect multiple downstream pipelines. Semantic versioning on skill contracts enables safe iteration.
  • Test skills in isolation before integrating them into graphs. Unit-testable skills are the foundation of a reliable graph — if a node’s behavior is undefined in isolation, the graph behavior will be undefined in composition.
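The last bullet can be sketched as a contract-level unit test. The `score_candidates` skill and its length-based ranking are hypothetical stand-ins; the point is the shape of the test, which checks ordering, completeness, and freedom from side effects.

```python
# Sketch: unit-testing a skill against its declared contract (illustrative).
def score_candidates(documents):
    """Contract: documents in, ranked list out, longest first, no side effects."""
    return sorted(documents, key=len, reverse=True)

def test_score_candidates_contract():
    docs = ["short", "a much longer document", "medium doc"]
    ranked = score_candidates(docs)
    assert ranked[0] == "a much longer document"  # ordering honors the contract
    assert sorted(ranked) == sorted(docs)         # nothing added or dropped
    assert docs[0] == "short"                     # input not mutated

test_score_candidates_contract()
```

Once a skill passes its contract tests in isolation, a failure inside a graph points at composition, not at the node.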

The governed autonomy pattern doesn’t eliminate the need for capable models. But it ensures that capability is channeled through structure — making agent behavior predictable enough to operate, audit, and improve in production.

Tags: research, multi-agent, safety, tool-use, orchestration, skill-graphs, human-in-the-loop

This article is an AI-generated summary. Read the original paper: Mozi: Governed Autonomy for Drug Discovery LLM Agents.