Why Hierarchical Swarms Beat Flat

May 23, 2026 · Antonio Nusco · 5 min read

architecture
swarm

Throw five agents at a non-trivial task and tell them to "collaborate," and you get one of two failure modes. Either they all read the same giant context window and step on each other's edits, or they fan out blindly and produce work that doesn't compose. Flat multi-agent setups look impressive in a demo and fall apart on a real repository.

The reason is structural, not something you can fix by tuning a prompt. So SwarmADE doesn't run a flat pool. It runs a hierarchy: a Queen that plans, Workers that execute, and Drones that audit. This post is about why that shape wins.

The flat-swarm tax

A flat swarm shares state by sharing context. Every agent needs to know what every other agent is doing, which leaves two options. Either you broadcast everything to everyone, so each agent's context grows linearly with the number of agents (and total token spend grows roughly quadratically) while every agent drowns in irrelevant detail, or you don't, and they make conflicting assumptions.
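To put rough numbers on the broadcast option, here is a back-of-the-envelope sketch. It assumes every agent emits one update per round and that each update costs a fixed number of tokens to read. The function names and the 500-token figure are illustrative assumptions, not measurements from SwarmADE.

```python
def broadcast_reads(agents: int) -> int:
    """Updates read per round when every agent reads every other agent's update."""
    return agents * (agents - 1)


def hierarchical_reads(agents: int) -> int:
    """Updates read per round when each agent reports only to a single planner."""
    return agents


TOKENS_PER_UPDATE = 500  # hypothetical average size of one update

for n in (2, 5, 10):
    flat = broadcast_reads(n) * TOKENS_PER_UPDATE
    tree = hierarchical_reads(n) * TOKENS_PER_UPDATE
    print(f"{n} agents: flat={flat} tokens/round, hierarchical={tree} tokens/round")
```

Each agent's own reading load grows with the number of peers, so the total across the swarm grows with the square of the agent count. Reporting to one planner keeps the total proportional to the number of agents.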

Concretely, here's what "five agents, one task, no hierarchy" tends to produce:

Two of them independently refactor the same module in incompatible ways.
A third writes tests against an interface a fourth just deleted.
Nobody owns the integration, so the final merge is a coin flip.

The cost isn't just wasted tokens. It's that no agent holds the whole plan, so there's no place where "does this still make sense?" gets asked. Coordination that isn't anyone's job doesn't happen.

The hierarchy

SwarmADE assigns the three jobs a flat swarm leaves homeless — planning, execution, and review — to three distinct roles.

The Queen is the orchestrator. She reads your spec, decomposes it into a directed acyclic graph (DAG) of tasks, and decides the order. She is the only component that holds the whole plan. Crucially, she does not write code.

Workers are your local CLI agents — Claude Code, Codex, Cursor, Aider, opencode, Cline, Goose — running as subprocesses on your machine. Each Worker gets exactly one task plus exactly the context that task needs. A Worker never sees the whole plan, which is the point: a bounded task with bounded context is one an agent can actually finish.

Drones are server-side auditors. When a Worker reports a task done, the relevant Drones review the result — correctness, security, performance, migration safety — before anything is accepted. Review is a separate role so that "is this good?" is never answered by the same agent that's motivated to say yes.
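A minimal sketch of that separation, assuming each Drone can be modeled as a function that inspects a result and returns a list of findings. The `Result` type, the two example Drones, and their string checks are hypothetical and far cruder than a real audit. The only point is that the accept decision lives outside the Worker.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Result:
    task_id: str
    diff: str  # the change a Worker produced


# A Drone takes a result and returns human-readable findings (empty list = pass).
Drone = Callable[[Result], list[str]]


def security_drone(result: Result) -> list[str]:
    # Toy check: flag anything that looks like a hard-coded secret.
    return ["possible hard-coded secret"] if "password=" in result.diff else []


def migration_drone(result: Result) -> list[str]:
    # Toy check: flag destructive schema changes.
    return ["destructive migration"] if "DROP TABLE" in result.diff else []


def review(result: Result, drones: list[Drone]) -> tuple[bool, list[str]]:
    """Accept only if every Drone returns no findings."""
    findings = [f for drone in drones for f in drone(result)]
    return (not findings, findings)


drones = [security_drone, migration_drone]
print(review(Result("A", "CREATE TABLE reset_tokens (id INT)"), drones))
print(review(Result("A", "DROP TABLE users"), drones))
```

The Worker's output is only data here. It never gets a vote in `review`, which is the property the role split is meant to guarantee.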

A DAG, not a conversation

Because the Queen plans up front, the unit of work is a graph, not a chat log. A small feature decomposes to something like this:

spec: "add password reset"
│
├─ task A: schema + migration (reset_tokens table)
├─ task B: POST /v1/auth/reset/request   ── depends on A
├─ task C: POST /v1/auth/reset/confirm   ── depends on A
└─ task D: email template + send          ── depends on B

The DAG makes parallelism explicit and safe. Tasks B and C both depend on A but not on each other, so once A lands they run concurrently on two Workers. Task D waits for B. Nobody guesses; the edges say who waits for whom.
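The same graph can be written down and scheduled with Python's standard-library `graphlib`. This is a sketch of the idea, not SwarmADE's scheduler: the `deps` mapping and the `waves` helper are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (the edges in the diagram).
deps = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B"},
}


def waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave can run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        result.append(ready)
        sorter.done(*ready)  # treat the whole wave as finished
    return result


print(waves(deps))  # [['A'], ['B', 'C'], ['D']]
```

Grouping into waves is a simplification. A real scheduler could start D the moment B finishes instead of waiting for C, by calling `done` per task as each one completes.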

This is also what makes the work composable. Each task has a declared contract — inputs, outputs, the files it may touch. Two Workers can't both "own" the schema because only task A is allowed to write the migration. The flat-swarm collision modes simply can't occur, because the plan forbids them.
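One way to picture such a contract is a per-task set of writable paths that can be checked for overlap before any Worker starts. The sketch below assumes that representation. The `Contract` type, the `write_conflicts` helper, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contract:
    task_id: str
    may_write: frozenset[str] = field(default_factory=frozenset)


def write_conflicts(contracts: list[Contract]) -> dict[str, list[str]]:
    """Return every path claimed by more than one task, with the claiming tasks."""
    owners: dict[str, list[str]] = {}
    for c in contracts:
        for path in sorted(c.may_write):
            owners.setdefault(path, []).append(c.task_id)
    return {path: ids for path, ids in owners.items() if len(ids) > 1}


plan = [
    Contract("A", frozenset({"migrations/reset_tokens.sql"})),
    Contract("B", frozenset({"api/reset_request.py"})),
    Contract("C", frozenset({"api/reset_confirm.py"})),
    Contract("D", frozenset({"email/reset_template.html"})),
]
print(write_conflicts(plan))  # {} : no path has two owners

# A plan where a second task also claims the migration would be flagged up front.
bad = plan + [Contract("E", frozenset({"migrations/reset_tokens.sql"}))]
print(write_conflicts(bad))  # {'migrations/reset_tokens.sql': ['A', 'E']}
```

Checking ownership at planning time is what turns "two agents refactored the same module" from a runtime surprise into a plan that never gets issued.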

The replanning loop

Plans are wrong sometimes — a Worker discovers the migration needs an index the spec didn't mention, or a Drone rejects an approach outright. A static plan would be brittle here. The Queen's plan isn't static.

Worker finishes task ─▶ Drones review
                          │
              ┌───────────┴───────────┐
            accept                   reject
              │                        │
        mark task done          Queen replans:
              │                  - retarget the task, or
        unblock dependents       - insert a new task, or
                                  - escalate to you

When a Drone rejects a result or a Worker reports it's blocked, control goes back to the Queen. She can retry the task with the new information, insert a task the plan was missing, reorder what's left, or — when it's genuinely a judgment call — escalate to you. The graph gets edited; execution continues. This is the loop a flat swarm has no home for, because in a flat swarm there's no planner to return to.
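The "graph gets edited; execution continues" step can be sketched as a small mutable plan. Everything here is an assumption for illustration: the `Plan` class, its method names, and the extra task `A2` standing in for the missing index.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"


@dataclass
class Plan:
    deps: dict[str, set[str]]  # task -> tasks it waits for
    status: dict[str, Status] = field(default_factory=dict)

    def ready(self) -> list[str]:
        """Pending tasks whose dependencies are all done."""
        return sorted(
            t for t, d in self.deps.items()
            if self.status.get(t, Status.PENDING) is Status.PENDING
            and all(self.status.get(x) is Status.DONE for x in d)
        )

    def accept(self, task: str) -> None:
        """Drones accepted the result: mark done, which unblocks dependents."""
        self.status[task] = Status.DONE

    def insert_before(self, new_task: str, blocked: str) -> None:
        """Replan: add a missing task that `blocked` must now wait for."""
        self.deps[new_task] = set()
        self.deps[blocked].add(new_task)


plan = Plan({"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B"}})
plan.accept("A")
print(plan.ready())  # ['B', 'C']

# Suppose a Drone rejects B because an index is missing: insert a task for it.
plan.insert_before("A2", "B")
print(plan.ready())  # ['A2', 'C']
```

C keeps running while B waits on the new task, so a rejection costs only the work downstream of it.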

So what

Hierarchy isn't bureaucracy here — it's how you keep each agent's job small enough to do well while still shipping a whole feature. The Queen holds the plan so no Worker has to. Workers hold one task so the context stays sharp. Drones hold the review so "done" means something.

If you've watched a flat agent swarm talk itself in circles, that's the tax of asking every agent to plan, execute, and self-review at once. SwarmADE splits the three jobs up on purpose.

See the model end to end in How orchestration works.
Or skip ahead and start your first Hive.