Agentic Patterns: Orchestrator vs Choreography

A few weeks after the article-processing pipeline from "Determinism is a feature" had been running cleanly, I tried to extend it. The summariser would hand off to a fact-checker when the article had numbers, the fact-checker would consult a citation-finder, and a separate tagger would listen to whatever the chain produced and emit topic labels. Three agents. Then five. Then the part of every weekend where I would try to figure out which one had introduced the bad citation.

The state machine from the original post was still there. What had changed was that each state had grown a small mesh of agents inside it, talking to each other freely. Debugging a failed run felt like untangling Christmas lights in the dark.

That is the point in a multi-agent system where you have to make a choice you will live with. Either there is a coordinator at the top deciding who runs and when (orchestration) or the agents listen for each other's events and react (choreography). Most teams get to multi-agent systems by accident, not by choosing. The accident is almost always choreography.

Why choreography breaks at scale

Coordination cost does not grow with the number of agents. It grows with the number of interactions. With n peers in a free mesh that is n(n−1)/2, quadratic and not linear. Going from five agents to ten does not double your problem space. It more than quadruples it: ten channels become forty-five.
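The arithmetic is quick to check. A minimal sketch, where pairwiseChannels is a helper name made up for this illustration, counting the undirected links in a free mesh of n peers:

```typescript
// Number of undirected pairwise channels in a free mesh of n peers: n(n-1)/2.
// `pairwiseChannels` is an illustrative helper, not part of the pipeline.
function pairwiseChannels(n: number): number {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("n must be a non-negative integer");
  }
  return (n * (n - 1)) / 2;
}

for (const n of [1, 2, 3, 5, 10]) {
  console.log(`${n} agents -> ${pairwiseChannels(n)} channels`);
}
// Channel counts for 1, 2, 3, 5, 10 agents: 0, 1, 3, 10, 45
```

This counts channels only. If every channel carries messages in both directions, or agents react to events they did not directly subscribe to, the real interaction count is higher still.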

Microservices people figured this out fifteen years ago. The choreography variant of the saga pattern (services emitting events, others listening) is elegant in an RFC proposal and miserable in an outage. You cannot answer "what is the system doing right now" without joining logs across every service. The orchestrated saga keeps a single state machine in one place, and you can point at it.

Agents make the choreography case worse. Microservices fail loudly, with stack traces. Agents fail plausibly. A fact-checker that receives a hallucinated citation from the summariser will confidently amplify it. The tagger will pick up the amplified version and tag it as confirmed. By the time anything resembling an orchestrator sees the output, the false belief has been laundered through three agents and looks like consensus.
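A small sketch makes the laundering effect concrete. It assumes, hypothetically, that every statement an agent emits records which upstream statement it was built on. The Claim type, the helper names, and the example citation are all invented for illustration and are not part of the original pipeline.

```typescript
// Hypothetical shape: each claim remembers who asserted it and what it was derived from.
type Claim = {
  text: string;
  assertedBy: string;  // agent that emitted this statement
  derivedFrom?: Claim; // upstream claim it was built on, if any
};

// Walk back to the agent that first introduced the claim.
function origin(claim: Claim): string {
  let current = claim;
  while (current.derivedFrom) current = current.derivedFrom;
  return current.assertedBy;
}

// Naive "consensus": how many distinct agents repeated the claim.
function endorsements(claims: Claim[]): number {
  return new Set(claims.map((c) => c.assertedBy)).size;
}

// Independent support: how many distinct origins stand behind it.
function independentSources(claims: Claim[]): number {
  return new Set(claims.map(origin)).size;
}

// A made-up citation, hallucinated by the summariser and amplified downstream.
const summary: Claim = { text: "Revenue grew 40% [made-up citation]", assertedBy: "summariser" };
const checked: Claim = { text: "Confirmed: revenue grew 40%", assertedBy: "factChecker", derivedFrom: summary };
const tagged: Claim = { text: "tag: confirmed-growth", assertedBy: "tagger", derivedFrom: checked };
const chain = [summary, checked, tagged];

console.log(endorsements(chain));       // 3 agents appear to agree
console.log(independentSources(chain)); // 1 agent actually originated the claim
```

Three endorsements, one source. In a free mesh nothing forces agents to carry the derivedFrom link, which is why the downstream reader sees only the first number.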

The two patterns, side by side

The thinnest useful version of each. Both call the same set of tools (summarise, factCheck, tag) and the only thing that changes is who decides what runs next.

// Choreography. Agents publish events; the bus routes them.
type AgentEvent =
  | { kind: "ARTICLE_FETCHED"; ctx: Ctx }
  | { kind: "SUMMARISED"; ctx: Ctx }
  | { kind: "FACT_CHECKED"; ctx: Ctx }
  | { kind: "TAGGED"; ctx: Ctx };

bus.on("ARTICLE_FETCHED", async ({ ctx }) =>
  bus.emit({ kind: "SUMMARISED",   ctx: await summarise(ctx) }));
bus.on("SUMMARISED",      async ({ ctx }) =>
  bus.emit({ kind: "FACT_CHECKED", ctx: await factCheck(ctx) }));
bus.on("SUMMARISED",      async ({ ctx }) =>
  bus.emit({ kind: "TAGGED",       ctx: await tag(ctx) }));
bus.on("FACT_CHECKED",    async ({ ctx }) => persist(ctx));
bus.on("TAGGED",          async ({ ctx }) => persist(ctx));

Two listeners on SUMMARISED. One race on persist. One unknowable answer to "where in the run was the bad citation introduced".
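The race is easy to reproduce in isolation. The simulation below is a sketch under stated assumptions: the in-memory Bus class, the sleep delays, the Ctx fields, and a persist that replaces the whole record are all stand-ins invented here, since the original bus and store are not shown.

```typescript
// Self-contained simulation of the choreographed flow.
// The bus, the delays, and the Ctx fields are stand-ins chosen for illustration.
type Ctx = { id: string; summary?: string; checked?: boolean; tags?: string[] };
type Handler = (ctx: Ctx) => Promise<void>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

class Bus {
  private handlers = new Map<string, Handler[]>();
  private inFlight: Promise<void>[] = [];

  on(kind: string, handler: Handler): void {
    this.handlers.set(kind, [...(this.handlers.get(kind) ?? []), handler]);
  }

  // Fire every listener for this kind without waiting for any of them.
  emit(kind: string, ctx: Ctx): void {
    for (const handler of this.handlers.get(kind) ?? []) {
      this.inFlight.push(handler(ctx));
    }
  }

  // Wait until no handler is running, including ones started by other handlers.
  async drain(): Promise<void> {
    while (this.inFlight.length > 0) {
      const batch = this.inFlight;
      this.inFlight = [];
      await Promise.all(batch);
    }
  }
}

async function demo(): Promise<Ctx | undefined> {
  const store = new Map<string, Ctx>();
  const bus = new Bus();
  // Assumption: last writer wins, each persist replaces the whole record.
  const persist = async (ctx: Ctx): Promise<void> => { store.set(ctx.id, ctx); };

  bus.on("SUMMARISED", async (ctx) => {
    await sleep(20); // slow fact-checker
    bus.emit("FACT_CHECKED", { ...ctx, checked: true });
  });
  bus.on("SUMMARISED", async (ctx) => {
    await sleep(5); // fast tagger
    bus.emit("TAGGED", { ...ctx, tags: ["economy"] });
  });
  bus.on("FACT_CHECKED", persist);
  bus.on("TAGGED", persist);

  bus.emit("SUMMARISED", { id: "a1", summary: "..." });
  await bus.drain();
  return store.get("a1");
}

demo().then((final) => console.log(final));
// The tagger's write lands first and is then overwritten by the fact-checker's:
// the stored record has `checked` but no `tags`.
```

Swap the two delays and the record keeps its tags but loses the fact-check. Which one you get in production depends on model latency that day.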

// Orchestration. One agent decides the order; everyone else is a function.
async function run(article: Article): Promise<Ctx> {
  let ctx: Ctx = { article };
  ctx = await summarise(ctx);
  const [checked, tagged] = await Promise.all([
    factCheck(ctx),
    tag(ctx),
  ]);
  return merge(ctx, checked, tagged);
}

The orchestrator is barely smarter than the function call graph. That is precisely the point. The "agent" part of the system can be as wild as it likes inside summarise, factCheck, and tag. The coordination is deterministic, single-threaded, and traceable. One run, one trace, one log line per step.
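One way to get "one log line per step" is to wrap each call the orchestrator makes. The sketch below is an assumption about how that could look, not the original implementation: traced and TraceEntry are hypothetical names, and the three agents are trivial stand-ins so the block runs on its own.

```typescript
type TraceEntry = { step: string; ms: number; ok: boolean; error?: string };

// Wrap a step so every call leaves exactly one trace entry, success or failure.
async function traced<T>(trace: TraceEntry[], step: string, fn: () => Promise<T>): Promise<T> {
  const started = Date.now();
  try {
    const result = await fn();
    trace.push({ step, ms: Date.now() - started, ok: true });
    return result;
  } catch (err) {
    trace.push({ step, ms: Date.now() - started, ok: false, error: String(err) });
    throw err;
  }
}

// Stand-in agents; the real ones would call a model.
type Ctx = { article: string; summary?: string; checked?: boolean; tags?: string[] };
const summarise = async (ctx: Ctx): Promise<Ctx> => ({ ...ctx, summary: ctx.article.slice(0, 20) });
const factCheck = async (ctx: Ctx): Promise<Ctx> => ({ ...ctx, checked: true });
const tag = async (ctx: Ctx): Promise<Ctx> => ({ ...ctx, tags: ["untagged"] });

// The caller owns the trace array, so it survives even if a step throws.
async function run(article: string, trace: TraceEntry[] = []): Promise<Ctx> {
  let ctx: Ctx = { article };
  ctx = await traced(trace, "summarise", () => summarise(ctx));
  const [checked, tagged] = await Promise.all([
    traced(trace, "factCheck", () => factCheck(ctx)),
    traced(trace, "tag", () => tag(ctx)),
  ]);
  return { ...ctx, ...checked, ...tagged }; // stand-in for merge()
}

const trace: TraceEntry[] = [];
run("Inflation fell to 3.1% in March.", trace).then((ctx) => {
  console.log(ctx);
  console.log(trace.map((e) => `${e.step} ok=${e.ok}`));
});
```

Because every step goes through the same wrapper, "which step introduced the bad citation" becomes a question about one array and not about five event logs.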

The 15x token tax

There is a real cost to running an orchestrator. A multi-agent system with a lead and a handful of specialists tends to consume roughly an order of magnitude more tokens than a single-agent equivalent. The lead pays for planning, every subagent pays for its own context and tools, and every result returns through the lead for synthesis. That is the bill.

The bill is also the feature. The orchestrator is not doing the cognitive work. It is the part that knows what work has been done. Compare that with what you get for free in choreography: nothing. Every event is its own atom. There is no place where "the run" exists. Reconstructing it after the fact is what those extra tokens are buying you up front.
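The bill itself is easy to approximate on the back of an envelope. The sketch below is a rough cost model under loud assumptions, not a measurement: the TokenModel fields mirror the three costs named above (planning, per-subagent context, results returning through the lead), and every number fed into it is invented purely to show the shape of the sum.

```typescript
// Rough token model for a lead plus N subagents. All fields are hypothetical inputs.
type TokenModel = {
  planning: number;        // lead: tokens spent deciding the split
  perAgentContext: number; // each subagent: prompt, tools, working context
  perAgentResult: number;  // each subagent: tokens it writes back as its result
  synthesis: number;       // lead: tokens spent writing the final answer
};

function multiAgentTokens(m: TokenModel, agents: number): number {
  const subagentCost = agents * (m.perAgentContext + m.perAgentResult);
  // Results are paid for twice: once when produced, once when the lead reads them.
  const leadReads = agents * m.perAgentResult;
  return m.planning + subagentCost + leadReads + m.synthesis;
}

// Illustrative numbers only; plug in your own.
const model: TokenModel = { planning: 2_000, perAgentContext: 8_000, perAgentResult: 1_000, synthesis: 3_000 };
const singleAgentTokens = 5_000;

const multi = multiAgentTokens(model, 5);
console.log(multi);                     // 55000
console.log(multi / singleAgentTokens); // 11
```

The per-subagent context term dominates, which is why the multiple scales with the number of specialists and not with the difficulty of the task.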

In domains where every decision needs an audit trail (finance, healthcare, anything the regulator can subpoena) the calculation is forced. You cannot ship a system whose answer to "which agent introduced the error" is a join across five event logs and a vibe. Orchestration is what you reach for, and the token tax is what you pay.

The hybrid that ships

Pure orchestration loses on latency. Every handoff is a synchronous round-trip through the centre. For research-style work that is fine, since the reasoning is the work and the time is dominated by model calls anyway. For tasks with a wide fan-out of independent subtasks it is wasteful.

The shape that has settled in production is neither of the two extremes. It is an orchestrator at the trunk, with bounded local meshes at the leaves.

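// Trunk: an orchestrator that decides strategy, not tactics.
// `research`, `Query` and `Answer` are placeholder names; `planner` and
// `synthesise` are assumed to be defined elsewhere.
async function research(query: Query): Promise<Answer> {
  const subqueries = await planner.split(query);

  // Branches are independent of each other, so they run concurrently.
  const branchResults = await Promise.all(
    subqueries.map((q) => runBranch(q)),
  );

  return synthesise(branchResults);
}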

// Leaf: peers may talk freely; the gate is the only way out.
async function runBranch(q: Subquery): Promise<BranchResult> {
  const local = mesh([searcher, factChecker, tagger]);
  return await local.runUntilGate(q, {
    maxRounds: 4,
    tokenBudget: 50_000,
  });
}

The orchestrator decides the strategy: split this query, hand each subquery to a branch, synthesise the results. Each branch has internal freedom. A searcher and a fact-checker can chat directly, because they share a context and the blast radius is bounded by the gate. Phase gates at the edge of each branch do the same job that compensating transactions do in sagas: they let the system decide, at known points, whether to continue, retry, or escalate. The agents inside a phase can be as chatty as they like. The gate is where the run becomes legible again.
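The leaf code leans on a mesh(...).runUntilGate(...) helper that is never shown. Below is one possible shape for it, as a hedged sketch: it is written here as a plain synchronous function so the control flow stays visible, whereas real peers would be awaited model calls. The Peer, Gate, and BranchResult types are assumptions made for this example.

```typescript
// Hypothetical, synchronous stand-in for `mesh(...).runUntilGate(...)`.
type Message = { from: string; text: string; tokens: number };
type Peer = { name: string; act: (history: readonly Message[]) => Message | null };
type Gate = (history: readonly Message[]) => string | null; // a result, or null to keep going
type Limits = { maxRounds: number; tokenBudget: number };
type BranchResult =
  | { status: "done"; result: string; rounds: number; tokensUsed: number }
  | { status: "escalate"; reason: "maxRounds" | "tokenBudget"; rounds: number; tokensUsed: number };

function runUntilGate(peers: Peer[], gate: Gate, limits: Limits): BranchResult {
  const history: Message[] = [];
  let tokensUsed = 0;

  for (let round = 1; round <= limits.maxRounds; round++) {
    // Inside a round every peer sees everything said so far: the free mesh.
    for (const peer of peers) {
      const msg = peer.act(history);
      if (msg === null) continue;
      tokensUsed += msg.tokens;
      if (tokensUsed > limits.tokenBudget) {
        return { status: "escalate", reason: "tokenBudget", rounds: round, tokensUsed };
      }
      history.push(msg);
    }
    // The gate is the only exit that carries a result out of the branch.
    const result = gate(history);
    if (result !== null) return { status: "done", result, rounds: round, tokensUsed };
  }
  return { status: "escalate", reason: "maxRounds", rounds: limits.maxRounds, tokensUsed };
}

// Toy peers: the searcher speaks once, the fact-checker replies once it has a source.
const searcher: Peer = {
  name: "searcher",
  act: (h) => (h.some((m) => m.from === "searcher") ? null : { from: "searcher", text: "source found", tokens: 1_200 }),
};
const factChecker: Peer = {
  name: "factChecker",
  act: (h) =>
    h.some((m) => m.from === "searcher") && !h.some((m) => m.from === "factChecker")
      ? { from: "factChecker", text: "verified", tokens: 800 }
      : null,
};
const gate: Gate = (h) => (h.some((m) => m.text === "verified") ? "claim verified" : null);

const ok = runUntilGate([searcher, factChecker], gate, { maxRounds: 4, tokenBudget: 50_000 });
const starved = runUntilGate([searcher, factChecker], gate, { maxRounds: 4, tokenBudget: 1_500 });
const stuck = runUntilGate([searcher, factChecker], () => null, { maxRounds: 4, tokenBudget: 50_000 });
console.log(ok.status, starved.status, stuck.status); // done escalate escalate
```

The two escalate outcomes are what the trunk needs from a leaf: a branch that ran out of rounds or budget says so explicitly, and the orchestrator decides whether to retry it or give up.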

Microsoft's agent orchestration patterns documentation names five primitives (sequential, concurrent, group chat, handoff, magentic) but the underlying split is the same. Sequential, concurrent, and handoff are flavours of orchestration. Group chat and magentic are bounded meshes. There is no entry called "free-form choreography". The pattern did not survive the move from demo to production.

Where this leaves us

Reach for choreography when the agents are genuinely peers, the task is parallel and read-mostly, and a single false belief cannot poison a downstream agent's prompt. Reach for orchestration in every other case. When in doubt, orchestrate. The token tax is the price of being able to answer "what just happened" out loud.