claude-code·9 min read·15 May 2026

Multi-agent Claude Code orchestration patterns for UK indie hackers (2026)

One agent is enough until it is not. The moment your build has a planning step, a building step, and a review step, you want orchestration. Here is the UK indie hacker playbook for multi-agent Claude Code: the five patterns that earn their keep, the ones that do not, and a weekend morning-brief worker pool you can ship.

By IdeaStack

There is a moment in every Claude Code build where one agent stops being enough. You can feel it. The session is doing four jobs at once (researching, planning, writing, reviewing) and the quality of each is dropping because none of them has its own context. The agent is wearing the planner's hat, the builder's hat, the editor's hat, and the QA hat all at the same time, and it is starting to lose the thread.

That is the moment multi-agent orchestration earns its keep. Not before. The thing nobody tells UK indie hackers is that 90% of Claude Code work is one-shot and should stay one-shot — multi-agent is overkill for "fix this bug" and a distraction for "add this Zod schema". But the remaining 10% — content pipelines, research scans, ops automations, anything with genuinely independent subtasks — is where you stop adding agents one at a time and start designing how they talk to each other. This post is the playbook for that.

The building block: orchestrator and subagent

Every multi-agent Claude Code setup is built from one pattern: an orchestrator dispatches to subagents via the Task tool. The orchestrator holds the plan. The subagents do the work. The orchestrator sees only the subagent's final output, not its scratch work — which is the whole point. Each subagent gets a fresh context window and burns its own tokens, leaving the orchestrator's window clean.

The minimal shape with the Claude Agent SDK in TypeScript:

import { query } from "@anthropic-ai/claude-agent-sdk";

const response = query({
  prompt: `
    You are an orchestrator.
    Dispatch one subagent to summarise notes/topic-a.md
    and another to summarise notes/topic-b.md.
    Combine their outputs into a single brief at notes/brief.md.
  `,
  options: {
    model: "claude-opus-4-7",
    allowedTools: ["Task", "Read", "Write"],
    maxTurns: 8,
  },
});

for await (const message of response) {
  if (message.type === "assistant") console.log(message.message.content);
}

The agent decides when to dispatch a subagent, what brief to hand it, and what to do with the result. You did not write a fan-out loop. You described the work and gave the agent the Task tool.

For everything below, that orchestrator-and-subagent shape is the lego brick. The patterns are different ways of stacking it.

Pattern 1: planner and builder

The single highest-value pattern for a UK indie hacker. One subagent plans the work in detail (no code, no edits — pure reasoning). A second subagent executes the plan (no reasoning, just implementation). The orchestrator hands the plan from the first to the second.

Why it works: planning and building need different headspaces. A bloated planner-builder hybrid will start writing code before the plan is sound, then realise mid-build that the plan was wrong, then patch the plan, then patch the build, then run out of context. Splitting them gives each subagent one job and a clean window.

A bash version with claude -p:

# Step 1: planner produces a plan
PLAN=$(claude -p \
  --allowedTools "Read,Grep,Glob" \
  --max-turns 6 \
  --output-format json \
  "Read CLAUDE.md and src/, then produce a detailed step-by-step plan for adding Stripe subscription billing. Do not write code." \
  | jq -r '.result')

# Step 2: builder executes the plan
echo "$PLAN" | claude -p \
  --permission-mode acceptEdits \
  --allowedTools "Read,Write,Edit,Bash" \
  --max-turns 30 \
  "Execute this plan exactly. Do not deviate."

The planner is sharp because it only has to plan. The builder is sharp because it only has to build. The token total is usually lower than one mega-session, and the work is better.
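The same hand-off can be sketched in TypeScript without tying it to a particular runner. This is a minimal sketch, not the SDK's API: AgentRunner is a hypothetical function type that you would back with the SDK's query() or a claude -p child process, and the tool lists simply mirror the bash version above.

```typescript
// Hypothetical runner signature (an assumption, not an SDK type): takes a
// prompt plus a tool allowlist and resolves to the agent's final text.
type AgentRunner = (prompt: string, allowedTools: string[]) => Promise<string>;

const PLANNER_TOOLS = ["Read", "Grep", "Glob"]; // read-only: the planner cannot edit
const BUILDER_TOOLS = ["Read", "Write", "Edit", "Bash"];

async function planThenBuild(run: AgentRunner, task: string): Promise<string> {
  // Step 1: the planner sees only the task and read-only tools.
  const plan = await run(
    `Produce a detailed step-by-step plan for: ${task}. Do not write code.`,
    PLANNER_TOOLS,
  );
  if (plan.trim() === "") {
    throw new Error("Planner returned an empty plan; refusing to start the builder.");
  }
  // Step 2: the builder receives the plan as its whole brief, in a fresh context.
  return run(`Execute this plan exactly. Do not deviate.\n\n${plan}`, BUILDER_TOOLS);
}
```

Injecting the runner keeps the orchestration logic testable with a stub, and the empty-plan check stops a failed planner from handing the builder nothing to build.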

Pattern 2: round-robin reviewer

Three subagents grade the same output against the same rubric, then the orchestrator takes the majority vote (or the median score). The idea is borrowed from how LLM evals are run in production, and it is surprisingly useful for solo founders who do not have a second pair of eyes.

import { query } from "@anthropic-ai/claude-agent-sdk";

const orchestratorPrompt = `
  Read draft.md. Dispatch three subagents in parallel, each with this brief:
  "Grade this draft from 0-10 on clarity, usefulness, and UK-builder fit.
   Return only the three numbers as JSON."
  When all three have replied, return the median score for each dimension
  and a one-paragraph summary of where they agreed and disagreed.
`;

const response = query({
  prompt: orchestratorPrompt,
  options: {
    model: "claude-opus-4-7",
    allowedTools: ["Task", "Read"],
    maxTurns: 10,
  },
});

For a UK indie hacker shipping blog posts, landing-page copy, or research reports, a round-robin grader is the closest thing to a free editorial team. Three Sonnet subagents grading a draft costs about 2p to 5p per review at current pricing.
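If you would rather compute the median in code than trust the orchestrator's arithmetic, the aggregation step is small. A minimal sketch, assuming each grader replies with a JSON object whose keys are clarity, usefulness, and fit; those key names are an assumption, since the brief above only asks for "three numbers as JSON".

```typescript
interface Grade {
  clarity: number;
  usefulness: number;
  fit: number;
}

// Parse one grader reply; reject anything that is not a number in 0-10.
function parseGrade(raw: string): Grade {
  const parsed = JSON.parse(raw) as Record<string, unknown>;
  const pick = (key: string): number => {
    const value = parsed[key];
    if (typeof value !== "number" || value < 0 || value > 10) {
      throw new Error(`Grader reply has an invalid "${key}" score`);
    }
    return value;
  };
  return { clarity: pick("clarity"), usefulness: pick("usefulness"), fit: pick("fit") };
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Median per dimension, so one harsh or generous grader cannot swing the score.
function medianGrade(grades: Grade[]): Grade {
  return {
    clarity: median(grades.map((g) => g.clarity)),
    usefulness: median(grades.map((g) => g.usefulness)),
    fit: median(grades.map((g) => g.fit)),
  };
}

const replies = [
  '{"clarity":7,"usefulness":8,"fit":6}',
  '{"clarity":9,"usefulness":6,"fit":6}',
  '{"clarity":8,"usefulness":7,"fit":3}',
];
console.log(medianGrade(replies.map(parseGrade)));
// { clarity: 8, usefulness: 7, fit: 6 }
```

Note how the outlier fit score of 3 does not drag the result down, which is the reason to prefer the median over the mean with only three graders.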

Pattern 3: fan-out, fan-in

The pattern for content and research pipelines. One topic list goes in. The orchestrator fans out a subagent per topic. Each subagent researches its topic. Their outputs fan back in to a synthesiser subagent, which produces the final brief. Then optionally a writer subagent that turns the brief into prose.

In bash with claude -p and xargs:

#!/usr/bin/env bash
set -euo pipefail

mkdir -p research

# Fan out: one worker per topic in parallel
cat topics.txt | xargs -P 5 -I {} bash -c '
  topic="$1"
  slug=$(echo "$topic" | tr " " "-")
  claude -p \
    --allowedTools "WebFetch,Read" \
    --max-turns 5 \
    --output-format json \
    "Research this topic for a UK audience: $topic. Return 5 bullets with sources." \
    | jq -r ".result" > "research/$slug.md"
' _ {}

# Fan in: synthesiser
cat research/*.md | claude -p \
  --allowedTools "" \
  --max-turns 4 \
  "Synthesise these research notes into a single 600-word UK morning brief. Markdown output."  > brief.md

The -P 5 on xargs says "run up to five workers in parallel". For five topics, the fan-out stage finishes in roughly the time it takes the slowest single worker to run, not the sum of all five; the synthesiser then adds one more call on top. That is the unlock.
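If the orchestrator lives in TypeScript rather than bash, the equivalent of xargs -P is a small bounded worker pool. A minimal sketch: mapWithConcurrency is a hypothetical helper, not part of the SDK, and the worker you pass in would wrap whatever actually runs one research subagent.

```typescript
// Run `worker` over every item with at most `limit` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new Error("limit must be at least 1");
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane keeps pulling the next unclaimed item until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

With this shape the first worker error rejects the whole call, much as set -e would stop the script. If you would rather keep partial results, catch inside the worker and return an error marker instead.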

Pattern 4: the long-running daemon

A loop that polls a queue, pulls one job, dispatches an orchestrator, marks the job done, sleeps, repeats. Less a pattern than a wrapper — the inside is any of the patterns above — but it is how you turn a one-shot pipeline into a service.

import { query } from "@anthropic-ai/claude-agent-sdk";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!
);

async function processOne() {
  // maybeSingle() yields null data, rather than an error, when the queue is empty
  const { data: job } = await supabase
    .from("jobs")
    .select("*")
    .eq("status", "pending")
    .limit(1)
    .maybeSingle();

  if (!job) return false;

  const response = query({
    prompt: `Process job ${job.id}: ${job.brief}`,
    options: {
      model: "claude-opus-4-7",
      allowedTools: ["Task", "Read", "Write", "WebFetch"],
      maxTurns: 12,
    },
  });

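  // Capture the final result text. Assistant message content is an array of
  // content blocks, so calling toString() on it would not give readable text.
  let final = "";
  for await (const m of response) {
    if (m.type === "result" && m.subtype === "success") final = m.result;
  }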

  await supabase
    .from("jobs")
    .update({ status: "done", result: final })
    .eq("id", job.id);

  return true;
}

while (true) {
  const did = await processOne();
  await new Promise((r) => setTimeout(r, did ? 1_000 : 30_000));
}

Wrap that in a Railway worker or a pm2 process on your UK desktop and you have an ops automation that keeps running while you sleep. Same shape works for newsletter generation, batch summarisation, support-ticket triage, or content moderation.

When this is overkill

Most Claude Code work is one-shot. A bug fix, a single-file edit, a schema change, a refactor of one route — all of these are happier as a single session than a multi-agent dance. The honest rule of thumb: if you would not split the work across two human contractors, do not split it across two agents.

Signs you do need multi-agent:

The work has genuinely parallel subtasks (research over 10 topics, summarise 50 PDFs).
Planning and execution are eating each other's tokens in a single session.
You want a fresh-context reviewer or grader on top of the build.
The job runs unattended and you want isolation between steps.

Signs you do not:

The task fits in one 10k-token session.
The "subagents" would all need the same context anyway.
You are doing it because the architecture is fun, not because the build needs it.

UK builder angle: cost, hosting, context budgets

The honest pricing shape, with current Opus and Sonnet prices in mind:

Single agent, 10k context: roughly 5p to 15p per run on Opus, less on Sonnet.
Orchestrator plus three subagents, 10k each: roughly 15p to 50p per run.
A morning daemon doing five fan-out and one fan-in, weekdays for a month: GBP 5 to GBP 20.

That is the trade-off. You pay for the parallel workers, but you usually save on the orchestrator's window because it never has to hold everyone's scratch work.

Context budget per subagent, as a working number: target under 10k tokens per worker. If a subagent needs more than that, you have given it the wrong brief, so break it down further or fold it back into the orchestrator. (For why context windows filled beyond 80% turn into quality cliffs, see the memory and context management guide.)
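A rough pre-flight check on that budget can catch an oversized brief before you pay for it. A minimal sketch, assuming about four characters per token for English prose; that ratio is a rule of thumb for sizing, not the real tokenizer, and checkWorkerInputs is a hypothetical helper name.

```typescript
const WORKER_TOKEN_BUDGET = 10_000; // the working number from the text above

// Assumption: roughly four characters per token for English prose.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Sum the brief and every input the worker will be handed, then compare
// the estimate against the budget.
function checkWorkerInputs(
  brief: string,
  inputs: string[],
  budget: number = WORKER_TOKEN_BUDGET,
): { tokens: number; withinBudget: boolean } {
  const tokens = [brief, ...inputs].reduce((sum, text) => sum + estimateTokens(text), 0);
  return { tokens, withinBudget: tokens <= budget };
}
```

This only counts what you hand the worker. Tool results and the worker's own output also land in its window, so treat a brief that is already close to the budget as too big.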

Hosting: a UK Windows PC with the SDK or Claude Code CLI installed handles solo-founder workloads for free. Step up to a Railway worker (~GBP 5/mo) when you want it always-on. Vercel cron for daily fan-out jobs that finish in under a minute. Anything heavier wants a dedicated VPS.

What can go wrong, and how to plan for it

Orchestrator drift. A long-running orchestrator session can fill its own context with subagent outputs and lose the thread. Mitigation: keep orchestrator runs short, have it write intermediate state to disk, and restart between orchestrator runs.

Subagent runaway. A subagent that decides to dispatch its own subagents can fan out wider than you intended and burn tokens fast. Mitigation: do not rely on the wording of the brief alone; leave the Task tool out of each worker's tool list. A worker configured with allowedTools: ["Read", "Write"] and no "Task" has no way to dispatch further subagents.

Coordination overhead. Past about five concurrent workers, the orchestrator spends more tokens managing than the workers spend doing. Mitigation: chain orchestrators rather than widening one. Three workers per orchestrator, three orchestrators in series, beats nine workers in parallel for most jobs.
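Chaining rather than widening comes down to splitting the job list into small batches and running one orchestrator per batch, in series. A minimal sketch: runBatch is a hypothetical callback standing in for one orchestrator run over up to three jobs.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run one orchestrator per batch, strictly one after another.
async function runInBatches<T, R>(
  items: T[],
  batchSize: number,
  runBatch: (batch: T[]) => Promise<R>,
): Promise<R[]> {
  const outputs: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    outputs.push(await runBatch(batch)); // the next orchestrator starts only when this one ends
  }
  return outputs;
}
```

Nine jobs with a batch size of three gives three orchestrator runs of three workers each, which is the shape described above.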

Rate limits. Anthropic will rate-limit parallel fan-out, especially on a new key. Mitigation: cap parallelism (-P 3 not -P 20), implement exponential backoff in the orchestrator, and watch for HTTP 429s in your logs.
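The backoff half of that mitigation is a few lines. A minimal sketch, assuming a rate-limited call rejects with an error object carrying status: 429; that error shape is an assumption, so adapt the check to whatever your runner actually throws. withBackoff and backoffDelayMs are hypothetical helper names.

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, and so on, capped.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `call` on rate-limit errors only, waiting longer after each failure.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 5,
  baseMs = 1_000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Assumption: rate-limit errors expose a numeric `status` of 429.
      const status = (err as { status?: number } | null)?.status;
      // Give up on anything that is not a rate limit, or once retries run out.
      if (status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

With the defaults the waits run 1s, 2s, 4s, 8s, 16s before the sixth failure is rethrown. Adding a little random jitter to each delay is a common refinement when several workers hit the limit at the same moment.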

Cost shock. First time you run a fan-out, the bill jumps. Mitigation: hard daily spend cap on the Anthropic key, set in the console before the first cron run. GBP 5/day for a solo founder is plenty of headroom; GBP 50 is the wrong order of magnitude.

The weekend build: a morning brief worker pool

A concrete UK weekend project: a morning brief generator that fans out one researcher subagent per topic from a daily topics list, fans the results into a synthesiser, and emails you the digest at 7am.

The shape:

topics.txt with five topics — say, "UK fintech funding this week", "Companies House notable filings", "UK indie hacker launches on Reddit", "interesting GitHub repos from UK accounts", "BBC tech headlines".
A bash or TypeScript orchestrator that fans out five claude -p (or SDK query) workers in parallel.
A synthesiser subagent that merges the five outputs into one 600-word brief.
A Resend or Gmail call to email the brief to your inbox.
A cron entry at 06:55 weekdays.

Token budget: five workers at ~5k tokens each plus a synthesiser at ~8k. Roughly 10p to 25p per run, GBP 2 to GBP 5 per month on Sonnet. Hosting cost: zero if it runs on your home PC, GBP 5/mo on Railway. Setup time: a weekend afternoon.
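The monthly figure is just the per-run range multiplied by the number of runs. A quick worked version, assuming 22 weekday runs in a month (the count is an assumption; the per-run pence range is the one quoted above):

```typescript
interface PenceRange {
  low: number;
  high: number;
}

// Convert a per-run cost range in pence into a monthly range in pounds.
function monthlyCostGbp(perRun: PenceRange, runsPerMonth: number): { low: number; high: number } {
  return {
    low: (perRun.low * runsPerMonth) / 100,
    high: (perRun.high * runsPerMonth) / 100,
  };
}

const WEEKDAY_RUNS = 22; // assumption: one run per weekday in a typical month
const estimate = monthlyCostGbp({ low: 10, high: 25 }, WEEKDAY_RUNS);
console.log(estimate); // { low: 2.2, high: 5.5 }
```

That lands at roughly GBP 2.20 to GBP 5.50, in line with the range quoted above. Swap in your own per-run numbers once the first week of real bills arrives.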

When the first brief lands on Monday at 7am — written by an orchestrator and five workers you wrote on Sunday — multi-agent stops being theoretical.

If you want a UK opportunity that fits this orchestrator-and-worker shape — a niche where multi-agent automation turns into a defensible product — start with this week's free report. The research is done, the patterns are yours, the build is the weekend.

Frequently asked

When does multi-agent orchestration actually earn its keep?

When you have genuinely independent subtasks that can run in parallel, when you want a separation between planning and execution, or when you want a fresh-context reviewer. For a single file edit or a one-shot script, multi-agent is overkill. The honest rule of thumb: if you would not split the work across two human contractors, do not split it across two agents.

How do subagents share context with the orchestrator?

They mostly do not, and that is the point. Each subagent gets a fresh context window and the brief you hand it. Results come back through the orchestrator's Task tool response. This isolation is exactly what makes multi-agent setups cheaper and more reliable - each worker stays under 10k tokens instead of one bloated 80k session.

Does each subagent cost extra in tokens?

Yes - each subagent has its own context window and bills separately on your Anthropic API key. The trade-off is that five 10k-token workers usually beat one 80k-token mega-session on both quality and total cost, because the mega-session reloads the full history on every turn. Parallelism pays for itself if you size the briefs tightly.

Can I run multi-agent orchestration with claude -p alone, or do I need the SDK?

Both work. For ad-hoc fan-out you can script several claude -p calls in parallel from bash. For anything stateful, recurring, or with shared memory, the Claude Agent SDK is cleaner because you control the loop in code. Most UK indie hackers start with bash and migrate to the SDK once the orchestrator outgrows it.

How many subagents is too many?

Past about five concurrent workers, coordination overhead eats the gains. Quality drops, debugging gets painful, and you start hitting Anthropic rate limits. For a UK indie hacker building solo, three to five workers per orchestrator run is the sweet spot. If you need more, chain orchestrators rather than widening one.
