Vercel WDK + Claude Code: durable workflows for UK indie hackers (2026)
Vercel's open-source Workflow Development Kit turns ordinary async functions into durable, crash-safe workflows. Here is how to use it as an orchestration layer for Claude Code agents - and when it beats systemd timers or Anthropic Routines for a UK indie hacker.

A real story. A UK indie hacker is running a daily research-then-publish agent on a Vercel cron. The agent calls Claude Code to pull source data, then again to write the post, then again to review it, then posts to the CMS. Most days it works. Some days the second Claude call times out, the third never fires, and the cron tries again the next day with no state. You either babysit it or you redesign it.
The redesign now has a clean answer. Vercel shipped the Workflow Development Kit (WDK) as open-source public beta - two directives that turn ordinary TypeScript async functions into durable workflows. Each step is checkpointed automatically. The workflow can pause for minutes or months, survive deployments and crashes, and resume exactly where it stopped. For a UK indie hacker orchestrating Claude Code agents on Vercel, this is the missing layer between a flaky cron and a paid orchestration platform.
This post is what WDK actually is, when it beats systemd timers or Anthropic Routines for an indie hacker, the GBP cost line on a Vercel Pro plan, and a worked example of a 3-step Claude Code workflow with retries and a human-in-the-loop pause.
What WDK is, in plain English
Two TypeScript directives. The first marks an async function as a workflow. The second marks individual blocks inside it as durable steps. WDK takes care of the queueing, retry, persistence, and crash recovery underneath. You write code that looks like ordinary async code.

```typescript
"use workflow";
import { step } from "workflow";

// pullSources and callClaude are your own helpers (not shown here).
export async function research(topic: string) {
  const data = await step("pull-sources", async () => {
    return await pullSources(topic);
  });

  const draft = await step("write-draft", async () => {
    return await callClaude("Write a draft about: " + JSON.stringify(data));
  });

  const review = await step("review-draft", async () => {
    return await callClaude("Review this draft: " + draft);
  });

  return { draft, review };
}
```
Each step is checkpointed to durable storage. If the function instance dies after pull-sources but before write-draft, the next invocation resumes at write-draft with the cached data already available. If write-draft fails, the retry runs only write-draft again.
That last property is the entire point for Claude Code agents. A research-write-review flow is three model calls plus tool calls. Each one can fail, time out, or hit a rate limit. Without durability, a mid-flow failure costs you all the tokens already spent on earlier steps. With WDK, you pay each step once.
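To make the pay-once property concrete, here is a toy in-memory model of per-step checkpointing. It is a sketch of the idea only, not WDK's implementation or API: `CheckpointStore`, `runStep`, and the simulated failure are all hypothetical, and a real run would persist to durable storage rather than a `Map`.

```typescript
// Toy model of per-step checkpointing. Not WDK code: all names are illustrative.
type CheckpointStore = Map<string, unknown>;

// Runs `fn` once per step name; later calls return the stored result instead.
function runStep<T>(store: CheckpointStore, name: string, fn: () => T): T {
  if (store.has(name)) {
    return store.get(name) as T; // checkpoint hit: skip the work
  }
  const result = fn();
  store.set(name, result); // checkpoint only after the step succeeds
  return result;
}

// Counts how often each step body actually executes.
const calls = { pull: 0, draft: 0 };
let failDraftOnce = true;

function workflow(store: CheckpointStore): string {
  const data = runStep(store, "pull-sources", () => {
    calls.pull += 1;
    return ["source-a", "source-b"];
  });
  return runStep(store, "write-draft", () => {
    calls.draft += 1;
    if (failDraftOnce) {
      failDraftOnce = false;
      throw new Error("simulated timeout");
    }
    return `draft from ${data.length} sources`;
  });
}

const store: CheckpointStore = new Map();
let output = "";
try {
  output = workflow(store); // first attempt fails inside write-draft
} catch {
  output = workflow(store); // resume: pull-sources is served from the store
}
console.log(output, calls);
```

After the resume, the pull step has executed once and the draft step twice, which is the behaviour described above: only the failed step is paid for again.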
When WDK beats systemd timers or Routines
The Linux scheduler decision matrix covered single-shot scheduled jobs. For multi-step orchestration, the picture is different.
| Dimension | systemd timers | Anthropic Routines | Vercel WDK |
|---|---|---|---|
| Best for | Single-shot scheduled claude -p | Cloud-hosted single jobs | Multi-step workflows |
| Checkpointed per-step | No | No (whole-Routine retry) | Yes |
| Pause-resume mid-flow | No | No | Yes (minutes to months) |
| Human-in-the-loop wait | No | No | Yes (suspend until webhook) |
| Where it runs | Your VPS | Anthropic cloud | Vercel function |
| Monthly cost (orchestrator) | GBP 0 (+ VPS) | Pool B usage | Included in Pro at GBP 16 |
| Setup time | 10 minutes | 5 minutes (console) | 20 minutes (Next.js project) |
| Survives Vercel redeploy | N/A | N/A | Yes |
Three jobs are clearly WDK-shaped:
- Multi-step Claude agent runs - research to write to review to publish, where each step is a separate model call. WDK checkpoints between, you pay each step once.
- Human-in-the-loop pauses - workflow runs to a checkpoint, suspends, waits for a webhook (human approval, payment confirmation, external service callback), resumes. Could be minutes, could be days. WDK persists state across the gap.
- Crash-safe agent loops on Vercel - your Vercel deployment redeploys five times a day during heavy development. Without WDK, every in-flight workflow dies on redeploy. With WDK, every in-flight workflow survives and resumes.
Two jobs are clearly NOT WDK-shaped:
- Single-shot scheduled jobs - a 6:55 daily brief that is one `claude -p` call. systemd timer on a Hetzner VPS is two files, GBP 3.30/mo, done. WDK is overkill.
- Real-time interactive agents - a Slack bot that responds in 2 seconds. The whole workflow runs inside a single function invocation, durability buys nothing.
The GBP cost line on Vercel Pro
The honest comparison. A typical indie hacker workflow has 3-5 Claude steps, runs once or twice a day, and totals maybe 50 runs a month. The cost shape:
WDK orchestration cost: GBP 0. The two directives compile to standard Vercel function calls. Each step is a function invocation. A Pro plan at GBP 16/mo includes 1 million invocations and 1,000 GB-hours of compute. Fifty multi-step runs use a few hundred invocations a month at most.
Claude spend (Pool B from 15 June): the same model-call cost regardless of orchestrator. A Sonnet research-write-review flow at ~15k tokens total is roughly 9p per run. Fifty runs a month is GBP 4.50. The credit split guide covers what changes on 15 June.
Durable storage: also included in Pro. Step checkpoints are small JSON blobs, well under the included allowance.
Total: GBP 16 (Vercel Pro, you almost certainly already pay this) + GBP 4.50 (Claude spend) = GBP 20.50/mo for a fully durable, crash-safe, multi-step agent orchestrator. Compare with paid orchestration platforms (Temporal Cloud, Inngest, Restate) that start at USD 99/mo.
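The arithmetic behind that total can be sketched as below. The inputs are this post's own estimates rather than quoted prices, and the invocation count assumes one function invocation per step plus one to start the run, which is an assumption for illustration. Amounts are kept in integer pence to avoid floating-point drift.

```typescript
// Monthly cost estimate in integer pence. Inputs are estimates, not price quotes.
interface CostInputs {
  planPence: number;         // Vercel Pro, GBP 16
  claudePencePerRun: number; // roughly 9p for a ~15k-token Sonnet flow
  runsPerMonth: number;
  stepsPerRun: number;
}

function monthlyCost(c: CostInputs) {
  const claudePence = c.claudePencePerRun * c.runsPerMonth;
  return {
    claudePence,
    totalPence: c.planPence + claudePence,
    // Assumption: one invocation per step, plus one to start the run.
    invocations: c.runsPerMonth * (c.stepsPerRun + 1),
  };
}

const gbp = (pence: number): string => `GBP ${(pence / 100).toFixed(2)}`;

const estimate = monthlyCost({
  planPence: 1600,
  claudePencePerRun: 9,
  runsPerMonth: 50,
  stepsPerRun: 4,
});

console.log(gbp(estimate.claudePence), gbp(estimate.totalPence), estimate.invocations);
```

With these inputs the Claude line is GBP 4.50, the total is GBP 20.50, and the run count works out to 250 invocations, far below the plan allowance.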
WDK does not add a multiplier. The cost is whatever Claude burns, which would land in Pool B from any orchestrator. The orchestrator itself is free.
A worked example: 3-step research-write-publish workflow
The full pattern, end to end. A daily workflow that pulls source data on a topic, drafts a post, waits for human approval, and publishes. Each step is durable. The approval wait can take an hour or a week.

```typescript
"use workflow";
import { step, suspend } from "workflow";
import { execFile } from "child_process";
import { promisify } from "util";

const execFileAsync = promisify(execFile);

// pullSources and publishToCMS are your own helpers (not shown here).

// Runs Claude Code headlessly. execFile passes the prompt as a single argument
// without a shell, so quotes, $ and backticks in the prompt are not interpreted.
async function callClaude(prompt: string): Promise<string> {
  const { stdout } = await execFileAsync("claude", ["-p", prompt], {
    maxBuffer: 10 * 1024 * 1024, // allow up to 10 MB of output
  });
  return stdout;
}

export async function dailyContentRun(topic: string) {
  // Step 1: pull source data. Retried automatically on transient failure.
  const sources = await step("pull-sources", async () => {
    return await pullSources(topic);
  }, { retries: 3 });

  // Step 2: ask Claude to draft. Checkpoint stores the draft.
  const draft = await step("write-draft", async () => {
    const prompt = `Write a 1,500-word post about ${topic}. Sources: ${JSON.stringify(sources)}`;
    return await callClaude(prompt);
  }, { retries: 2 });

  // Pause: human approval. Workflow suspends until the webhook hits.
  const approval = await suspend("await-approval", {
    timeoutMs: 7 * 24 * 60 * 60 * 1000, // 7 days
  });
  if (!approval.approved) {
    return { status: "rejected", reason: approval.reason };
  }

  // Step 3: publish to the CMS. Idempotent so retries are safe.
  const published = await step("publish", async () => {
    return await publishToCMS({
      title: approval.title || topic,
      body: draft,
    });
  }, { retries: 3 });

  return { status: "published", url: published.url };
}
```
What this gives you. The pull-sources call retries three times on transient failure (network blip, source API throttle). The write-draft call retries twice - transient Anthropic 5xx errors usually clear on a later attempt, and that attempt does not re-run pull-sources. The suspend call halts the workflow until your approval webhook fires; the workflow occupies no compute while suspended, and the state is durable for the full 7-day window. If Vercel redeploys between steps, the next invocation picks up at the right step with all earlier state intact.
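For readers who want to see what a retry budget means in code, here is a generic retry helper with exponential backoff. It is a sketch of the general technique, not how WDK implements its retries: `withRetries`, the backoff schedule, and the reading of "retries" as extra attempts after the first are all assumptions.

```typescript
// Generic retry helper with exponential backoff. Illustrative only.
async function withRetries<T>(
  fn: () => Promise<T>,
  retries: number,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  // One initial attempt plus up to `retries` further attempts.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // budget exhausted: surface the last failure
}

// Usage: a call that fails twice, then succeeds on the third attempt.
let attempts = 0;
const result = withRetries(async () => {
  attempts += 1;
  if (attempts < 3) throw new Error("simulated 5xx");
  return "draft";
}, 2, 0);

result.then((value) => console.log(value, attempts));
```

A budget of two retries is just enough for a call that fails twice. A third consecutive failure would surface the error to the caller.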
The webhook handler is a standard Next.js API route that resumes the suspended workflow:
```typescript
// app/api/approve/route.ts
import { resume } from "workflow";

export async function POST(req: Request) {
  const { runId, approved, title, reason } = await req.json();
  await resume(runId, { approved, title, reason });
  return Response.json({ ok: true });
}
```
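The handler above trusts whatever JSON arrives. A minimal sketch of validating the body before resuming a run might look like this. The field names follow the handler, but `ApprovalPayload`, `parseApproval`, and the validation rules are hypothetical, and a production route would also want some form of authentication.

```typescript
// Shape the approval webhook is expected to send (field names from the handler).
interface ApprovalPayload {
  runId: string;
  approved: boolean;
  title?: string;
  reason?: string;
}

// Type guard for optional string fields.
const isOptionalString = (v: unknown): v is string | undefined =>
  v === undefined || typeof v === "string";

// Returns a typed payload, or null if the body is malformed.
function parseApproval(body: unknown): ApprovalPayload | null {
  if (typeof body !== "object" || body === null) return null;
  const { runId, approved, title, reason } = body as Record<string, unknown>;
  if (typeof runId !== "string" || runId === "") return null;
  if (typeof approved !== "boolean") return null;
  if (!isOptionalString(title) || !isOptionalString(reason)) return null;
  return { runId, approved, title, reason };
}

// Usage inside the route (sketch):
//   const payload = parseApproval(await req.json());
//   if (!payload) return Response.json({ ok: false }, { status: 400 });
```

Rejecting malformed bodies with a 400 keeps a stray or malicious request from resuming a run with garbage state.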
The trigger that fires the workflow daily is a Vercel cron pointed at an API route that starts the workflow:
```typescript
// app/api/cron/route.ts
import { dailyContentRun } from "@/workflows/daily-content";

export async function GET() {
  const run = await dailyContentRun.start("UK indie hacker tax 2026");
  return Response.json({ runId: run.id });
}
```
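As written, anyone who finds the cron URL can start a run and spend Claude tokens. Vercel's cron documentation describes setting a `CRON_SECRET` environment variable, which is then sent as a bearer token on cron requests. Assuming that setup (worth checking against the current docs), a guard could be sketched as follows; `isAuthorizedCron` is a hypothetical helper name.

```typescript
// Returns true only when the Authorization header carries the expected secret.
function isAuthorizedCron(
  authHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  if (!cronSecret) return false; // fail closed if the secret is not configured
  return authHeader === `Bearer ${cronSecret}`;
}

// Usage inside the route (sketch):
//   export async function GET(req: Request) {
//     const header = req.headers.get("authorization");
//     if (!isAuthorizedCron(header, process.env.CRON_SECRET)) {
//       return new Response("Unauthorized", { status: 401 });
//     }
//     ...start the workflow as before
//   }
```

Failing closed when the secret is missing means a misconfigured deployment refuses to run rather than running unprotected.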
Three small files, one Vercel cron entry, the whole pipeline is durable and resumable. Compare with the original cron-only version where any failure between steps loses all earlier work.
Crash recovery and what it costs you
A Vercel redeploy mid-workflow used to be a disaster. With WDK, it is a non-event.
Each step writes its output to durable storage as soon as it returns. When Vercel kills the function instance for a redeploy, the workflow run pauses. The next time the run is touched (by a resume signal, a retry timer, or an explicit poke), a fresh function instance picks up at the next un-checkpointed step. Earlier steps are not re-run.
For Claude Code agents this is the difference between "we ate the Opus tokens for nothing" and "we picked up where we left off". On a research-write-review flow with ~15k tokens spread across three model calls, a mid-flow crash without durability costs you the tokens already spent. With WDK, you only re-spend on the failed step.
The cost line is therefore lower with WDK than without, not higher. Each Claude call lands once on Pool B, even when Vercel itself is flaky.
The honest WDK downsides
Three caveats worth flagging.
Vercel-shaped, not Linux-shaped. WDK is open source and can run anywhere Node runs, but the production-grade durable storage and dashboard live on Vercel. If you are committed to a Hetzner VPS with systemd, the matrix tilts back toward systemd timers for the simpler jobs.
TypeScript and Node only (today). No Python, no Bash. If your Claude Code orchestration is shell-script-heavy, you wrap the shell calls in Node child processes. Doable but adds a thin layer.
Step budgets are real. Each step runs inside a Vercel function and shares the 15-minute function timeout limit on Pro. Long-running agent steps need to be broken into smaller WDK steps. The good news is that's the same discipline you should be using anyway - short, idempotent steps are easier to reason about and retry.
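Idempotency is what makes those short steps safe to retry. One common way to get it is an idempotency key, sketched below with a fake in-memory CMS. `FakeCms`, the key format, and the example URL are all made up for illustration. A real CMS client would need its own way of deduplicating, such as a unique slug or a native idempotency header.

```typescript
// Toy idempotent publish: the same key never creates a second post.
// FakeCms stands in for a real CMS client and is purely illustrative.
class FakeCms {
  private posts = new Map<string, { url: string; body: string }>();

  publish(idempotencyKey: string, body: string): { url: string; created: boolean } {
    const existing = this.posts.get(idempotencyKey);
    if (existing) {
      return { url: existing.url, created: false }; // retry: return the earlier result
    }
    const url = `https://example.com/posts/${this.posts.size + 1}`;
    this.posts.set(idempotencyKey, { url, body });
    return { url, created: true };
  }

  get count(): number {
    return this.posts.size;
  }
}

const cms = new FakeCms();
const first = cms.publish("run-123:publish", "draft body");
const retry = cms.publish("run-123:publish", "draft body"); // simulated retry
console.log(first.url === retry.url, cms.count);
```

Deriving the key from the run and step name means a retried publish step returns the original URL instead of posting a duplicate.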
The UK indie hacker default
The recommendation, after running both WDK and pure systemd timer setups in production this month.
- WDK for multi-step Claude Code workflows on Vercel - research to write to review to publish, anything with a human-in-the-loop pause, anything that needs to survive redeploys.
- systemd timers on a Hetzner VPS for single-shot scheduled `claude -p` jobs. Two files, GBP 3.30/mo, journalctl observability, OnFailure handlers to Slack.
- Anthropic Routines for jobs that must survive laptop close where you do not want a server. Pool B cost is the trade for the convenience.
The right way to think about it: pick by the shape of the workflow, not by religious attachment to one tool. Single-shot scheduled job - systemd. Cloud-hosted no-server job - Routines. Multi-step workflow that needs to survive crashes and pauses - WDK. Most production indie hacker setups end up running all three, each for the jobs they fit.
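That rule of thumb is small enough to write down as a function. This is only a restatement of the heuristic above, with hypothetical field names. It does not cover real-time interactive agents, which sit outside all three tools.

```typescript
type Orchestrator = "systemd timer" | "Anthropic Routines" | "Vercel WDK";

// Hypothetical description of a job's shape.
interface JobShape {
  steps: number;                 // separate model or tool calls in the job
  needsHumanApproval: boolean;   // must pause for a webhook or sign-off
  mustSurviveRedeploys: boolean; // in-flight runs cannot be lost
  hasOwnServer: boolean;         // e.g. a VPS you are happy to maintain
}

function pickOrchestrator(job: JobShape): Orchestrator {
  // Multi-step, pausing, or crash-sensitive work is workflow-shaped.
  if (job.steps > 1 || job.needsHumanApproval || job.mustSurviveRedeploys) {
    return "Vercel WDK";
  }
  // Single-shot jobs: use your own box if you have one, otherwise go hosted.
  return job.hasOwnServer ? "systemd timer" : "Anthropic Routines";
}

const dailyBrief: JobShape = {
  steps: 1,
  needsHumanApproval: false,
  mustSurviveRedeploys: false,
  hasOwnServer: true,
};
console.log(pickOrchestrator(dailyBrief));
```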
The wider point. Vercel WDK fills the orchestration gap that has been the awkward part of running Claude Code agents in production. It's free, open source, included in the Pro plan you almost certainly already pay for, and turns three brittle Claude calls in a row into a durable, crash-safe, resumable workflow. For a UK indie hacker, that is the missing piece between "the agent works most of the time" and "the agent works".
Frequently asked
What is the Vercel WDK and how does it relate to Claude Code?
The Workflow Development Kit (WDK) is Vercel's open-source TypeScript framework that turns ordinary async functions into durable, crash-safe workflows. Two directives mark a function as a workflow and individual steps inside it as durable. Each step is checkpointed automatically, so the workflow can pause for minutes or months, survive a redeploy, and resume exactly where it stopped. For Claude Code agents this matters because long-running multi-step agent jobs (research, write, review, publish) often blow up in the middle and you lose the partial work. WDK gives you a free retry-and-resume layer without writing the durability code yourself.
When should I use WDK instead of systemd timers or Anthropic Routines?
Use WDK when the job is multi-step and you want each step checkpointed, when you need human-in-the-loop pauses (wait for an approval, wait for a webhook), or when the job is already running on Vercel. Use systemd timers when the job is a single shot scheduled invocation of `claude -p` on your own Linux box - the durability is overkill and the systemd OnFailure handler is enough. Use Anthropic Routines when you do not have a server at all and want the Anthropic console to manage everything. The three patterns target different jobs and most production setups end up running all three.
Does WDK cost extra on top of my Vercel Pro plan in GBP?
WDK itself is open source and free. The runtime cost is Vercel function execution time, which is included in your plan's allowance. A Pro plan at GBP 16 a month includes 1 million function invocations and 1,000 GB-hours of compute, which is enough for thousands of Claude Code workflow runs a month for a typical indie hacker. The cost line that matters is the Claude spend itself, which from 15 June 2026 lands in Pool B regardless of whether you call it from WDK, systemd, or Routines. WDK does not add a multiplier.
What happens to a WDK workflow if Vercel redeploys mid-run?
It pauses and resumes. WDK persists every step's output to durable storage, so when the function instance is killed by a redeploy, the next invocation picks up at the next un-checkpointed step. This is the killer feature for Claude Code agents. A 12-step research workflow that fails at step 9 because of a transient API error retries from step 9, not step 1. You do not pay for the work the model already did.
Can I run Claude Code subprocesses from inside a WDK workflow?
Yes, with the usual headless caveats. The WDK step calls a Node child process that invokes `claude -p` (or uses the Agent SDK directly). The trick is keeping each step short enough to fit inside Vercel function timeouts (15 minutes for Pro plan). For longer agent runs, break the work into smaller WDK steps and use the Agent SDK in process. Each step's output is the durable checkpoint, so even Opus-heavy multi-step flows resume cleanly.




