AI Agent Workflow Patterns: The Six Designs That Work in Production
Six proven AI agent workflow patterns — with real examples, cost shapes, and failure modes. Skip the theory. Build the one that fits your use case.
Most AI agent projects fail before they ship. Not because the model is wrong. Because the workflow design is wrong.
People reach for complex multi-agent architectures when a single-step loop would do the job. Or they wire up a research agent when what they actually need is a parser. Wrong pattern, wasted build time, flaky results in production.
There are six AI agent workflow patterns that actually hold up. Know them. Pick the right one.
Pattern 1: Classify-then-Route
The workflow: Agent reads the input — a support ticket, a lead form, a document — assigns it to a category, then routes it to the right handler or sub-agent.
Why it works: You run a cheap, fast model on the classify step. GPT-4o mini or Haiku for the routing decision; your heavy model only touches the complex categories. A support ticket system handling 10,000 tickets a month might spend $3 on classification and $40 on the 8% of tickets that need real reasoning.
Real example: B2B SaaS company routes inbound support tickets to five queues — billing, bugs, onboarding, feature requests, and "other." The classifier runs on a 3-sentence system prompt. The onboarding queue triggers a multi-step troubleshooting agent. The billing queue pings a human. Total cost per ticket: under $0.002.
Use our cost calculator to model classify-and-route spend before you build.
Failure mode: The classifier prompt is too thin. It collapses edge cases into the wrong category, and your expensive downstream agent runs on garbage. Fix it with a confidence threshold — anything below 0.85 routes to "human review."
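A minimal sketch of the routing layer with that threshold in place. The handler names are illustrative, and `classify_ticket` stands in for the cheap-model call — here it's a trivial keyword heuristic so the sketch runs standalone:

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.85

# Illustrative queue names, matching the five-queue example above.
HANDLERS = {
    "billing": "human_billing_queue",
    "bugs": "bug_triage_agent",
    "onboarding": "troubleshooting_agent",
    "feature_request": "feedback_log",
    "other": "human_review",
}

@dataclass
class Classification:
    label: str
    confidence: float

def classify_ticket(text: str) -> Classification:
    # Placeholder for the cheap-model call (the 3-sentence system prompt).
    if "invoice" in text.lower() or "charge" in text.lower():
        return Classification("billing", 0.92)
    return Classification("other", 0.40)

def route(text: str) -> str:
    result = classify_ticket(text)
    # Anything below the threshold goes to a human, regardless of label.
    if result.confidence < CONFIDENCE_THRESHOLD:
        return "human_review"
    return HANDLERS.get(result.label, "human_review")
```

The key design point: the threshold check happens before the label is trusted, so a confident-sounding wrong label still can't reach an expensive downstream agent.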
Pattern 2: Extract-then-Act
The workflow: Agent reads a document, email, or web page. Extracts structured data. Writes it to a database, triggers an action, or hands off to the next step.
Why it works: This is the "agent as structured parser" pattern. It's reliable because you're not asking the model to reason — you're asking it to read and format. Consistent inputs produce consistent outputs.
Real example: Accounts payable team runs every inbound invoice through an extraction agent. It pulls vendor name, line items, amounts, due date, and PO number. Writes to Xero. Flags mismatches for human review. What took 12 minutes per invoice now takes 30 seconds. Error rate dropped from 4% to under 1%.
Failure mode: Unstructured inputs break your schema. A skewed scan with a crooked table will make the model hallucinate line items. Build a validation step: if extracted totals don't match the sum of line items, route to human review automatically.
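That validation step is a few lines of arithmetic, not another model call. A sketch, assuming a simple invoice dict (not a real Xero schema):

```python
TOLERANCE = 0.01  # allow for rounding in currency arithmetic

def validate_invoice(invoice: dict) -> str:
    """Gate between extraction and action: mismatched totals go to a human."""
    line_sum = sum(item["amount"] for item in invoice["line_items"])
    if abs(line_sum - invoice["total"]) > TOLERANCE:
        return "human_review"
    return "write_to_ledger"
```

Because the check is deterministic, it catches hallucinated line items even when the model reports them with full confidence.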
Pattern 3: Research-then-Synthesise
The workflow: One or more agents gather information from multiple sources — web search, internal docs, databases, APIs. A synthesis agent reads all of it and produces the final output.
Why it works: You parallelise the gathering. Five research sub-agents run simultaneously, each pulling from a different source. The synthesis step is where model quality matters most — use your best model here.
Real example: Competitive intelligence team runs a weekly report on three competitors. Research agents pull from their pricing pages, recent press releases, LinkedIn job postings, and G2 reviews. Synthesis agent writes a structured 500-word brief. What took a half-day of analyst time now runs overnight and lands in Slack every Monday morning.
For multi-agent coordination, see how AI agent orchestration handles the handoffs between gather and synthesise steps.
Failure mode: Research agents return conflicting data. The synthesis agent averages them instead of flagging the conflict. Build an explicit conflict-detection step: if two sources disagree on a material fact, surface both and let the human decide.
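A conflict-detection sketch under one assumption: each research agent returns a flat dict of `{fact_name: value}`. The synthesis step then surfaces any fact where sources disagree instead of silently blending them:

```python
def detect_conflicts(findings: dict[str, dict]) -> dict[str, dict]:
    """Map each fact to its per-source values wherever sources disagree.

    findings: {source_name: {fact_name: value}}
    """
    by_fact: dict[str, dict] = {}
    for source, facts in findings.items():
        for fact, value in facts.items():
            by_fact.setdefault(fact, {})[source] = value
    # A fact is conflicting when its sources report more than one value.
    return {
        fact: values
        for fact, values in by_fact.items()
        if len(set(values.values())) > 1
    }
```

Anything this function returns gets rendered as "Source A says X, Source B says Y" in the brief, and the human decides.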
Pattern 4: Draft-then-Review
The workflow: Agent produces a draft. A second agent — or a human — reviews and approves before any output leaves the system.
Why it works: It makes agents safe in customer-facing roles. The drafter can be fast and cheap. The reviewer catches tone issues, factual errors, and brand violations before they hit a customer inbox.
Real example: E-commerce brand uses a draft-then-review loop for customer service replies. Draft agent handles 800 tickets a day. A review agent checks each response against a 12-point quality rubric. Anything below threshold gets flagged for a human agent. Human workload dropped 70%. Brand complaints about tone dropped 40%.
Failure mode: Review criteria drift. You define the rubric once, it ages, and the reviewer starts approving things that would have failed six months ago. Schedule a monthly audit of flagged-then-approved tickets to catch drift early.
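The review gate itself is simple to express. A sketch with three stand-in checks in place of the 12-point rubric — the checks are illustrative, not the e-commerce brand's actual criteria:

```python
# Each rubric entry is (name, check); a check returns True on pass.
RUBRIC = [
    ("has_greeting", lambda d: d.lower().startswith(("hi", "hello"))),
    ("no_all_caps", lambda d: d.upper() != d),
    ("under_length", lambda d: len(d) < 1200),
]
PASS_THRESHOLD = 1.0  # every check must pass before a reply leaves the system

def review(draft: str) -> str:
    score = sum(check(draft) for _, check in RUBRIC) / len(RUBRIC)
    return "send" if score >= PASS_THRESHOLD else "flag_for_human"
```

Keeping the rubric as named checks, rather than one opaque prompt, is also what makes the monthly drift audit practical: you can see exactly which check a six-month-old approval would now fail.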
Pattern 5: Monitor-then-Alert
The workflow: Agent runs on a schedule. Reads a feed — transactions, brand mentions, metrics, logs. Flags anomalies. Pings a human only when something needs attention.
Why it works: Humans don't need to see the normal. They need to see the exceptions. This pattern scales a single person's attention across hundreds of data streams.
Real example: Fintech startup monitors 15,000 daily transactions for fraud signals. Agent runs every 15 minutes. Checks each batch against 22 behavioural rules. Pages the on-call analyst when it finds a cluster. Before: two analysts reviewed every transaction. After: the same two analysts handle six escalations a day and spend the rest of their time on higher-value work.
Failure mode: Alert fatigue. The agent flags too many false positives, humans start ignoring the pings, and a real incident slips through. Tune aggressively. Track your false positive rate. If it's above 20%, the model's threshold is wrong.
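Tracking that false positive rate can be part of the workflow itself. A sketch, assuming analysts label each alert after triage (`True` = real incident, `False` = false positive):

```python
FP_CEILING = 0.20  # above this, the alerting threshold needs retuning

def false_positive_rate(labels: list[bool]) -> float:
    """Share of recent alerts that analysts marked as false positives."""
    if not labels:
        return 0.0
    return labels.count(False) / len(labels)

def needs_retuning(labels: list[bool]) -> bool:
    return false_positive_rate(labels) > FP_CEILING
```

Run it over a rolling window of the last N labelled alerts and alert on the alerter: when `needs_retuning` flips true, the monitoring agent's own thresholds become the next ticket.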
Pattern 6: Plan-then-Execute
The workflow: Agent receives a complex goal. Breaks it into a sequence of steps. Executes each step. Handles failures and replans when something breaks.
Why it works: It's the only pattern that can handle genuinely open-ended tasks — "research this company and write a full investment memo" or "refactor this codebase to remove the deprecated API."
Real example: Engineering team uses a plan-then-execute agent to handle database migrations. Agent reads the migration spec, generates a step-by-step plan, runs each step against a staging environment, checks outputs, and flags steps that fail before touching production. Migrations that took a senior engineer three hours now take 40 minutes of human review time.
This is where frameworks earn their keep. LangGraph, CrewAI, and the newer options handle the state management and replanning logic you'd otherwise build yourself. If you're running multiple plan-then-execute workflows across a team, Paperclip handles the orchestration layer.
Failure mode: The plan is wrong and the agent executes confidently anyway. Add a plan-review step before execution starts. Show the plan to a human (or a critic agent) and require approval before any irreversible actions run.
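A sketch of that approval gate, assuming each plan step is tagged as reversible or not — the `Step` shape is an illustration, not any framework's API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    run: Callable[[], bool]  # returns True on success
    irreversible: bool = False

def execute_plan(steps: list[Step], approved: bool) -> list[str]:
    """Run steps in order; block irreversible steps until the plan is approved."""
    log = []
    for step in steps:
        if step.irreversible and not approved:
            log.append(f"BLOCKED {step.name}: plan not approved")
            break
        if step.run():
            log.append(f"ok {step.name}")
        else:
            # A failed step stops execution so the agent can replan
            # instead of ploughing ahead on a broken assumption.
            log.append(f"FAILED {step.name}: stopping for replan")
            break
    return log
```

The staging-first migration example above is this structure: dry-run steps are reversible and run freely; anything touching production is tagged irreversible and waits for the human sign-off.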
Choosing the Right Pattern
Don't start with the most complex pattern you can imagine. Start with the simplest one that solves the problem.
Decision prompt:
- Does the input need to be sorted into categories? Start with classify-then-route.
- Does the input need to be parsed into structured data? Use extract-then-act.
- Do you need information from multiple sources before writing anything? Use research-then-synthesise.
- Is the output customer-facing or high-stakes? Add a draft-then-review layer.
- Are you watching for exceptions in a continuous data stream? Build monitor-then-alert.
- Is the goal genuinely open-ended — too complex for the first five patterns? Then reach for plan-then-execute.
Most production workflows are a combination of two or three of these — a classify-then-route outer shell with a research-then-synthesise core, for instance. But build the core pattern first. Composition comes later.
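Composition, when you get there, can stay this simple: the outer shell is a dispatcher and each inner pattern is a separate, testable function. A sketch with stub implementations (the keyword classifier stands in for a cheap-model call):

```python
def classify(request: str) -> str:
    # Stub classifier: keyword heuristic in place of a cheap-model call.
    return "research" if "compare" in request.lower() else "extract"

def research_then_synthesise(request: str) -> str:
    return f"brief: {request}"

def extract_then_act(request: str) -> str:
    return f"parsed: {request}"

INNER = {"research": research_then_synthesise, "extract": extract_then_act}

def handle(request: str) -> str:
    """Classify-then-route shell dispatching to inner pattern workflows."""
    return INNER[classify(request)](request)
```

Because the shell only knows the dispatch table, you can build and ship `extract_then_act` alone first, then add the research branch later without touching it.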
If you're building without code, Lindy handles patterns 1 through 5 well. For patterns that need custom logic, n8n gives you the workflow builder without locking you into a black box.
The teams who ship reliable agents aren't the ones with the most sophisticated architectures. They're the ones who picked the right pattern for the job and built it simply.
Pick the pattern. Build it lean. Add complexity only when the simple version breaks.
About the author

Lucas Powell
Founder of Growth 8020. Started Agent Shortlist as the publication he wished existed when his team had to pick AI tools.
More in this series
AI Agent Frameworks in 2026: LangGraph, CrewAI, AutoGen, and What to Pick
LangGraph, CrewAI, AutoGen, Semantic Kernel — compared for builders. Which framework fits your workflow, and when to skip all of them entirely.
AI Agent Guardrails: How to Not Delete Your Database in 9 Seconds
The safety checklist every AI agent deployment needs — human approval gates, action boundaries, and why 9 seconds was all it took to end Pocket OS.
AI Agent Model Routing: Cut Your API Bill by 60% Without Losing Quality
Brain-and-muscle model routing: use expensive models for planning, cheap models for execution. Real cost breakdowns and the routing logic that makes it work.