Agent Patterns

How to design reliable, flexible, and simple AI agents without overengineering.

Agent, in Plain Words

An agent is just an LLM that chooses the next step instead of following a fixed script.

Most systems you build land in one of two buckets:

  • Workflows — predefined steps, always the same. Think recipe cards.
  • Agents — the model plans as it goes. Think a chef who understands the goal and improvises.

Workflows vs. Agents


When (and When Not) to Use Agents

Agents can explore, recover, and adapt. That flexibility costs more tokens, more latency, and more chances for bugs. Start with the smallest thing that could work.

Start-simple checklist

  • Try one direct LLM call.
  • Add retrieval or a better prompt if quality still lags.
  • Move to a workflow when you need multiple fixed steps.
  • Graduate to an agent only when the path to the answer is fuzzy.

Frameworks: Helpful, But Not Magic

Frameworks such as LangGraph, Bedrock Agents, Rivet, or Vellum save setup time, but they also hide details. If you do not understand the hidden parts, you cannot fix them.

Before you grab a framework:

  • Build the loop yourself with the raw API; it usually fits on one screen.
  • If you adopt a framework, trace each step and log everything it does.
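The loop the bullets describe really is small. Here is a sketch of it with a stand-in `llm` function where the real chat-completions request would go (the message shapes and tool registry are assumptions for illustration, not any framework's API):

```typescript
// Minimal agent loop: the model either calls a tool or returns a final answer.
// `llm` is a stand-in for the real chat-completions request; it sees the full
// history and replies with either a tool call or text.

type Message = { role: "system" | "user" | "assistant" | "tool"; content: string }
type LlmReply = { toolCall?: { name: string; args: string }; text?: string }

const tools: Record<string, (args: string) => string> = {
  search: (query) => `Search results for: ${query}`, // stubbed tool
}

async function runAgent(
  llm: (history: Message[]) => Promise<LlmReply>,
  userPrompt: string,
  maxSteps = 5
): Promise<string> {
  const history: Message[] = [
    { role: "system", content: "Answer the user. Call tools when needed." },
    { role: "user", content: userPrompt },
  ]
  for (let step = 0; step < maxSteps; step++) {
    const reply = await llm(history)
    if (reply.toolCall) {
      const { name, args } = reply.toolCall
      const result = tools[name]?.(args) ?? `Unknown tool: ${name}`
      history.push({ role: "assistant", content: `call ${name}(${args})` })
      history.push({ role: "tool", content: result }) // model sees this next turn
      continue
    }
    return reply.text ?? "" // the model chose to stop
  }
  return "Step limit reached"
}
```

With a real API call plugged in, this loop plus logging is most of what a framework gives you.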

Building Blocks of Agentic Systems

Every agent is an augmented LLM: a base model with helpers wrapped around it.

The Augmented LLM

The model gets three core helpers:

  • Retrieval finds fresh context.
  • Tools take actions in your world.
  • Memory keeps track of state across turns.

Your agent needs to know three things:

  1. When to reach for each helper.
  2. How to pass data in and out without dropping fields.
  3. How to spot a bad result and try again.

Example: Simple Augmented LLM

augmented-llm-example.ts
import { Experimental_Agent as Agent, tool } from "ai"
import { z } from "zod"
 
const augmentedAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a helpful assistant with access to search and document tools.
  
  When answering questions:
  1. Always search for relevant information first
  2. Use document analysis for detailed information
  3. Cross-reference multiple sources before drawing conclusions
  4. Cite your sources when presenting information`,
  tools: {
    search: tool({
      description: "Search for information on the web",
      parameters: z.object({
        query: z.string().describe("The search query"),
      }),
      execute: async ({ query }) => {
        // Simulate search API call
        return { results: `Search results for: ${query}` }
      },
    }),
    analyzeDocument: tool({
      description: "Analyze a document for key information",
      parameters: z.object({
        documentId: z.string().describe("The document ID to analyze"),
      }),
      execute: async ({ documentId }) => {
        // Simulate document analysis
        return { analysis: `Analysis of document ${documentId}` }
      },
    }),
  },
})
 
// Use the agent
const result = await augmentedAgent.generate({
  prompt: "What are the latest trends in AI?",
})
 
console.log(result.text)

Core Patterns (Simple → Flexible)

You only need a handful of patterns. Pick the smallest one that solves the job today.

1) Prompt Chaining — split the job

Break a tough task into a few easy ones with checks in between.

Use when: one-shot answers drift or hallucinate. Chaining adds structure without much overhead.

Example: Prompt Chaining Implementation

prompt-chaining.ts
import { Experimental_Agent as Agent, stepCountIs, tool } from "ai"
import { z } from "zod"
 
const chainingAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a content creation assistant that works in stages.
  
  Process:
  1. Create a detailed outline first
  2. Write a draft based on the outline
  3. Polish and refine the content
  
  Always evaluate quality at each step and retry if needed.`,
  tools: {
    createOutline: tool({
      description: "Create a detailed outline for the given topic",
      parameters: z.object({
        topic: z.string().describe("The topic to create an outline for"),
      }),
      execute: async ({ topic }) => {
        return { outline: `Detailed outline for: ${topic}` }
      },
    }),
    writeDraft: tool({
      description: "Write a draft based on the outline",
      parameters: z.object({
        outline: z.string().describe("The outline to write from"),
      }),
      execute: async ({ outline }) => {
        return { draft: `Draft based on: ${outline}` }
      },
    }),
    evaluateQuality: tool({
      description: "Evaluate the quality of content (1-10 scale)",
      parameters: z.object({
        content: z.string().describe("The content to evaluate"),
        type: z.enum(["outline", "draft", "final"]).describe("Type of content"),
      }),
      execute: async ({ content, type }) => {
        // Simulate quality evaluation
        const score = Math.floor(Math.random() * 4) + 7 // 7-10 range
        return { score, feedback: `Quality score for ${type}: ${score}/10` }
      },
    }),
  },
  stopWhen: stepCountIs(10), // Allow multiple steps for chaining
})
 
// Use the agent for content creation
const result = await chainingAgent.generate({
  prompt: "Create a comprehensive guide about machine learning",
})
 
console.log(result.text)
console.log(result.steps) // See all the steps taken

2) Routing — pick the right flow

First classify the request, then hand it to the best prompt or tool.

Use when: one prompt cannot cover every case. Routing keeps specialist flows simple and predictable.

Example: Routing Implementation

routing-pattern.ts
import { Experimental_Agent as Agent, tool } from "ai"
import { z } from "zod"
 
// Create specialized agents for different intents
const refundAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a customer service specialist handling refund requests.
  
  Process:
  1. Verify the order details
  2. Check refund eligibility
  3. Process the refund if eligible
  4. Provide clear next steps`,
  tools: {
    verifyOrder: tool({
      description: "Verify order details and status",
      parameters: z.object({
        orderId: z.string().describe("The order ID to verify"),
      }),
      execute: async ({ orderId }) => {
        return { orderStatus: `Order ${orderId} verified` }
      },
    }),
    processRefund: tool({
      description: "Process a refund for an eligible order",
      parameters: z.object({
        orderId: z.string().describe("The order ID to refund"),
        amount: z.number().describe("The refund amount"),
      }),
      execute: async ({ orderId, amount }) => {
        return { refundId: `REF-${orderId}`, amount }
      },
    }),
  },
})
 
const generalAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a helpful customer service representative.
  
  Provide friendly, accurate answers to general questions.
  If you don't know something, offer to escalate to a specialist.`,
  tools: {
    searchFAQ: tool({
      description: "Search the FAQ database",
      parameters: z.object({
        query: z.string().describe("The search query"),
      }),
      execute: async ({ query }) => {
        return { answer: `FAQ answer for: ${query}` }
      },
    }),
  },
})
 
const techSupportAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a technical support specialist.
  
  Help customers with technical issues:
  1. Diagnose the problem
  2. Provide step-by-step solutions
  3. Escalate complex issues when needed`,
  tools: {
    diagnoseIssue: tool({
      description: "Diagnose technical issues",
      parameters: z.object({
        description: z.string().describe("Issue description"),
        systemInfo: z.string().optional().describe("System information"),
      }),
      execute: async ({ description }) => {
        return { diagnosis: `Diagnosis for: ${description}` }
      },
    }),
  },
})
 
// Router agent that classifies and delegates
const routerAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a customer service router.
  
  Classify incoming messages as: refund, general, or tech_support
  Then delegate to the appropriate specialist agent.`,
  tools: {
    classifyIntent: tool({
      description: "Classify customer message intent",
      parameters: z.object({
        message: z.string().describe("The customer message"),
      }),
      execute: async ({ message }) => {
        // Simple classification logic
        if (message.toLowerCase().includes("refund"))
          return { intent: "refund" }
        if (
          message.toLowerCase().includes("error") ||
          message.toLowerCase().includes("bug")
        )
          return { intent: "tech_support" }
        return { intent: "general" }
      },
    }),
  },
})
 
// Usage example
async function handleCustomerMessage(message: string) {
  // First, classify the intent
  const classification = await routerAgent.generate({
    prompt: `Classify this message and reply with only the label: ${message}`,
  })
 
  // Route to appropriate agent based on classification
  const intent = classification.text.toLowerCase()
 
  if (intent.includes("refund")) {
    return await refundAgent.generate({ prompt: message })
  } else if (intent.includes("tech")) {
    return await techSupportAgent.generate({ prompt: message })
  } else {
    return await generalAgent.generate({ prompt: message })
  }
}

3) Parallelization — divide and gather

Run several model calls at the same time, then merge.

Sectioning splits the task into parts. Each part runs in parallel.

Voting runs the same task several times, then takes the best answer.

Use when: you need more coverage (multiple sections) or higher confidence (voting). Running the calls concurrently keeps total latency close to the slowest single call rather than the sum.

Example: Parallel Processing

parallel-processing.ts
import { Experimental_Agent as Agent, stepCountIs, tool } from "ai"
import { z } from "zod"
 
const parallelAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a research assistant that can work on multiple tasks simultaneously.
  
  When given a complex task:
  1. Break it down into parallel subtasks
  2. Execute multiple research streams simultaneously
  3. Synthesize results from all streams
  4. Provide a comprehensive final answer`,
  tools: {
    splitTask: tool({
      description: "Split a complex task into parallel subtasks",
      parameters: z.object({
        task: z.string().describe("The complex task to split"),
        numParts: z
          .number()
          .default(3)
          .describe("Number of parts to split into"),
      }),
      execute: async ({ task, numParts }) => {
        return {
          subtasks: Array.from(
            { length: numParts },
            (_, i) => `Subtask ${i + 1}: ${task} - Part ${i + 1}`
          ),
        }
      },
    }),
    researchTopic: tool({
      description: "Research a specific topic or subtask",
      parameters: z.object({
        topic: z.string().describe("The topic to research"),
        depth: z.enum(["shallow", "medium", "deep"]).default("medium"),
      }),
      execute: async ({ topic, depth }) => {
        return {
          findings: `Research findings for ${topic} (${depth} depth)`,
          sources: [`Source 1 for ${topic}`, `Source 2 for ${topic}`],
        }
      },
    }),
    synthesizeResults: tool({
      description:
        "Synthesize multiple research results into a coherent answer",
      parameters: z.object({
        results: z
          .array(z.string())
          .describe("Array of research results to synthesize"),
      }),
      execute: async ({ results }) => {
        return {
          synthesis: `Synthesized findings from ${results.length} research streams`,
          keyInsights: results.map((r, i) => `Insight ${i + 1}: ${r}`),
        }
      },
    }),
  },
  stopWhen: stepCountIs(15), // Allow multiple steps for parallel processing
})
 
// Usage: The agent will automatically break down tasks and process them in parallel
const result = await parallelAgent.generate({
  prompt:
    "Research the impact of AI on software development, including productivity, job market, and tooling",
})
 
console.log(result.text)
console.log(result.steps) // See all parallel research steps
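The example above covers sectioning; the voting variant is even simpler. A hypothetical `attempt` stub stands in for a single model call here — in practice it would be a `generate()` invocation:

```typescript
// Voting: run the same task several times concurrently, then take the
// majority answer. `attempt` stands in for one model call.

async function vote<T>(attempt: () => Promise<T>, n = 3): Promise<T> {
  const answers = await Promise.all(Array.from({ length: n }, () => attempt()))
  const counts = new Map<T, number>()
  for (const a of answers) counts.set(a, (counts.get(a) ?? 0) + 1)
  // Most frequent answer wins; ties break on first seen
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0]
}

// Usage with a scripted stub: two of three runs agree.
const replies = ["SELECT *", "SELECT id", "SELECT id"]
const winner = await vote(async () => replies.shift()!, 3)
console.log(winner) // "SELECT id"
```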

4) Orchestrator → Workers — plan and assign

One model plans and delegates. Other models focus on a single subtask.

Use when: the number of subtasks is unknown or you need specialists (e.g., multi-file code edits, complex research).
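A sketch of the shape, with `plan` and `work` standing in for model calls (assumptions here): `plan()` would ask the orchestrator model for a list of subtasks, and `work()` would hand one subtask to a specialist model.

```typescript
// Orchestrator → workers: one function decides the split, many run it.

type Subtask = { id: number; description: string }

async function orchestrate(
  goal: string,
  plan: (goal: string) => Promise<Subtask[]>,
  work: (task: Subtask) => Promise<string>
): Promise<string> {
  const subtasks = await plan(goal)                      // orchestrator splits the work
  const results = await Promise.all(subtasks.map(work))  // workers run concurrently
  return results.join("\n")                              // orchestrator merges the outputs
}

// Usage with stubs standing in for the models:
const summary = await orchestrate(
  "Refactor the auth module",
  async (g) => [
    { id: 1, description: `${g}: update login.ts` },
    { id: 2, description: `${g}: update session.ts` },
  ],
  async (t) => `done: ${t.description}`
)
console.log(summary)
```

Because the plan is data, you can log it, cap its length, and review it before any worker runs.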


5) Evaluator ↔ Optimizer — generate, then critique

Two copies of the model (or two different models) trade drafts and feedback.

  • One writes a draft.
  • The other checks it against a simple rubric.

Use when: quality must meet a checklist. Works well for structured plans, SQL, or reports.

Example: Evaluator Pattern

evaluator-pattern.ts
import { Experimental_Agent as Agent, stepCountIs, tool } from "ai"
import { z } from "zod"
 
const evaluatorAgent = new Agent({
  model: "openai/gpt-4o",
  system: `You are a quality assurance agent that generates and evaluates content.
  
  Process:
  1. Generate initial content based on the request
  2. Evaluate the content against quality criteria
  3. If quality is insufficient, regenerate with improvements
  4. Continue until quality standards are met
  
  Quality criteria:
  - Accuracy: Information must be factually correct
  - Completeness: All aspects of the request must be addressed
  - Clarity: Content must be easy to understand
  - Structure: Information must be well-organized`,
  tools: {
    generateContent: tool({
      description: "Generate content based on the given prompt",
      parameters: z.object({
        prompt: z.string().describe("The content generation prompt"),
        attempt: z
          .number()
          .default(1)
          .describe("Attempt number for regeneration"),
      }),
      execute: async ({ prompt, attempt }) => {
        return {
          content: `Generated content for: ${prompt} (Attempt ${attempt})`,
          attempt,
        }
      },
    }),
    evaluateQuality: tool({
      description: "Evaluate content quality against criteria",
      parameters: z.object({
        content: z.string().describe("The content to evaluate"),
        criteria: z.array(z.string()).describe("Quality criteria to check"),
      }),
      execute: async ({ content, criteria }) => {
        // Simulate quality evaluation
        const scores = criteria.map((criterion) => ({
          criterion,
          score: Math.floor(Math.random() * 3) + 7, // 7-9 range
          feedback: `Good performance on ${criterion}`,
        }))
 
        const overallScore =
          scores.reduce((sum, s) => sum + s.score, 0) / scores.length
 
        return {
          scores,
          overallScore: Math.round(overallScore * 10) / 10,
          meetsStandards: overallScore >= 8,
          feedback: `Overall quality: ${overallScore}/10`,
        }
      },
    }),
    improveContent: tool({
      description: "Improve content based on evaluation feedback",
      parameters: z.object({
        originalContent: z.string().describe("The original content"),
        feedback: z.string().describe("Evaluation feedback for improvement"),
        previousAttempt: z.number().describe("Previous attempt number"),
      }),
      execute: async ({ originalContent, feedback, previousAttempt }) => {
        return {
          improvedContent: `Improved version of: ${originalContent}\n\nBased on feedback: ${feedback}`,
          attempt: previousAttempt + 1,
        }
      },
    }),
  },
  stopWhen: stepCountIs(20), // Allow multiple iterations for quality improvement
})
 
// Usage: The agent will automatically generate, evaluate, and improve content
const result = await evaluatorAgent.generate({
  prompt: "Write a comprehensive guide about TypeScript best practices",
})
 
console.log(result.text)
console.log(result.steps) // See all generation and evaluation steps

Agents in Action

Agent Control Loop

This loop is the baseline. The LLM plans, acts, inspects the result, then decides what to do next. Wrap guardrails around each hop so you can see and debug what happened.

Use when: the job has unclear length, depends on tool results, or needs back-and-forth with a user.
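One way to wrap a guardrail around each hop is to wrap every tool's `execute()` so calls, results, and errors are logged before anything reaches the model. The shapes below are illustrative assumptions, not a fixed API:

```typescript
// Log every tool call, result, and error on its way through the loop.

type Tool = { execute: (args: unknown) => Promise<unknown> }

function withLogging(name: string, tool: Tool, log: string[]): Tool {
  return {
    ...tool,
    execute: async (args) => {
      log.push(`call ${name} ${JSON.stringify(args)}`)
      try {
        const result = await tool.execute(args)
        log.push(`ok ${name}`)
        return result
      } catch (err) {
        log.push(`error ${name}: ${String(err)}`)
        throw err // still fail loudly; the log shows where
      }
    },
  }
}

// Usage: wrap a stub tool and inspect the trail.
const log: string[] = []
const search = withLogging(
  "search",
  { execute: async (q) => `results for ${String(q)}` },
  log
)
await search.execute("AI trends")
console.log(log) // ['call search "AI trends"', 'ok search']
```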


Combine Patterns on Purpose

You can layer these ideas. One common recipe: route → chain → evaluate. Another: an orchestrator delegates to workers that run chains and use tools. Combine only when you can explain how each extra hop improves your goal metric.
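The route → chain → evaluate recipe is just function composition. Each stage below stands in for a model call (an assumption for the sketch); the point is that a combined system is still plain functions you can test and log in isolation:

```typescript
// route → chain → evaluate, composed with one bounded retry.

async function handle(
  message: string,
  route: (m: string) => Promise<string>,
  chain: (m: string, intent: string) => Promise<string>,
  evaluate: (draft: string) => Promise<boolean>
): Promise<string> {
  const intent = await route(message)        // 1. routing picks the flow
  let draft = await chain(message, intent)   // 2. chaining produces a draft
  if (!(await evaluate(draft))) {            // 3. evaluation gates the output
    draft = await chain(`${message} (revise)`, intent) // one bounded retry
  }
  return draft
}

// Usage with stubs: the evaluator rejects the first draft once.
let firstTry = true
const reply = await handle(
  "I want a refund",
  async () => "refund",
  async (m, intent) => `[${intent}] ${m}`,
  async () => (firstTry ? ((firstTry = false), false) : true)
)
console.log(reply) // "[refund] I want a refund (revise)"
```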


Core Principles

  1. Keep it simple. If you cannot sketch the loop on a napkin, it is too big.
  2. Make it transparent. Log plans, tool calls, and retries. Visibility builds trust.
  3. Design clean tools. Treat each tool like a public API. Document inputs, outputs, and examples.

Real-World Patterns

Customer Support Agents

Coding Agents

In both cases, humans still review the path, design the tools, and define the safety checks.


Design Tools with ACI

ACI stands for:

  • Abstraction — name the tool after the goal, not the implementation.
  • Constraints — keep inputs narrow, prefer enums, add validation.
  • Instructions — show in-context examples so the model knows what success looks like.

Additional tips:

  • Use formats the model already knows (Markdown beats escaped JSON).
  • Include positive and failure examples.
  • Add “poka-yoke” guards: redesign so bad calls are hard (e.g., demand absolute paths, block dangerous flags).
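A poka-yoke guard can be as small as validating tool inputs before anything runs, so a bad call fails loudly instead of doing damage. The `validateDeploy` helper below is hypothetical and hand-rolled for clarity; the zod schemas used elsewhere on this page express the same constraints declaratively:

```typescript
// Reject malformed tool inputs before the tool ever executes.

type DeployInput = { environment: string; configPath: string }

function validateDeploy(input: DeployInput): string[] {
  const errors: string[] = []
  if (!["staging", "production"].includes(input.environment))
    errors.push("environment must be 'staging' or 'production'") // narrow enum
  if (!input.configPath.startsWith("/"))
    errors.push("configPath must be an absolute path") // demand absolute paths
  return errors
}

// A relative path never reaches the deploy tool:
const errors = validateDeploy({
  environment: "staging",
  configPath: "relative/path.yaml",
})
console.log(errors) // ["configPath must be an absolute path"]
```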

Keep the Big Picture

Success is not about chasing the flashiest agent demo. It is about building the smallest system that solves a real user problem and can survive a week of production traffic.

  • Start small, measure, then add pieces deliberately.
  • Surface the agent's thinking so you can guide it.
  • Make tools obvious and easy to call correctly.
  • Understand every line that runs in your stack.