Durable Orchestration with Pydantic AI Agents and Restate
Restate Team
Restate now offers native integration with Pydantic AI, bringing enterprise-grade durability to your Python agents with just a few lines of code.
Agents interact with many systems: LLMs, MCP servers, other APIs and databases, humans, session stores, and more. Each of these interactions can fail, suffer network outages, hit timeouts and rate limits, or take longer than expected to respond. And the agent process itself can crash or be restarted mid-run. These are the same failure modes as any other backend service.
When you deploy your agents to production, they need to be able to handle all these scenarios: not just for simple agents with a single tool but also for long-running agents, human approval steps, and multi-agent systems. The Restate + Pydantic AI integration makes this possible.
What Restate Adds to Your Pydantic AI Agents
While Pydantic AI gives you a type-safe, Pythonic way to define agents with structured outputs, Restate gives them everything they need to run reliably in production:
- Never lose progress or duplicate work, via retries and recovery
- Durable sessions, via embedded K/V store and concurrency management
- Resilient human approvals that might take minutes or months
- Resilient orchestration across agents and tools, and even deployment targets and time
- Great support for serverless: the agent execution suspends when the agent is waiting
- Support for long-running agents (from milliseconds to months)
- Detailed observability across agents and tools
Getting started
💡 Useful links:
- 🚀 Quickstart
- 🎓 Getting Started Guides
- 📦 Restate SDK on PyPi
- 📕 Restate Documentation
Add the Restate SDK and Pydantic AI to your project:
```shell
uv init .
uv add "restate_sdk[serde]" pydantic-ai
```

Create your first durable agent:
```python
import restate
from pydantic_ai import Agent, RunContext
from restate.ext.pydantic import RestateAgent, restate_context

weather_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="You are a helpful agent that provides weather updates.",
)

@weather_agent.tool()
async def get_weather(_run_ctx: RunContext[None], city: str) -> dict:
    """Get the current weather for a given city."""
    # fetch_weather_api is an application-defined helper that calls a weather API
    return await restate_context().run_typed("fetch weather", fetch_weather_api, city=city)

restate_agent = RestateAgent(weather_agent)

agent_service = restate.Service("WeatherAgent")

@agent_service.handler()
async def run(_ctx: restate.Context, message: str) -> str:
    result = await restate_agent.run(message)
    return result.output

app = restate.app([agent_service])
```

That's it. Your agent is now resilient to failures, with every step logged and recoverable. For example:
- Failed LLM calls will be retried
- Failed calls to the weather API will be retried
- If the service crashes, Restate remembers every completed step in a durable journal and resumes execution on another process by replaying it
And that's only the beginning.
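The journal-replay idea behind this recovery can be illustrated with a much simplified, non-durable plain-Python sketch: results of side-effecting steps are recorded, and on re-execution the recorded results are returned instead of re-running the step. The names here are illustrative only, not the Restate SDK API.

```python
import random

class Journal:
    """Records step results; replays them on re-execution (simplified sketch)."""
    def __init__(self):
        self.entries = []  # completed step results, in order
        self.cursor = 0    # replay position for the current execution

    def restart(self):
        # A recovered process replays the journal from the beginning.
        self.cursor = 0

    def run(self, name, fn):
        if self.cursor < len(self.entries):
            # Replay: return the recorded result without re-running the step.
            result = self.entries[self.cursor]
        else:
            # First execution: run the step and record its result.
            result = fn()
            self.entries.append(result)
        self.cursor += 1
        return result

journal = Journal()
first = journal.run("fetch weather", lambda: random.random())
journal.restart()  # simulate a crash and recovery
replayed = journal.run("fetch weather", lambda: random.random())
assert first == replayed  # the recorded result is replayed, not recomputed
```

In Restate the journal is persisted outside the process, which is what makes recovery possible after a crash rather than only within one.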
Building Blocks for Reliable Agents
Durable sessions with concurrency control
Build agents that remember. Use Restate's Virtual Objects to create stateful agents keyed by user or session ID. Message history is persisted in Restate's durable K/V store and automatically restored on each request:
```python
from restate import VirtualObject, ObjectContext

assistant = Agent(
    "openai:gpt-5.2",
    system_prompt="You are a helpful assistant.",
)
restate_assistant = RestateAgent(assistant)

chat = VirtualObject("Chat")

@chat.handler()
async def message(ctx: ObjectContext, req: ChatMessage) -> str:
    # Load message history from Restate's durable key-value store
    history = await ctx.get("messages", serde=MessageSerde())
    result = await restate_assistant.run(req.message, message_history=history)
    # Store updated history back in Restate state
    ctx.set("messages", result.all_messages(), serde=MessageSerde())
    return result.output
```

The conversation state lives in Restate, queryable through the UI, with automatic concurrency control to prevent race conditions when users send multiple messages.
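The example above assumes a `MessageSerde` that converts message history to and from bytes for storage. As a minimal sketch of what such a serde does, here is a JSON-based version for plain data (the name and interface are illustrative; the actual serde for Pydantic AI message objects needs their own (de)serialization logic):

```python
import json

class JsonSerde:
    """Minimal bytes<->object serde sketch (hypothetical, not the SDK's MessageSerde)."""

    def serialize(self, obj) -> bytes:
        # Turn the value into bytes for the durable K/V store.
        return json.dumps(obj).encode("utf-8")

    def deserialize(self, data: bytes):
        # Empty state means no history has been stored yet.
        return None if not data else json.loads(data.decode("utf-8"))

serde = JsonSerde()
history = [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]
assert serde.deserialize(serde.serialize(history)) == history
assert serde.deserialize(b"") is None
```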
Human Approvals as durable promises
Real-world agents need human oversight. Restate's durable promises make this easy: your agent pauses execution, waits for human input, and resumes automatically, even if the service crashes while waiting:
```python
@agent.tool
async def human_approval(_run_ctx: RunContext[None], claim: InsuranceClaim) -> str:
    """Ask for human approval for high-value claims."""
    # Create a durable promise
    approval_id, approval_promise = restate_context().awakeable(type_hint=str)
    # Notify the moderator (persisted)
    await restate_context().run_typed(
        "Request review", request_human_review, claim=claim, awakeable_id=approval_id
    )
    # Wait for the review - survives crashes
    return await approval_promise
```

Add timeouts to prevent workflows from waiting indefinitely. Restate persists both the timeout and the approval promise, maintaining the correct remaining time even through restarts:
```python
# Wait for human approval for at most 3 hours to meet our SLA
match await restate.select(
    approval=approval_promise,
    timeout=restate_context().sleep(timedelta(hours=3)),
):
    case ["approval", approved]:
        return "Approved" if approved else "Rejected"
    case _:
        return "Approval timed out - Evaluate with AI"
```

Explore human-in-the-loop patterns
Multi-agent orchestration
Pydantic AI lets you define specialized agents and route between them. Restate makes these interactions durable: if any agent in the chain fails, Restate recovers the entire workflow from its journal:
```python
medical_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Review medical claims for coverage and necessity.",
)
restate_medical_agent = RestateAgent(medical_agent)

car_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Assess car claims for liability and damage.",
)
restate_car_agent = RestateAgent(car_agent)

intake_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Route insurance claims to the appropriate specialist.",
)

@intake_agent.tool
async def consult_medical_specialist(
    _run_ctx: RunContext[None], claim: InsuranceClaim
) -> str:
    """Route to the medical specialist for medical insurance claims."""
    result = await restate_medical_agent.run(claim.model_dump_json())
    return result.output

@intake_agent.tool
async def consult_car_specialist(
    _run_ctx: RunContext[None], claim: InsuranceClaim
) -> str:
    """Route to the car specialist for car insurance claims."""
    result = await restate_car_agent.run(claim.model_dump_json())
    return result.output
```

You can also call agents deployed as separate services using durable RPC, or fan out to multiple agents in parallel using restate.gather():
```python
# Durable service call to a remote agent; persisted and retried by Restate
@agent.tool
async def check_fraud(_run_ctx: RunContext[None], claim: InsuranceClaim) -> str:
    """Analyze the probability of fraud."""
    return await restate_context().service_call(run_fraud_agent, claim)
```

See Everything Your Agent Does
Restate traces every action automatically. Open the Restate UI to inspect running agents, see LLM calls and tool executions in real time, and debug issues without adding instrumentation.
The trace shows you exactly what your agent did, when it did it, and what state it held at each step. When something goes wrong, you have all the information you need to understand why.
Beyond the Basics
Restate lets you take your agents to the next level:
- Reliable communication for multi-agent systems
- Parallelization, racing, and other concurrency patterns, deterministic across retries
- Resilient rollback for actions that should revert on failure
- Idempotency for crucial actions
- Each request is an ID-addressable task that you can cancel, kill, roll back, restart, or attach to
- And much more...
Start Building
AI agents are moving from demos to production, and production means dealing with all sorts of adversities. Pydantic AI gives you a type-safe, Pythonic way to build agents with structured outputs and tool definitions. Restate gives you the reliability primitives to run them in production without rebuilding your entire stack.
The Restate + Pydantic AI integration is available now. You get:
- ✅ Automatic failure recovery for every step your agent takes
- ✅ Persistent conversation memory without external databases
- ✅ Complete execution visibility for debugging and monitoring
- ✅ Resilient primitives to model approval workflows with timeouts
- ✅ Multi-agent orchestration that is consistent across failures
- ✅ Deploy on any platform: Modal, AWS Lambda, or Kubernetes
You write normal Python code using the Pydantic AI patterns you already know. Restate handles the hard parts of making it production-ready.
Get started today:
- 🚀 Follow the quickstart
- 🎓 Take the interactive tour
- 💻 Browse code examples
✨ Star us on GitHub and join our community on Discord or Slack. We'd love to see what you build.