Durable Orchestration with Pydantic AI Agents and Restate
Restate Team
Restate now offers native integration with Pydantic AI, bringing enterprise-grade durability to your Python agents with just a few lines of code.
Agents interact with many systems: LLMs, MCP servers, other APIs and databases, humans, session stores, and more. Each of these interactions can fail, suffer network outages, hit timeouts and rate limits, or take longer than expected to respond. And the agent process itself can crash or be restarted mid-run. These are the same failure modes as any other backend service.
When you deploy your agents to production, they need to be able to handle all these scenarios: not just for simple agents with a single tool but also for long-running agents, human approval steps, and multi-agent systems. The Restate + Pydantic AI integration makes this possible.
What Restate Adds to Your Pydantic AI Agents
While Pydantic AI gives you a type-safe, Pythonic way to define agents with structured outputs, Restate gives them everything they need to run reliably in production:
- Never lose progress or duplicate work, via retries and recovery
- Durable sessions, via embedded K/V store and concurrency management
- Resilient human approvals that might take minutes or months
- Resilient orchestration across agents and tools, and even deployment targets and time
- Great support for serverless: the agent execution suspends when the agent is waiting
- Support for long-running agents (from milliseconds to months)
- Detailed observability across agents and tools
Getting started
💡 Useful links:
- 🚀 Quickstart
- 🎓 Getting Started Guides
- 📦 Restate SDK on PyPi
- 📕 Restate Documentation
Add the Restate SDK and Pydantic AI to your project:
```shell
uv init .
uv add "restate_sdk[serde]" pydantic-ai
```

Create your first durable agent:
```python
import restate
from pydantic_ai import Agent, RunContext
from restate.ext.pydantic import RestateAgent, restate_context

weather_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="You are a helpful agent that provides weather updates.",
)

@weather_agent.tool()
async def get_weather(_run_ctx: RunContext[None], city: str) -> dict:
    """Get the current weather for a given city."""
    # fetch_weather_api is an application-defined helper that calls a weather API
    return await restate_context().run_typed("fetch weather", fetch_weather_api, city=city)

restate_agent = RestateAgent(weather_agent)

agent_service = restate.Service("WeatherAgent")

@agent_service.handler()
async def run(_ctx: restate.Context, message: str) -> str:
    result = await restate_agent.run(message)
    return result.output

app = restate.app([agent_service])
```

That's it. Your agent is now resilient to failures, with every step logged and recoverable. For example:
- Failed LLM calls will be retried
- Failed calls to the weather API will be retried
- If the service crashes, Restate remembers every completed step in a durable journal and resumes execution on another process by replaying it
And that's only the beginning.
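The journal-replay idea behind this recovery can be illustrated with a much simplified, non-durable plain-Python sketch: results of side-effecting steps are recorded, and on re-execution the recorded results are returned instead of re-running the step. The names here are illustrative only, not the Restate SDK API.

```python
import random

class Journal:
    """Records step results; replays them on re-execution (simplified sketch)."""
    def __init__(self):
        self.entries = []  # completed step results, in order
        self.cursor = 0    # replay position for the current execution

    def restart(self):
        # A recovered process replays the journal from the beginning.
        self.cursor = 0

    def run(self, name, fn):
        if self.cursor < len(self.entries):
            # Replay: return the recorded result without re-running the step.
            result = self.entries[self.cursor]
        else:
            # First execution: run the step and record its result.
            result = fn()
            self.entries.append(result)
        self.cursor += 1
        return result

journal = Journal()
first = journal.run("fetch weather", lambda: random.random())
journal.restart()  # simulate a crash and recovery
replayed = journal.run("fetch weather", lambda: random.random())
assert first == replayed  # the recorded result is replayed, not recomputed
```

In Restate the journal is persisted outside the process, which is what makes recovery possible after a crash rather than only within one.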
Building Blocks for Reliable Agents
Durable sessions with concurrency control
Build agents that remember. Use Restate's Virtual Objects to create stateful agents keyed by user or session ID. Message history is persisted in Restate's durable K/V store and automatically restored on each request:
```python
from restate import VirtualObject, ObjectContext

assistant = Agent(
    "openai:gpt-5.2",
    system_prompt="You are a helpful assistant.",
)
restate_assistant = RestateAgent(assistant)

chat = VirtualObject("Chat")

@chat.handler()
async def message(ctx: ObjectContext, req: ChatMessage) -> str:
    # Load message history from Restate's durable key-value store
    history = await ctx.get("messages", serde=MessageSerde())
    result = await restate_assistant.run(req.message, message_history=history)
    # Store updated history back in Restate state
    ctx.set("messages", result.all_messages(), serde=MessageSerde())
    return result.output
```

The conversation state lives in Restate, queryable through the UI, with automatic concurrency control to prevent race conditions when users send multiple messages.
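The example above assumes a `MessageSerde` that converts message history to and from bytes for storage. As a minimal sketch of what such a serde does, here is a JSON-based version for plain data (the name and interface are illustrative; the actual serde for Pydantic AI message objects needs their own (de)serialization logic):

```python
import json

class JsonSerde:
    """Minimal bytes<->object serde sketch (hypothetical, not the SDK's MessageSerde)."""

    def serialize(self, obj) -> bytes:
        # Turn the value into bytes for the durable K/V store.
        return json.dumps(obj).encode("utf-8")

    def deserialize(self, data: bytes):
        # Empty state means no history has been stored yet.
        return None if not data else json.loads(data.decode("utf-8"))

serde = JsonSerde()
history = [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]
assert serde.deserialize(serde.serialize(history)) == history
assert serde.deserialize(b"") is None
```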
Human Approvals as durable promises
Real-world agents need human oversight. Restate's durable promises make this easy: your agent pauses execution, waits for human input, and resumes automatically, even if the service crashes while waiting:
```python
@agent.tool
async def human_approval(_run_ctx: RunContext[None], claim: InsuranceClaim) -> str:
    """Ask for human approval for high-value claims."""
    # Create a durable promise
    approval_id, approval_promise = restate_context().awakeable(type_hint=str)
    # Notify the moderator (persisted)
    await restate_context().run_typed(
        "Request review", request_human_review, claim=claim, awakeable_id=approval_id
    )
    # Wait for the review - survives crashes
    return await approval_promise
```

Add timeouts to prevent workflows from waiting indefinitely. Restate persists both the timeout and the approval promise, maintaining the correct remaining time even through restarts:
```python
# Wait for human approval for at most 3 hours to meet our SLA
match await restate.select(
    approval=approval_promise,
    timeout=restate_context().sleep(timedelta(hours=3)),
):
    case ["approval", approved]:
        return "Approved" if approved else "Rejected"
    case _:
        return "Approval timed out - Evaluate with AI"
```

Explore human-in-the-loop patterns
Multi-agent orchestration
Pydantic AI lets you define specialized agents and route between them. Restate makes these interactions durable: if any agent in the chain fails, Restate recovers the entire workflow from its journal:
```python
medical_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Review medical claims for coverage and necessity.",
)
restate_medical_agent = RestateAgent(medical_agent)

car_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Assess car claims for liability and damage.",
)
restate_car_agent = RestateAgent(car_agent)

intake_agent = Agent(
    "openai:gpt-5.2",
    system_prompt="Route insurance claims to the appropriate specialist.",
)

@intake_agent.tool
async def consult_medical_specialist(
    _run_ctx: RunContext[None], claim: InsuranceClaim
) -> str:
    """Route to the medical specialist for medical insurance claims."""
    result = await restate_medical_agent.run(claim.model_dump_json())
    return result.output

@intake_agent.tool
async def consult_car_specialist(
    _run_ctx: RunContext[None], claim: InsuranceClaim
) -> str:
    """Route to the car specialist for car insurance claims."""
    result = await restate_car_agent.run(claim.model_dump_json())
    return result.output
```

You can also call agents deployed as separate services using durable RPC, or fan out to multiple agents in parallel using restate.gather():
```python
# Durable service call to a remote agent; persisted and retried by Restate
@agent.tool
async def check_fraud(_run_ctx: RunContext[None], claim: InsuranceClaim) -> str:
    """Analyze the probability of fraud."""
    return await restate_context().service_call(run_fraud_agent, claim)
```

See Everything Your Agent Does
Restate traces every action automatically. Open the Restate UI to inspect running agents, see LLM calls and tool executions in real time, and debug issues without adding instrumentation.
The trace shows you exactly what your agent did, when it did it, and what state it held at each step. When something goes wrong, you have all the information you need to understand why.
Beyond the Basics
Restate lets you take your agents to the next level:
- Reliable communication for multi-agent systems
- Parallelization, racing, and other concurrency patterns, deterministic across retries
- Resilient rollback for actions that should revert on failure
- Idempotency for crucial actions
- Each request is an ID-addressable task that you can cancel, kill, roll back, restart, or attach to
- And much more...
Start Building
AI agents are moving from demos to production, and production means dealing with all sorts of adversities. Pydantic AI gives you a type-safe, Pythonic way to build agents with structured outputs and tool definitions. Restate gives you the reliability primitives to run them in production without rebuilding your entire stack.
The Restate + Pydantic AI integration is available now. You get:
- ✅ Automatic failure recovery for every step your agent takes
- ✅ Persistent conversation memory without external databases
- ✅ Complete execution visibility for debugging and monitoring
- ✅ Resilient primitives to model approval workflows with timeouts
- ✅ Multi-agent orchestration that is consistent across failures
- ✅ Deploy on any platform: Modal, AWS Lambda, or Kubernetes
You write normal Python code using the Pydantic AI patterns you already know. Restate handles the hard parts of making it production-ready.
Get started today:
- 🚀 Follow the quickstart
- 🎓 Take the interactive tour
- 💻 Browse code examples
✨ Star us on GitHub and join our community on Discord or Slack. We'd love to see what you build.