🖱️Control + Alt + Restate 1.5

Yesterday we announced the public availability of Restate Cloud. Today, we are announcing Restate 1.5. The release includes many changes that were important for building Restate Cloud and are now available in the open source version as well.

Restate 1.5 adds significant enhancements around observability, tools to manage function and workflow executions, and a complete documentation overhaul. There are additional quality-of-life improvements like faster SQL queries (more responsive UI) and compression for AWS Lambda.

Observability: Timelines, Details, and Histories

Over the past few releases, we've been continuously expanding the observability of durable function invocations. The UI now shows a live timeline of execution steps, including retries, nested RPC calls, and events, with awakeables, promises, and even cancellation signals.

Invocations (RPC, send) link to each other, so you can use the UI to navigate calls across your services like a distributed call stack. Virtual Objects give you insights into their queue and who owns access.

By default, Restate 1.5 keeps a full history of all invocations, including their progress and timelines (journals). This data is stored in the single binary’s RocksDB tables; no external system is required.

You can control how long Restate should keep the history on a per deployment/service/handler level. See service configuration docs for details.

You can also choose, per service or per handler, not to keep histories. We have seen this pattern be very useful for auxiliary functions or helper Virtual Objects (tracking state, semaphores, etc.) that may handle a high load of invocations but whose histories add little value and just consume storage.

Restarting invocations - the end of the dead-letter-queue

Sometimes it is impossible to handle an event or invocation, for example due to an application-level error, denied access, or an inconsistent environment. In Restate, this is typically signaled by throwing a TerminalError (or TerminalException) so the runtime doesn’t retry.

A common pattern in application architectures is to put such events and invocations into a dead letter queue, a specific queue or log to collect events that need to be manually handled, or manually re-inserted into the upstream queues or services once the root problem is resolved. This is typically complex and requires a lot of ad-hoc plumbing.

Restate now retains invocations and lets you re-trigger them via the UI ("Restart as new"). This will re-trigger the invocation and create a new durable execution for it. You can use that as an alternative to dead letter queues, to re-process failed invocations once the failure cause is removed.

For complete details, see the Restate 1.5 release notes and managing invocations documentation.

Retries, Pauses, Resuming

One of the best things about durable functions and durable execution is the guarantee that executions are driven to completion, even in the presence of crashes, temporary API failures, network issues, etc. That also allows users to build patterns like sagas, where durable execution ensures that compensation happens reliably.

However, this approach can overwhelm APIs, downstream systems, or incur unnecessary load in situations where progress is temporarily impossible due to misconfigurations or removed dependencies. So based on your feedback, we've made several improvements to give you better control over how Restate drives invocations:

Granular retry policies

You can now configure retry behavior per service or per handler. This complements the retry policies that you can attach to an individual step.

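import * as restate from "@restatedev/restate-sdk";

// Add service options to the service definition
const myWorkflow = restate.workflow({
  name: "MyWorkflow",
  handlers: {
    run: async (ctx: restate.Context) => {},
  },
  options: {
    // Retry up to 10 times, backing off from 1s to at most 30s between
    // attempts, then pause the invocation instead of failing it.
    retryPolicy: {
      initialInterval: { seconds: 1 },
      maxInterval: { seconds: 30 },
      maxAttempts: 10,
      onMaxAttempts: "pause",
    },
  },
});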

Check the service configuration docs for more details.
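
To get a feel for what a policy like the one above means in wall-clock time, here is a minimal sketch that previews the waits between attempts. It is not part of the Restate SDK: the helper name, the doubling growth factor, and the choice to count the first attempt towards maxAttempts are all assumptions made for illustration, so check the service configuration docs for the actual semantics.

```typescript
// Hypothetical helper, not a Restate API: previews the waits that a
// policy like the one above might produce.
interface RetryPolicySketch {
  initialIntervalMs: number;
  maxIntervalMs: number;
  maxAttempts: number; // assumed to include the very first attempt
  factor: number; // assumed growth factor; not stated in this post
}

// Returns the wait before each retry. With maxAttempts = N there are
// N - 1 waits, one between each pair of consecutive attempts.
function backoffDelaysMs(policy: RetryPolicySketch): number[] {
  const delays: number[] = [];
  let next = policy.initialIntervalMs;
  for (let attempt = 1; attempt < policy.maxAttempts; attempt++) {
    // Each wait grows by the factor but never exceeds the configured cap.
    delays.push(Math.min(next, policy.maxIntervalMs));
    next *= policy.factor;
  }
  return delays;
}

const delays = backoffDelaysMs({
  initialIntervalMs: 1_000,
  maxIntervalMs: 30_000,
  maxAttempts: 10,
  factor: 2,
});
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, totalMs);
```

Under these assumptions the invocation would wait nine times, for roughly 151 seconds in total, before onMaxAttempts takes effect.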

Pausing Invocations

Sometimes a function execution can make no progress no matter how often it retries, for example when it hits a configuration error, a deterministic bug, or a blocked API key, or when a dependency has been removed. Infinite retries don’t help here - they might even rack up unfortunate bills. At the same time, blindly failing the invocation is not a good default either, as it might skip compensation logic and break consistency assumptions.

In Restate 1.5, invocations can now pause instead of failing when exhausting their retries. Pausing means that they go into a state similar to suspension. The invocation keeps existing, and keeps the journal and ownership of virtual objects, but will not continue unless you explicitly tell it to - it backs off until further notice.

Pausing differs from failing an invocation (via TerminalError or TerminalException) and restarting it later (the dead letter queue use case). A paused invocation holds on to its current journal and Virtual Object lock and waits in the middle of the execution, while a failed invocation would typically roll back its current progress and start again from the beginning when restarted.

You can list and resume paused invocations via UI or CLI.
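
The difference between pausing and restarting is easier to see in a toy model. The sketch below is purely conceptual: none of its types or functions are Restate APIs, and the in-memory journal only imitates the idea that completed steps are kept while an invocation is paused.

```typescript
// Conceptual model only; none of these names are Restate APIs.
type Status = "running" | "paused" | "completed";

interface Invocation {
  journal: string[]; // results of steps that already completed
  status: Status;
}

type Step = () => string; // throws to signal a failed attempt

// Runs the remaining steps, retrying a failing step up to maxAttempts times.
// When the attempts are exhausted the invocation is paused and its journal
// is left untouched, so a later call continues where it stopped.
function drive(inv: Invocation, steps: Step[], maxAttempts: number): Status {
  inv.status = "running";
  while (inv.journal.length < steps.length) {
    const step = steps[inv.journal.length];
    let done = false;
    for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
      try {
        inv.journal.push(step());
        done = true;
      } catch {
        // Failed attempt; a real runtime would back off before retrying.
      }
    }
    if (!done) {
      inv.status = "paused";
      return inv.status;
    }
  }
  inv.status = "completed";
  return inv.status;
}

let apiKeyValid = false; // simulates a misconfiguration fixed later by an operator
let reserveCalls = 0;

const steps: Step[] = [
  () => {
    reserveCalls++;
    return "reserved";
  },
  () => {
    if (!apiKeyValid) throw new Error("API key rejected");
    return "charged";
  },
];

const inv: Invocation = { journal: [], status: "running" };
const first = drive(inv, steps, 3); // second step keeps failing
const journalAtPause = [...inv.journal];

apiKeyValid = true; // root cause fixed
const second = drive(inv, steps, 3); // explicit resume
console.log(first, journalAtPause, second, inv.journal, reserveCalls);
```

In this model the first step runs only once: the resumed run picks up from the journal instead of starting over, which is the property that distinguishes a pause from a restart.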

Restate Cloud is currently configured to pause invocations after 20 retries, to avoid creating excessive bills with services deployed on FaaS. This is a new feature and we would love your feedback on how it works for you and whether we should adjust the defaults - let us know on Discord or Slack, or email us at cloud@restate.dev.

Moving invocations to different deployments

When resuming a paused invocation, you can target a different service deployment. This lets you move an invocation to a new deployment without losing the progress.

This is helpful when you use deployment versioning (which is an awesome way to avoid breaking workflows on code changes) and you need to move invocations explicitly to a different deployment, because of a bug in the original deployment’s application code.

Faster queries, snappier UI

Restate’s observability is based on an embedded SQL engine that analyzes the data stored in the RocksDB tables and in other internal data structures.

We made a series of performance improvements to that engine, and many queries now run 5x to 20x faster, making the UI and CLI experience much snappier.

AWS Lambda: Rust support & payload compression

AWS Lambda imposes a 6 MB size limit on request and response bodies, and throws a PAYLOAD_TOO_LARGE error if that limit is exceeded. With long-running handlers that keep large journals, it is quite possible to reach that limit.

With Restate 1.5 and the latest TypeScript SDK, payloads are automatically compressed when approaching the size limit, pushing the size of journals and payloads you can use with AWS Lambda much further.
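
The general idea of compressing only when a payload gets close to the limit can be sketched with Node's built-in zlib module. This is not how the SDK is implemented: the post does not say which algorithm or threshold Restate uses, so gzip, the 80% threshold, and the envelope shape below are all assumptions chosen for illustration.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// The 6 MB limit comes from the text; the 80% threshold is an assumption.
const LAMBDA_LIMIT_BYTES = 6 * 1024 * 1024;
const COMPRESS_ABOVE_BYTES = Math.floor(LAMBDA_LIMIT_BYTES * 0.8);

// Hypothetical envelope: a flag tells the receiver whether to decompress.
interface Envelope {
  compressed: boolean;
  body: Buffer;
}

// Leaves small payloads untouched and gzips those approaching the limit.
function encodePayload(raw: Buffer, threshold: number = COMPRESS_ABOVE_BYTES): Envelope {
  if (raw.byteLength <= threshold) {
    return { compressed: false, body: raw };
  }
  const body = gzipSync(raw);
  if (body.byteLength > LAMBDA_LIMIT_BYTES) {
    throw new Error(`payload is ${body.byteLength} bytes even after compression`);
  }
  return { compressed: true, body };
}

// Reverses encodePayload on the receiving side.
function decodePayload(envelope: Envelope): Buffer {
  return envelope.compressed ? gunzipSync(envelope.body) : envelope.body;
}

// A repetitive, journal-like payload that is larger than 6 MB uncompressed.
const journal = Buffer.from(JSON.stringify(Array(500_000).fill("step-result")));
const envelope = encodePayload(journal);
const roundTripOk = decodePayload(envelope).equals(journal);
const small = encodePayload(Buffer.from("tiny"));
console.log(journal.byteLength, envelope.compressed, envelope.body.byteLength, roundTripOk);
```

Compression ratios depend heavily on the data: repetitive JSON like this shrinks dramatically, while already-compressed or random bytes barely shrink at all, which is why the sketch still checks the size after compressing.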

The Rust SDK also now supports AWS Lambda.

Full Documentation Overhaul

We've restructured the documentation to improve the getting started experience and better explain core concepts. Here are some of the key additions.

New tutorials for common use cases

Tour of Agents: An introduction to building AI agents with Restate, covering Vercel AI SDK integration, observability, human-in-the-loop patterns, chat, subworkflows, parallelization, and multi-agent systems. Next up: Python OpenAI Agents and Pydantic AI.
Tour of Workflows: A guide for developers familiar with workflow orchestration tools, focused on implementing workflows with signals, queries, and error handling.
Tour of Microservices Orchestration: Covers building backends and microservices with Restate: sagas, communication patterns, idempotency, state management, and concurrent tasks.

AI chat and docs MCP server

Ask questions using the text box at the bottom of the docs, or open the AI assistant.

For Claude/Cursor integration, add the docs as an MCP server:

claude mcp add --transport http docs https://docs.restate.dev/mcp

We've also updated use case pages and added coverage of recent SDK features. Visit docs.restate.dev.

Build with Restate

The fastest way to get started is with Restate Cloud: try it for free and have a managed instance running in minutes. Or follow the quickstart guide to run Restate locally.

Questions about the upgrade path or feedback on these features? Join us on Discord or Slack.

Restate is open, free, and available at GitHub or at the Restate website. Star the GitHub project if you like what we are doing!

Full changelog

For the full list of changes and upgrade notes, see the Restate 1.5 release notes.