Karakeep: Building a scalable bookmark manager with Restate

Karakeep is an open-source, self-hostable bookmark manager that uses AI to automatically tag and organize bookmarks for faster retrieval. When launching a hosted version of Karakeep, the team replaced its SQLite-based job queue with Restate because of its ease of integration and its support for complex job orchestration.

Integrating Restate with Karakeep took around a day of work and it has been running rock-solid in our production environment.

-- Mohamed Bassem - Main engineer
  • GitHub: karakeep-app/karakeep (20k+ stars)
  • Industry: Developer Tools / Open Source
  • Use Case: Async Job Processing
  • Tech Stack: TypeScript, SQLite

Current Architecture & Challenges

Karakeep focuses on the large self-hosting community. As such, it’s optimized for ease of installation, a minimal number of dependencies, and hands-free upgrades. Self-hosted deployments tend to be small, so Karakeep mainly targets single-node deployments and uses SQLite for both the database and the queues.

When a bookmark is added to Karakeep, a set of async jobs is enqueued, such as crawling, tagging, and indexing. Karakeep also has some background processes that run on a schedule, like RSS feed fetching. These async jobs run in separate processes, called Karakeep workers. The queuing system between the web UI and the workers is an in-house, SQLite-based job queue called liteque.

Karakeep architecture

When Karakeep wanted to launch a hosted version of their product, they needed to revisit some of their design choices while still preserving hands-free upgrades for existing users. They needed to:

  • Decouple web and worker processes across separate hosts
  • Replace the SQLite-based queue with a system that supports distributed processing
  • Maintain backward compatibility for existing self-hosted users

Why Restate?

For their cloud deployment, Karakeep needed to replace their SQLite-based queue with a distributed solution. Restate's flexible programming model allowed them to implement it as a plugin behind Karakeep's Queue interface, so self-hosted users continue using liteque while cloud deployments connect to Restate.
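To make the plugin idea concrete, here is a minimal sketch of a queue abstraction with two interchangeable backends. It is not Karakeep's actual Queue interface: the names JobQueue, LocalQueue, RestateQueue, createQueue, the handler name "run", and the localhost URL are all assumptions made for illustration. The HTTP transport is injected so the sketch runs without a Restate server.

```typescript
// Hypothetical queue abstraction; names are illustrative, not Karakeep's API.
interface JobQueue<T> {
  enqueue(payload: T): Promise<string>; // resolves to a job id
}

// Stand-in for a local single-node backend such as liteque (here just an array).
class LocalQueue<T> implements JobQueue<T> {
  jobs: { id: string; payload: T }[] = [];

  async enqueue(payload: T): Promise<string> {
    const id = `local-${this.jobs.length + 1}`;
    this.jobs.push({ id, payload });
    return id;
  }
}

// Injected transport, so no network is needed to exercise the sketch.
type PostJson = (url: string, body: unknown) => Promise<{ invocationId: string }>;

class RestateQueue<T> implements JobQueue<T> {
  private ingressUrl: string;
  private service: string;
  private post: PostJson;

  constructor(ingressUrl: string, service: string, post: PostJson) {
    this.ingressUrl = ingressUrl;
    this.service = service;
    this.post = post;
  }

  async enqueue(payload: T): Promise<string> {
    // Assumption: the service exposes a handler named "run"; the "/send"
    // suffix asks the ingress for a one-way (fire-and-forget) invocation.
    const url = `${this.ingressUrl}/${this.service}/run/send`;
    const res = await this.post(url, payload);
    return res.invocationId;
  }
}

// Callers depend only on JobQueue; the backend is chosen by configuration.
function createQueue<T>(
  backend: "local" | "restate",
  service: string,
  post: PostJson,
): JobQueue<T> {
  return backend === "restate"
    ? new RestateQueue<T>("http://localhost:8080", service, post)
    : new LocalQueue<T>();
}
```

Because the web process only sees the JobQueue interface, switching a deployment from the local backend to Restate would be a configuration change rather than a code change in the callers.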

Restate's Virtual Objects provided an elegant way to model complex job orchestration. Priority-based semaphores and rate limiting (per-domain and per-user) could be implemented as Virtual Objects, naturally handling concurrency limits across different job types.
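The core logic of a priority-based semaphore can be sketched in a few lines. This is an illustrative, in-memory version and not Karakeep's code: the class name, method names, and tie-breaking rule are assumptions. In a Virtual Object, the same state would be kept in the object's key-value store, and Restate's one-at-a-time execution per object key would take the place of any locking.

```typescript
// Illustrative only: plain in-memory state instead of Virtual Object state.
interface Waiter {
  jobId: string;
  priority: number;
  seq: number; // arrival order, used to break ties
}

class PrioritySemaphore {
  private held = new Set<string>();
  private waiting: Waiter[] = [];
  private seq = 0;
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns true if a permit was granted immediately, false if the job was queued.
  acquire(jobId: string, priority: number): boolean {
    if (this.held.size < this.capacity) {
      this.held.add(jobId);
      return true;
    }
    this.waiting.push({ jobId, priority, seq: this.seq++ });
    return false;
  }

  // Frees a permit and hands it to the best waiter; returns that job id, if any.
  release(jobId: string): string | undefined {
    if (!this.held.delete(jobId)) return undefined; // unknown holder: no-op
    // Higher priority first; equal priorities are served in arrival order.
    this.waiting.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
    const next = this.waiting.shift();
    if (next) this.held.add(next.jobId);
    return next?.jobId;
  }
}
```

One semaphore instance per job type (or per tenant) would cap how many jobs of that kind run concurrently while letting urgent work jump the queue.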

A single Docker Compose setup gave them a production-ready server with a built-in admin UI and powerful introspection APIs. The entire integration took around a day of work and largely amounted to wrapping existing job logic in Restate's Context::run calls.
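The value of wrapping job steps in run calls is that each step's result is recorded, so a retried job resumes after the last completed step instead of repeating it. The sketch below illustrates that idea with an in-memory journal. It is not Restate's implementation or SDK: JournalContext, crawlJob, and the step names are hypothetical stand-ins that only mimic the shape of a run(name, action) call.

```typescript
// In-memory stand-in for a durable step API: run(name, action) records the
// action's result so that a retry reuses it instead of executing it again.
class JournalContext {
  private journal: Map<string, unknown>;

  constructor(journal: Map<string, unknown>) {
    this.journal = journal;
  }

  async run<T>(name: string, action: () => Promise<T>): Promise<T> {
    if (this.journal.has(name)) return this.journal.get(name) as T; // replay
    const result = await action();
    this.journal.set(name, result); // recorded only if the action succeeded
    return result;
  }
}

// Hypothetical existing job logic, wrapped one step at a time.
async function crawlJob(
  ctx: JournalContext,
  url: string,
  fetchPage: (u: string) => Promise<string>,
  indexPage: (html: string) => Promise<void>,
): Promise<number> {
  const html = await ctx.run("fetch", () => fetchPage(url));
  await ctx.run("index", () => indexPage(html));
  return html.length;
}
```

If indexing fails on the first attempt, a second attempt that shares the same journal skips the fetch and only retries the indexing step.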

Running a production-ready restate server with a single docker compose up was a breeze.

-- Mohamed Bassem - Main engineer

The Results

Architecture Implementation:

  • Each background job type (crawling, tagging, indexing) modeled as a Restate service
  • Priority-based semaphore and rate limiting (per-domain, per-user) implemented as a Restate Virtual Object
  • Karakeep's admin dashboard uses Restate's introspection API to show queue statistics
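
Per-domain and per-user rate limiting can be pictured as one token bucket per key. The following is a minimal sketch under assumed names and limits (KeyedRateLimiter, tryAcquire, capacity, refillPerSec), not Karakeep's implementation; the current time is passed in explicitly so the behavior is deterministic.

```typescript
// Illustrative token bucket, one per key (for example a domain or a user id).
interface Bucket {
  tokens: number;
  updatedAt: number; // milliseconds
}

class KeyedRateLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;
  private refillPerSec: number;

  constructor(capacity: number, refillPerSec: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
  }

  // Returns true and consumes one token if the key is under its limit.
  tryAcquire(key: string, nowMs: number): boolean {
    // A key seen for the first time starts with a full bucket.
    const b = this.buckets.get(key) ?? { tokens: this.capacity, updatedAt: nowMs };
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSec = (nowMs - b.updatedAt) / 1000;
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillPerSec);
    b.updatedAt = nowMs;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(key, b);
    return allowed;
  }
}
```

Keying the limiter by domain keeps a burst of bookmarks from one site from overloading that site's server, while other domains proceed unaffected.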