Karakeep: Building a scalable bookmark manager with Restate
Karakeep is an open-source, self-hostable bookmark manager that uses AI to automatically tag and organize bookmarks for faster retrieval. When launching a hosted version of Karakeep, the team replaced their SQLite-based job queue with Restate because of its ease of integration and support for complex job orchestration.
Integrating Restate with Karakeep took around a day of work and it has been running rock-solid in our production environment.
-- Mohamed Bassem - Main engineer
- GitHub: karakeep-app/karakeep (20k+ stars)
- Industry: Developer Tools / Open Source
- Use Case: Async Job Processing
- Tech Stack: TypeScript, SQLite
Current Architecture & Challenges
Karakeep focuses on the large self-hosting community. As such, it’s optimized for ease of installation, minimal number of dependencies and hands-free upgrades. Self-hosted deployments tend to be small, so Karakeep mainly focuses on single-node deployments and uses SQLite for the database and queues.
When a bookmark is added to Karakeep, a set of async jobs is enqueued, such as crawling, tagging, and indexing. Karakeep also has some background processes that run on a schedule, like RSS feed fetching. These async jobs run in separate processes, called Karakeep workers. The queuing system between the web UI and the workers is an in-house, SQLite-based job queue called liteque.
When Karakeep wanted to launch a hosted version of their product, they needed to revisit some of their design choices while still keeping upgrades hands-free for existing users. They needed to:
- Decouple web and worker processes across separate hosts
- Replace the SQLite-based queue system for distributed processing
- Maintain backward compatibility for existing self-hosted users
Why Restate?
For their cloud deployment, Karakeep needed to replace their SQLite-based queue with a distributed solution. Restate's flexible programming model let them implement it as a plugin behind Karakeep's Queue interface, so self-hosted users continue using liteque while cloud deployments connect to Restate.
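To make the plugin idea concrete, here is a minimal sketch of a queue abstraction with two interchangeable backends. The names Job, Queue, LocalQueue, RemoteQueue, Transport and scheduleCrawl are hypothetical and chosen for illustration; they are not Karakeep's actual interface, and the remote backend uses an injected function rather than a real Restate client.

```typescript
// Hypothetical job and queue shapes; Karakeep's real Queue interface may differ.
interface Job<T> {
  id: number;
  payload: T;
}

interface Queue<T> {
  enqueue(payload: T): Job<T>;
}

// Stand-in for the single-node backend (liteque in the real project):
// jobs stay on the local host, here in an array instead of SQLite.
class LocalQueue<T> implements Queue<T> {
  readonly jobs: Job<T>[] = [];
  private nextId = 1;

  enqueue(payload: T): Job<T> {
    const job: Job<T> = { id: this.nextId++, payload };
    this.jobs.push(job);
    return job;
  }
}

// Stand-in for the distributed backend: each job is handed to an injected
// transport function, which in a real plugin would call the remote service.
type Transport<T> = (queueName: string, job: Job<T>) => void;

class RemoteQueue<T> implements Queue<T> {
  private readonly queueName: string;
  private readonly send: Transport<T>;
  private nextId = 1;

  constructor(queueName: string, send: Transport<T>) {
    this.queueName = queueName;
    this.send = send;
  }

  enqueue(payload: T): Job<T> {
    const job: Job<T> = { id: this.nextId++, payload };
    this.send(this.queueName, job);
    return job;
  }
}

// Application code depends only on Queue<T>, so the backend is a deployment choice.
function scheduleCrawl(queue: Queue<{ url: string }>, url: string): Job<{ url: string }> {
  return queue.enqueue({ url });
}
```

Because callers only see the interface, a self-hosted install could construct the local backend and a cloud deployment the remote one without touching the code that enqueues jobs.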
Restate's Virtual Objects provided an elegant way to model complex job orchestration. Priority-based semaphores and rate limiting (per-domain and per-user) could be implemented as Virtual Objects, naturally handling concurrency limits across different job types.
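As an illustration of the semaphore logic, the sketch below models a priority-based semaphore as plain keyed state, one instance per job type. It is not Restate SDK code and not Karakeep's implementation: PrioritySemaphore, semaphoreFor and the grant-on-release policy are assumptions made for this example. In a Virtual Object, the same state would live under the object's key, where Restate runs state-mutating handlers for a given key one at a time, so the acquire and release steps would not race.

```typescript
// Hypothetical waiter record: a job waiting for a permit.
interface Waiter {
  jobId: string;
  priority: number; // larger number = more urgent (an assumption of this sketch)
  seq: number; // arrival order, used to break ties
}

class PrioritySemaphore {
  private readonly capacity: number;
  private readonly holders = new Set<string>();
  private waiters: Waiter[] = [];
  private seq = 0;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns true if a permit was granted immediately, false if the job was queued.
  acquire(jobId: string, priority: number): boolean {
    if (this.holders.size < this.capacity) {
      this.holders.add(jobId);
      return true;
    }
    this.waiters.push({ jobId, priority, seq: this.seq++ });
    return false;
  }

  // Frees the permit held by jobId and passes it to the best waiter, if any.
  // Returns the id of the job that now holds the permit.
  release(jobId: string): string | undefined {
    if (!this.holders.delete(jobId)) return undefined;
    if (this.waiters.length === 0) return undefined;
    // Highest priority first; first-come, first-served among equal priorities.
    this.waiters.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
    const next = this.waiters.shift()!;
    this.holders.add(next.jobId);
    return next.jobId;
  }
}

// One semaphore per key (for example per job type), mirroring keyed object state.
const semaphores = new Map<string, PrioritySemaphore>();

function semaphoreFor(key: string, capacity: number): PrioritySemaphore {
  let sem = semaphores.get(key);
  if (!sem) {
    sem = new PrioritySemaphore(capacity);
    semaphores.set(key, sem);
  }
  return sem;
}
```

With a capacity of one, a low-priority job that arrives first runs immediately, and when it releases the permit the highest-priority waiter is served next regardless of arrival order.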
A single Docker Compose setup gave them a production-ready server with a built-in admin UI and powerful introspection APIs. The entire integration took around a day of work and mostly consisted of wrapping existing job logic in Restate's Context::run calls.
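Wrapping a step in a run call matters because Restate records the step's result in a journal and replays it on retry instead of executing the step again. The toy model below shows that behavior in isolation. It is deliberately not the Restate SDK: JournalingContext, crawl, inferTags and processBookmark are hypothetical names, the journal is an in-memory Map, and the calls are synchronous for brevity, whereas the real SDK is asynchronous.

```typescript
// Toy stand-in for a durable execution context. NOT the Restate SDK.
class JournalingContext {
  private readonly journal: Map<string, unknown>;

  constructor(journal: Map<string, unknown>) {
    this.journal = journal;
  }

  // Runs the action once per step name; later attempts reuse the recorded result.
  run<T>(name: string, action: () => T): T {
    if (this.journal.has(name)) {
      return this.journal.get(name) as T;
    }
    const result = action();
    this.journal.set(name, result);
    return result;
  }
}

// Hypothetical existing job logic with observable side effects.
let crawlCalls = 0;
function crawl(url: string): string {
  crawlCalls++;
  return `<html>${url}</html>`;
}

let tagCalls = 0;
function inferTags(html: string): string[] {
  tagCalls++;
  return html.includes("example") ? ["example"] : [];
}

// The job wraps each side-effecting step in ctx.run.
// crashAfterCrawl simulates a worker failure between the two steps.
function processBookmark(ctx: JournalingContext, url: string, crashAfterCrawl: boolean): string[] {
  const html = ctx.run("crawl", () => crawl(url));
  if (crashAfterCrawl) {
    throw new Error("simulated worker crash");
  }
  return ctx.run("tag", () => inferTags(html));
}
```

If the first attempt crashes after crawling and the job is retried against the same journal, the crawl step is skipped and only the tagging step executes.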
Running a production-ready Restate server with a single docker compose up was a breeze.
-- Mohamed Bassem - Main engineer
The Results
Architecture Implementation:
- Each background job type (crawling, tagging, indexing) modeled as a Restate service
- Priority-based semaphore and rate limiting (per-domain, per-user) implemented as a Restate Virtual Object
- Karakeep's admin dashboard uses Restate's introspection API to show queue statistics
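The per-domain and per-user rate limiting mentioned above can be pictured as one small piece of state per key. The sketch below uses a token bucket, which is an assumption of this example since the source does not say which algorithm Karakeep uses; KeyedRateLimiter and its parameters are hypothetical names. Time is passed in explicitly so the logic stays deterministic.

```typescript
// Per-key bucket state, comparable to what a keyed object would store.
interface Bucket {
  tokens: number;
  updatedAtMs: number;
}

class KeyedRateLimiter {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly buckets = new Map<string, Bucket>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the caller may proceed now for this key (domain or user id).
  tryAcquire(key: string, nowMs: number): boolean {
    // A key seen for the first time starts with a full bucket.
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, updatedAtMs: nowMs };

    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSeconds = (nowMs - bucket.updatedAtMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAtMs = nowMs;

    const allowed = bucket.tokens >= 1;
    if (allowed) {
      bucket.tokens -= 1;
    }
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

Each key is limited independently, so a burst of crawls against one domain does not delay jobs for another domain or user.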