Three Storage Modes in RobustMQ

The storage engine is the heart of a message queue. It determines performance limits, cost floor, and which scenarios the system can serve. When designing RobustMQ’s storage layer, I spent a long time on one question: How can one architecture support Kafka-style high throughput, massive topic counts, and NATS-style extreme performance?

After studying various industry solutions, my answer was: don’t solve everything with one engine—offer three storage engines and choose automatically based on scenario. This post shares the design and implementation considerations for RobustMQ’s storage layer.

Design Considerations and Scenario Selection

Several considerations drive this design.

RobustMQ’s goal is "all in one." The upper layer already supports multiple protocols, so the core part—the storage layer—must cover scenarios as completely as possible. Message queues face varied needs: extreme performance, huge numbers of topics, high throughput, low cost. A single storage engine cannot optimize all of these. From an engineering standpoint, I provide three engines with shared abstractions, each tuned for specific scenarios, so RobustMQ can serve a wide range of use cases.

Architecture must be extensible. Even if only one storage is used initially, the design should allow future extensions. A unified abstraction, clear layering, and shared ISR mechanism ensure that adding new engines or features won’t require large rewrites.

Feasibility matters too. Three engines are not built all at once but validated in stages. Each stage has clear goals and verifiable outcomes. That keeps direction correct and avoids over-engineering.

Cost and performance need balance. Not all data gets the highest protection (multi-replica file segments). By data characteristics: temporary data uses memory (lowest cost), normal data uses RocksDB single replica (balance of performance and cost), critical data uses file segments with multiple replicas (highest reliability). Fine-grained resource allocation keeps overall cost lower.

Engine selection is automatic, driven by protocol and configuration rather than manual tuning. For MQTT, the storage type (Memory or RocksDB) depends on whether persistence is required. MQTT's characteristics are many topics, offline messages, and often small per-topic volume. For massive-topic scenarios, RocksDB is the default (it avoids file explosion); for high throughput with few topics, file segments perform better. For the Kafka protocol, file segments are the default because Kafka's semantics are persistent logs. Users can override for specific topics: memory (e.g., temporary metrics) or RocksDB (small per-topic volume but many topics).

This automatic selection is driven by protocol characteristics, not complex AI analysis. It’s simple, predictable, and debuggable. Users can always override via config when needed.
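The selection logic described above can be sketched as a simple match over protocol and config. This is an illustrative sketch, not RobustMQ's actual code; the type and function names (`Protocol`, `StorageType`, `select_engine`) are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Protocol {
    Mqtt,
    Kafka,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum StorageType {
    Memory,
    RocksDb,
    FileSegment,
}

/// Pick a default engine from protocol characteristics; an explicit
/// per-topic override in user config always wins.
fn select_engine(
    protocol: Protocol,
    persistent: bool,
    override_type: Option<StorageType>,
) -> StorageType {
    if let Some(t) = override_type {
        return t; // user config takes precedence
    }
    match protocol {
        // MQTT: many small topics; RocksDB avoids file explosion,
        // Memory serves data that does not need persistence.
        Protocol::Mqtt if persistent => StorageType::RocksDb,
        Protocol::Mqtt => StorageType::Memory,
        // Kafka semantics are persistent logs: file segments by default.
        Protocol::Kafka => StorageType::FileSegment,
    }
}
```

Because the decision is a pure function of protocol and config, it is easy to test, predict, and debug, which is exactly the point made above.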

The Three Storage Engines

RobustMQ’s storage layer supports three engines: Memory, RocksDB, and File Segment. Each targets different scenario characteristics.

Memory is pure in-memory storage with no persistence. Messages are written to and read from memory. Latency drops to microseconds; throughput can reach tens of millions of QPS. The tradeoff is the lack of persistence: a restart or node failure loses the data.

Memory fits scenarios where data can be lost: MQTT QoS 0 device heartbeats, real-time monitoring metrics, temporary service notifications. This data’s value is current state; history doesn’t matter. Memory storage gives best performance and lowest cost.

RocksDB addresses the "too many topics" problem. When there are tens of thousands or millions of topics but each topic is small, the classic "one topic, one set of files" approach explodes the file system. RocksDB stores messages from all topics in a unified KV store, distinguished by keys (topic_id:offset). Under the hood it’s an LSM-Tree; regardless of topic count, the physical files are a fixed set of SSTs.
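A key layout like `topic_id:offset` can be encoded so that keys for one topic sort by offset, which makes range scans from a given offset cheap. The sketch below is an illustration of that idea, not RobustMQ's actual encoding; the fixed-width big-endian layout is an assumption.

```rust
/// Illustrative key layout for storing all topics in one KV space:
/// an 8-byte topic id prefix plus a big-endian offset, so keys sort
/// first by topic and then by offset within a topic.
fn message_key(topic_id: u64, offset: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(16);
    key.extend_from_slice(&topic_id.to_be_bytes());
    key.extend_from_slice(&offset.to_be_bytes());
    key
}
```

Big-endian byte order matters here: it makes the numeric order of offsets agree with the lexicographic order the KV store uses for its keys.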

Typical RocksDB scenarios: IoT device management and multi-tenant SaaS. One topic per device—a million devices mean a million topics. One topic per tenant—hundreds of thousands of tenants mean hundreds of thousands of topics. In these cases, most topics have small volumes—maybe a few to dozens of messages per second. File segments would create huge file counts; RocksDB keeps file count bounded, and LSM-Tree handles small concurrent writes well.

File Segment is for high-throughput scenarios. Messages are written in offset order into fixed-size segment files (e.g., 1GB); when full, a new segment is created. This append-only design maximizes sequential write performance; a single node can support hundreds of thousands of QPS. Segment files can also be used directly for data lake integration, e.g., converted to Parquet for analytics.
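The append-and-roll behavior can be modeled in a few lines. This is a minimal in-memory sketch of the mechanism described above, with hypothetical names (`Segment`, `SegmentLog`); real segments are files on disk and far larger than the sizes used here.

```rust
struct Segment {
    base_offset: u64, // first offset stored in this segment
    size: u64,        // bytes written so far
}

struct SegmentLog {
    max_segment_size: u64, // e.g. 1 GB in practice; tiny here for illustration
    segments: Vec<Segment>,
    next_offset: u64,
}

impl SegmentLog {
    fn new(max_segment_size: u64) -> Self {
        Self {
            max_segment_size,
            segments: vec![Segment { base_offset: 0, size: 0 }],
            next_offset: 0,
        }
    }

    /// Append a message of `len` bytes; when the active segment is
    /// full, roll over to a new one. Returns the message's offset.
    fn append(&mut self, len: u64) -> u64 {
        if self.segments.last().unwrap().size + len > self.max_segment_size {
            self.segments.push(Segment {
                base_offset: self.next_offset,
                size: 0,
            });
        }
        self.segments.last_mut().unwrap().size += len;
        let offset = self.next_offset;
        self.next_offset += 1;
        offset
    }
}
```

Because every write goes to the tail of the active segment, the disk only ever sees sequential appends, which is where the throughput advantage comes from.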

File Segment fits classic Kafka use cases: log collection, user behavior tracking, CDC, real-time data pipelines. These have large per-topic volumes—tens of thousands to hundreds of thousands of messages per second—but fewer topics (tens to hundreds). Sequential write and read; file segments perform best.

Unified Abstraction and Layered Design

The three engines differ in implementation but expose a unified interface. This abstraction is central to the design.

The storage engine interface is simple: append writes a message and returns its offset; fetch reads by offset; current_offset returns the latest position. Protocol and replication layers call this interface without knowing which engine is underneath.
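Rendered as a Rust trait, the interface might look like the sketch below, with the memory engine as the simplest implementor. The trait and type names are hypothetical; RobustMQ's actual definitions may differ.

```rust
/// Hypothetical rendering of the unified interface described above.
trait StorageEngine {
    /// Append a message, returning its assigned offset.
    fn append(&mut self, message: Vec<u8>) -> u64;
    /// Read up to `max` messages starting at `offset`.
    fn fetch(&self, offset: u64, max: usize) -> Vec<Vec<u8>>;
    /// The next offset to be assigned (i.e., the latest position).
    fn current_offset(&self) -> u64;
}

/// The memory engine is the simplest implementor: a Vec indexed by offset.
struct MemoryEngine {
    messages: Vec<Vec<u8>>,
}

impl StorageEngine for MemoryEngine {
    fn append(&mut self, message: Vec<u8>) -> u64 {
        self.messages.push(message);
        (self.messages.len() - 1) as u64
    }
    fn fetch(&self, offset: u64, max: usize) -> Vec<Vec<u8>> {
        self.messages
            .iter()
            .skip(offset as usize)
            .take(max)
            .cloned()
            .collect()
    }
    fn current_offset(&self) -> u64 {
        self.messages.len() as u64
    }
}
```

A RocksDB or file-segment engine would implement the same three methods, and everything above the trait boundary stays unchanged.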

This abstraction enables flexibility. MQTT and Kafka protocol layers share the same logic—calls to the unified interface. Switching storage engines doesn’t require protocol changes. Replication is generic: it pulls from the Leader and writes to Followers through this interface, regardless of storage.

Layering also allows independent optimization. Storage engines focus on performance (faster write, faster read); replication focuses on reliability (no data loss, fast recovery); protocol layer focuses on compatibility (full Kafka semantics). Each layer has clear responsibilities.

Universality of Replication and Storage Switching

The ISR replication mechanism is shared across all three engines. That’s a design highlight.

Users configure topic replica count; the system decides whether to enable ISR. Single-replica topics skip ISR; data is written and read locally. Multi-replica topics use ISR: Leader writes, Followers sync from Leader.

This applies identically to all three engines. Two replicas with memory storage: Leader writes to memory, Follower pulls and writes to its own memory. Three replicas with file segments: Leader writes to local files, Followers pull and write to theirs. Five replicas with RocksDB: same logic.
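The reason one replication mechanism covers all three cases is that the sync loop only touches the shared interface. A minimal sketch, assuming an `append`/`fetch`/`current_offset`-style trait (names here are illustrative, not RobustMQ's):

```rust
trait Engine {
    fn append(&mut self, msg: Vec<u8>) -> u64;
    fn fetch_from(&self, offset: u64) -> Vec<Vec<u8>>;
    fn current_offset(&self) -> u64;
}

/// Follower catch-up: pull everything past the follower's own latest
/// offset from the leader and append it locally. Nothing here depends
/// on whether the engine is memory, RocksDB, or file segments.
fn sync_follower<E: Engine>(leader: &E, follower: &mut E) {
    for msg in leader.fetch_from(follower.current_offset()) {
        follower.append(msg);
    }
}

/// A trivial Vec-backed engine, standing in for any real engine.
struct VecEngine(Vec<Vec<u8>>);

impl Engine for VecEngine {
    fn append(&mut self, msg: Vec<u8>) -> u64 {
        self.0.push(msg);
        (self.0.len() - 1) as u64
    }
    fn fetch_from(&self, offset: u64) -> Vec<Vec<u8>> {
        self.0[offset as usize..].to_vec()
    }
    fn current_offset(&self) -> u64 {
        self.0.len() as u64
    }
}
```

Swap `VecEngine` for any other implementor of the trait and `sync_follower` works unchanged, which is the reuse property the text describes.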

ISR is implemented once and reused for all storage engines. That reduces complexity and simplifies testing. ISR correctness and storage engine behavior can be tested separately, then combined.

When a topic’s data characteristics change, switching storage may be needed. E.g., a topic starts small with RocksDB; as it grows to tens of thousands of messages per second, it should switch to file segments.

We handle this without data migration—instead, we aggregate on read. When switching, new data goes to the new engine; old data stays in the old one. On read, the system reads from both and returns results sorted by offset to the client. After old data passes retention (e.g., 7 days), it’s deleted.
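The aggregated read can be sketched as follows, assuming offsets stay globally continuous across the switch: everything below a `switch_offset` lives in the old engine, everything at or above it in the new one. The struct and field names are illustrative, with plain `Vec`s standing in for the two engines.

```rust
struct SwitchedTopic {
    old: Vec<Vec<u8>>,  // stand-in for the old engine's data
    new: Vec<Vec<u8>>,  // stand-in for the new engine's data
    switch_offset: u64, // first offset written to the new engine
}

impl SwitchedTopic {
    /// Fetch up to `max` messages starting at `offset`, spanning both
    /// engines and returning the results in offset order.
    fn fetch(&self, offset: u64, max: usize) -> Vec<Vec<u8>> {
        let mut out = Vec::new();
        let mut pos = offset;
        while out.len() < max {
            let msg = if pos < self.switch_offset {
                self.old.get(pos as usize)
            } else {
                self.new.get((pos - self.switch_offset) as usize)
            };
            match msg {
                Some(m) => out.push(m.clone()),
                None => break,
            }
            pos += 1;
        }
        out
    }
}
```

Once retention expires the old engine's data, reads fall entirely on the new engine and the `old` side can be dropped, with no migration job ever having run.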

This avoids migration complexity: no background migration job, no consistency issues during migration, no rollback worries. We use the natural expiration of message queue data to turn migration into "wait until expiry." Simple, reliable, elegant.

Current Progress and Summary

The basic framework for all three engines is done. Memory, RocksDB, and File Segment implement the unified storage interface; the protocol layer can use them transparently. Via configuration or protocol characteristics, the system chooses the right engine. The aggregation mechanism for storage switching has been validated.

ISR replication is the next focus. The design already supports a shared ISR for all three engines, but implementation hasn’t started. That includes Leader election, Follower sync, fault recovery, and other core distributed replication logic. Once replication is solid, we can offer production-grade HA.

This staged path validates the architecture while keeping engineering feasible. Storage engine logic is in place; next we focus on distributed consistency and build a solid foundation step by step.

RobustMQ’s storage layer uses three engines (memory, RocksDB, file segment), a unified abstraction, and shared ISR. This covers scenarios from extreme performance to massive topics to high throughput while keeping the design simple and extensible.

Storage is selected automatically by protocol; users can override. Replica count is configurable; ISR applies to all engines. Storage switching uses aggregation on read, avoiding migration complexity.

This isn’t theoretical innovation but engineering combination. We combine existing storage ideas (sequential files, LSM-Tree, memory) in a unified architecture, managed by clear abstractions and sensible policies. That’s how foundational software innovates in maturity: not new theory, but engineering done well.

Implementation is phased. The three engines’ basic framework is done; replication is next. The architecture looks ahead, but implementation stays pragmatic—extensibility for the long term, without over-design.

Strong foundations make the rest possible. That’s the core idea behind RobustMQ’s storage design.