
Our Answer: mq9

In the previous post, "The Next Decade of Message Queues: Forget Kafka," I offered a few judgments: append-only log is not the whole of communication; the S3 and data lake narrative is where that framework runs out of road; the underlying layer must be atomic; and the evaluation framework needs to shift from "volume" to "variety."

That post only raised questions; it didn't propose solutions. This one is about our answer.


Where We Started

RobustMQ has been in development for three years. It started by solving the problem of multi-protocol fragmentation — a unified storage layer, with protocols as read/write interfaces: a message is written once, and MQTT, Kafka, NATS, and AMQP each read it according to their own semantics.

As we built it, we noticed a side effect of this architecture: adding a new protocol costs almost nothing. The storage layer stays unchanged; a protocol is just an additional semantic parsing layer.

mq9 grew from this foundation as the fifth protocol — designed specifically for AI Agent communication.


Why a Mailbox

In the previous post I said communication takes far richer forms than the append-only log — mailboxes, task queues, latest values, request-reply, temporary channels. mq9 chose the most fundamental one: the mailbox.

Why? Because Agents are not services. Services are long-running and always reachable; you send a request and they respond. Agents are ephemeral, may live for only a few seconds, and go online and offline at any time. Today Agent A sends a message to Agent B; B is offline, and the message is gone.

Every team building multi-Agent systems is using temporary workarounds — Redis pub/sub, database polling, homegrown task queues. Workable, but all workarounds.

mq9 solves it directly: I send it; you pick it up when you're free; no need to be online simultaneously. Not a message queue, not a stream — it's a mail system.

The mailbox metaphor is not rhetoric; it is the design itself:

  • Register a mailbox, get an address → MAILBOX.CREATE
  • Send mail, regardless of whether the other party is home → MAILBOX.MSG.{mail_id}.{priority}
  • Receive mail, urgent items first → subscribing delivers all non-expired messages in full
  • Reply, the letter contains the other party's address → reply_to
  • Unused for a long time, the post office automatically closes it → TTL auto-destroy, no manual deletion
  • Public mailbox, anyone can find it and send to it → public: true, registered to PUBLIC.LIST
  • A job posted to the board, only one person can take it → queue group

One Concept, Three Operations

mq9 has a single core concept: the mailbox.

The entire interface is three operations:

| Operation | Description |
| --- | --- |
| MAILBOX.CREATE | Create a mailbox |
| MAILBOX.MSG.{mail_id}.{priority} | Send a message |
| MAILBOX.MSG.{mail_id}.* | Subscribe to a mailbox |

No QUERY — every subscription delivers all non-expired messages in full; subscribing is querying. No DELETE — pure TTL management, auto-cleanup on expiry.
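As a rough illustration of how the send and subscribe operations line up, the sketch below builds the two subject strings and checks them against the NATS single-token `*` wildcard. The `$mq9.AI` prefix follows the full example subject shown later in this post; the helper names (`send_subject`, `subscribe_subject`, `subject_matches`) are hypothetical and not part of any mq9 SDK, and only the `*` wildcard is modeled.

```python
PREFIX = "$mq9.AI"  # namespace prefix, as in the example subject in this post


def send_subject(mail_id: str, priority: str) -> str:
    """Subject a sender publishes to: MAILBOX.MSG.{mail_id}.{priority}."""
    return f"{PREFIX}.MAILBOX.MSG.{mail_id}.{priority}"


def subscribe_subject(mail_id: str) -> str:
    """Subject a receiver subscribes to: every priority of one mailbox."""
    return f"{PREFIX}.MAILBOX.MSG.{mail_id}.*"


def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match where '*' stands for exactly one dot-separated token."""
    p_tokens, s_tokens = pattern.split("."), subject.split(".")
    if len(p_tokens) != len(s_tokens):
        return False
    return all(p == "*" or p == s for p, s in zip(p_tokens, s_tokens))
```

A single subscription on `subscribe_subject(mail_id)` therefore covers every priority level of that mailbox, which is why no separate query operation is needed.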

mq9 only handles communication — how to reliably deliver messages. What the message contains and what the business semantics are — mq9 does not care. Message content is a byte array: not parsed, not validated, not restricted.

Private and Public Mailboxes

Mailboxes come in two kinds, differing only in discoverability.

Private mailbox — mail_id is system-generated and unguessable. Knowing the mail_id lets you send and subscribe; without it, you can't interact. No token, no ACL — mail_id unguessability is the security boundary. Used for point-to-point messaging, task result delivery.

Public mailbox — mail_id is user-defined with a meaningful name. Created with public: true, automatically registered to $mq9.AI.PUBLIC.LIST, discoverable by any Agent. Used for task queues, public channels, capability announcements.
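The difference between the two kinds can be sketched in a few lines. This is a toy model, not the broker's implementation: the `create_mailbox` function, the `mail-` plus hex id format, and the in-memory `public_list` set are all assumptions made for illustration.

```python
import secrets
from typing import Optional

public_list: set = set()  # stand-in for $mq9.AI.PUBLIC.LIST


def create_mailbox(name: Optional[str] = None, public: bool = False) -> str:
    """Return the mail_id of a new mailbox.

    Private: the id is random, so only parties that were told it can use it.
    Public: the caller picks a meaningful name and it is listed for discovery.
    """
    if public:
        if not name:
            raise ValueError("a public mailbox needs a user-defined name")
        public_list.add(name)
        return name
    # 16 random bytes (128 bits) rendered as hex; the id format is assumed.
    return "mail-" + secrets.token_hex(16)
```

Because the private id is the only credential, it should be handed out the way a secret is: to the intended sender or receiver and nobody else.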

Three Priority Levels

| Level | Semantics | Typical use |
| --- | --- | --- |
| high | Urgent, processed first | Task interrupts, emergency commands |
| normal | Routine communication | Task dispatch, result delivery |
| low | Background, not urgent | Logs, status reports |

The storage layer guarantees priority ordering; consumers need not sort themselves. An edge device that has been offline for 8 hours reconnects and receives the urgent shutdown command first, then the routine configuration update.
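The ordering a reconnecting subscriber sees can be sketched as a sort over the stored backlog. The level names follow the table above; the `Stored` record, the `seq` field, and first-in-first-out order within a level are assumptions, since the post only states that priority ordering is guaranteed.

```python
from dataclasses import dataclass
from typing import List

# Lower rank is delivered first; level names follow the table above.
PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}


@dataclass
class Stored:
    seq: int        # arrival order assigned by the store (assumed)
    priority: str
    payload: bytes


def delivery_order(backlog: List[Stored]) -> List[Stored]:
    """Sort by priority first, then by arrival order within a priority."""
    return sorted(backlog, key=lambda m: (PRIORITY_RANK[m.priority], m.seq))
```

With a backlog of a routine configuration update, a log line, and a shutdown command that arrived last, `delivery_order` puts the shutdown command at the front.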

Public Mailbox Discovery

$mq9.AI.PUBLIC.LIST is a system-managed address maintained by the broker. It does not accept user writes and never expires.

Subscribing delivers all current public mailboxes immediately, then streams additions and removals in real time. Subscribe once at startup and continuously sense which public channels exist in the network. No registry service needed — PUBLIC.LIST is the directory.

```bash
nats sub '$mq9.AI.PUBLIC.LIST'
```
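On the client side, "snapshot first, then additions and removals" amounts to keeping a small set up to date. The sketch below assumes each delivery is an event of the form `{"op": "add" | "remove", "mail_id": ...}` and that the initial snapshot arrives as a run of `add` events; the post does not specify the payload format, so treat both as placeholders.

```python
class PublicDirectory:
    """Client-side view of which public mailboxes currently exist."""

    def __init__(self) -> None:
        self.mailboxes: set = set()

    def apply(self, event: dict) -> None:
        # Assumed event shape: {"op": "add" | "remove", "mail_id": "..."}
        if event["op"] == "add":
            self.mailboxes.add(event["mail_id"])
        elif event["op"] == "remove":
            # discard() tolerates a removal for a mailbox we never saw.
            self.mailboxes.discard(event["mail_id"])
```

An Agent would feed every message from the subscription into `apply` and read `mailboxes` whenever it needs the current directory.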

A Few Key Design Decisions

mail_id is not bound to Agent identity. mq9 only knows mail_id, not agent_id. One Agent can apply for different mail_ids for different tasks; leave them alone when done; TTL handles cleanup. Mailboxes are task-level, not identity-level.

Store first, then push. Messages are written to storage on arrival, then pushed to online subscribers. Online subscribers take the real-time path; offline messages wait in storage. Persistence is the default behavior, not an option.
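A minimal in-memory model of the store-then-push order looks like this. `MiniBroker` is a hypothetical teaching aid: it ignores TTL and priority, and it keeps everything in a dictionary rather than on disk.

```python
from collections import defaultdict
from typing import Callable


class MiniBroker:
    def __init__(self) -> None:
        self.store = defaultdict(list)        # mail_id -> stored payloads
        self.subscribers = defaultdict(list)  # mail_id -> online callbacks

    def publish(self, mail_id: str, payload: bytes) -> None:
        self.store[mail_id].append(payload)   # 1. write to storage first
        for callback in self.subscribers[mail_id]:
            callback(payload)                 # 2. then push to online subscribers

    def subscribe(self, mail_id: str, callback: Callable[[bytes], None]) -> None:
        for payload in self.store[mail_id]:   # deliver what waited in storage
            callback(payload)
        self.subscribers[mail_id].append(callback)
```

A message published while nobody is subscribed stays in `store` and is handed over on the next `subscribe` call; a message published afterwards reaches the callback immediately.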

Client-side deduplication. Every message has a unique msg_id; the server does not track consumer state; clients handle deduplication themselves.
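Since every subscription delivers the full set of non-expired messages, a client that resubscribes will see messages it has already handled. One simple way to cope is a seen-set keyed by msg_id. The sketch assumes messages are available as dictionaries with a `msg_id` key; how mq9 actually exposes msg_id to clients is not described in this post.

```python
from typing import Iterable, List


class Deduplicator:
    """Remembers which msg_ids this client has already processed."""

    def __init__(self) -> None:
        self._seen: set = set()

    def is_new(self, msg_id: str) -> bool:
        if msg_id in self._seen:
            return False
        self._seen.add(msg_id)
        return True


def handle_once(dedup: Deduplicator, messages: Iterable[dict]) -> List[dict]:
    """Return only the messages whose msg_id has not been seen before."""
    return [m for m in messages if dedup.is_new(m["msg_id"])]
```

The set grows without bound in this sketch; a real client could drop entries once the mailbox TTL has passed, because expired messages are never redelivered.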

CREATE is idempotent. Creating again silently succeeds; TTL is fixed by the first creation. No state query interface provided.
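The idempotency rule can be modeled as "only the first creation sets the expiry". In the sketch below, `MailboxRegistry` is hypothetical, time is passed in explicitly to keep the example deterministic, and two details are assumptions: that the TTL counts from creation, and that creating a mailbox again after it has expired starts a fresh one.

```python
class MailboxRegistry:
    def __init__(self) -> None:
        self.expires_at: dict = {}  # mail_id -> absolute expiry time in seconds

    def create(self, mail_id: str, ttl_seconds: float, now: float) -> None:
        """Idempotent create: a repeat call succeeds silently and changes nothing."""
        if not self.exists(mail_id, now):
            # First creation (or the old mailbox already expired): set the TTL.
            self.expires_at[mail_id] = now + ttl_seconds

    def exists(self, mail_id: str, now: float) -> bool:
        expiry = self.expires_at.get(mail_id)
        return expiry is not None and now < expiry
```

Calling `create` a second time with a longer TTL leaves the original expiry untouched, so callers can issue CREATE defensively before every send without extending a mailbox's life.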


Eight Real Scenarios

One concept, three operations — what can they solve?

Async task result delivery. Main Agent creates a private mailbox; sub-Agent sends results to the main Agent's mailbox when done. No blocking required; pick it up when ready.

Global status awareness. Workers create public mailboxes and report status periodically. Main Agent discovers them via PUBLIC.LIST. TTL expiry signals death automatically.

Task broadcast and competing consumption. Create a public mailbox as a task queue; Workers use a queue group to compete; the winner sends results back to the main Agent's mailbox. Offline Workers that come back online can still receive non-expired tasks.
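Competing consumption means each task reaches exactly one member of the group. The toy `QueueGroup` below picks a member at random for each task; the class and the random policy are assumptions for illustration, and the post does not say how mq9 selects the winner.

```python
import random
from typing import Callable, List


class QueueGroup:
    """Delivers each task to exactly one of the workers that joined."""

    def __init__(self) -> None:
        self.members: List[Callable[[str], None]] = []

    def join(self, worker: Callable[[str], None]) -> None:
        self.members.append(worker)

    def deliver(self, task: str) -> bool:
        """Hand the task to one member; return False if nobody is online."""
        if not self.members:
            return False  # the task would stay in the mailbox until it expires
        random.choice(self.members)(task)
        return True
```

With two workers in the group and ten tasks delivered, the two workers' task lists never overlap and together contain all ten tasks.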

Anomaly alert broadcast. Create a public mailbox and publish anomaly events; subscribers respond on their own. Offline handlers that come back online can still receive non-expired alerts. Publishers don't need to know who is handling it.

Edge offline message queuing. Send to an edge Agent's mailbox (critical/urgent/normal priority); if the edge is offline, messages are persisted and wait; upon reconnection, they are received by priority — critical before urgent before normal.

Async request-reply. A sends a request to B's mailbox with reply_to and correlation_id; B processes it upon coming online and sends the response to A's mailbox. Nothing lost if offline; no blocking.
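The reply_to and correlation_id pairing can be sketched without any broker at all. The dictionary envelope and the helper names below are hypothetical; the post does not say whether mq9 carries these fields in headers or in the payload.

```python
import uuid
from typing import Optional, Tuple


def make_request(reply_to: str, body: bytes) -> dict:
    """Build a request that tells the receiver where to send the answer."""
    return {"reply_to": reply_to, "correlation_id": uuid.uuid4().hex, "body": body}


def make_reply(request: dict, body: bytes) -> Tuple[str, dict]:
    """Return (destination mail_id, reply message) for a given request."""
    reply = {"correlation_id": request["correlation_id"], "body": body}
    return request["reply_to"], reply


class PendingRequests:
    """Requester-side table that pairs replies with the requests that caused them."""

    def __init__(self) -> None:
        self._pending: dict = {}

    def track(self, request: dict) -> None:
        self._pending[request["correlation_id"]] = request

    def resolve(self, reply: dict) -> Optional[dict]:
        """Return the original request, or None for an unknown or repeated reply."""
        return self._pending.pop(reply["correlation_id"], None)
```

A tracks each request before sending it to B's mailbox; when a reply shows up in A's own mailbox, possibly much later, `resolve` finds the request it answers. Popping the entry also makes a redelivered reply harmless.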

Capability registration and discovery. Agents create public mailboxes to declare capabilities at startup; auto-registered to PUBLIC.LIST. Other Agents subscribe and can sense the capability distribution across the entire network. Decentralized, zero extra services.

Human-Agent hybrid workflows. Agent sends an approval request to a human approver's mailbox; the approver sends the result back to the Agent's mailbox. Humans and Agents use exactly the same protocol; the workflow is uninterrupted.


Not Append-Only, It's a Mailbox

In the previous post I said everything in Kafka flows from the append-only log design decision. mq9 chose a different path: a mailbox model, based on the NATS protocol.

This choice means mq9 is a completely different thing from the Kafka family. Under mailbox semantics, every assumption of the append-only log becomes unnecessary: messages don't need ordered replay — they need priority-based retrieval; no offset management needed — TTL auto-cleanup instead; no partitions needed — dynamic mail_ids instead; not every message needs three-replica persistence — choose storage level by scenario.

mq9's position in the NATS ecosystem is the middle ground between Core pub/sub and JetStream. Core NATS is too lightweight — no persistence, offline means lost. JetStream is too heavy — streams, consumers, offsets, replay, a full Kafka-equivalent semantic stack. mq9 adds persistence, priority, and TTL auto-management on top of pub/sub, without introducing streams, consumer groups, or offsets. Just enough, no more.

To be clear: mq9 is not NATS — it is an independent broker that is compatible with the NATS protocol. All NATS SDKs in all languages work out of the box, with zero ecosystem cost. The $mq9.AI.* namespace design makes the protocol itself documentation — $mq9.AI.MAILBOX.MSG.mail-d7a5072lko83gp7amga0-d7a5072lko83gp7amgag.critical, one string, and the semantics are self-evident.


Atomic Underneath, Composable Above

In the previous post I spent considerable space on one judgment: the underlying layer must be atomic; the upper layer composes and wraps by scenario. mq9 is that judgment in practice.

RobustMQ's storage layer provides three atomic capabilities:

Memory — pure in-memory, lightest, no persistence. Coordination signals between Agents; gone once sent; retransmit if lost.

RocksDB — temporary persistence. Messages written to disk, waiting for the other party to come online and fetch them; TTL auto-cleanup on expiry. The default choice for mailboxes.

File Segment — long-term persistence. Audit logs, critical events — retained permanently.

A coordination signal takes the memory path; a task instruction takes RocksDB; an audit record takes File Segment. Same system, same API, different costs. Every message chooses on demand — not every message deserves three-replica persistence, and not every message can be dropped.
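The per-message choice can be pictured as a small decision function. The enum values and the two yes/no questions below are an illustrative reading of the three tiers, not mq9's actual configuration surface, which this post does not describe.

```python
from enum import Enum


class StorageTier(Enum):
    MEMORY = "memory"              # no persistence
    ROCKSDB = "rocksdb"            # temporary persistence with TTL cleanup
    FILE_SEGMENT = "file_segment"  # long-term persistence


def choose_tier(must_survive_offline: bool, keep_permanently: bool) -> StorageTier:
    """Pick the cheapest tier that still meets the message's durability need."""
    if keep_permanently:
        return StorageTier.FILE_SEGMENT
    if must_survive_offline:
        return StorageTier.ROCKSDB
    return StorageTier.MEMORY
```

Under this reading, a coordination signal answers "no" to both questions, a task instruction needs only to survive the receiver being offline, and an audit record must be kept permanently.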


Summary

mq9 looks like a message queue, like pub/sub, like JetStream; it resembles something from each. But it is its own thing: a mailbox, three operations, purpose-built for Agent scenarios, and, as far as we know, no one has built this before.

Email is not a database, even though email systems use databases to store mail. In the same way, mq9 is not a message queue, even though it uses message queue capabilities to implement mailboxes.

As long as you need to run many Agents, you need mq9. One command to start, three operations, all patterns covered.

The questions have been asked. This is our answer. Still on the road, taking it one step at a time.
