mq9 vs Existing Solutions: Eight Scenarios Side by Side
There is no standard answer today for communication between Agents. Everyone is making do with whatever is at hand — HTTP, Redis, Kafka, custom queues. They work, but every scenario requires bolting something on.
This post takes the eight core scenarios mq9 is designed to solve and compares them one by one: how existing solutions handle them, where the pain points are, how mq9 handles them, and how much better it is. Straight facts, no spin.
Scenario 1: Sub-Agent Completes a Task, Asynchronously Notifies the Primary Agent
HTTP: The sub-Agent calls back the primary Agent's webhook. The primary Agent must be online and expose an endpoint. A restart or network blip drops the callback — you need to implement retries, idempotency, and timeout handling yourself. Every project writes this from scratch.
Redis: Sub-Agent publishes to a channel; primary Agent subscribes. If the primary Agent is offline at that exact moment, the message is gone. To get reliable delivery you need to add Redis Streams or a List yourself — essentially hand-rolling a message queue on top of Redis.
Kafka: Create a topic; sub-Agent produces; primary Agent consumes. It works, but you need to pre-create the topic, configure partitions, configure retention, and manage consumer groups and offsets. A simple task callback has pulled in an entire Kafka operations stack.
mq9: The primary Agent CREATEs a mailbox and gets a mail_id; the sub-Agent PUBs to $mq9.AI.INBOX.{mail_id}.normal. If the primary Agent is online, delivery is immediate; if offline, the message is stored until it comes back. Worried about missing something? QUERY as a fallback.
Scenario 2: Primary Agent Monitors All Sub-Agent Status
HTTP polling: Primary Agent periodically requests each sub-Agent's /health endpoint. Requires maintaining an Agent list — register on add, clean up on death. n Agents means n requests, O(n) overhead. Long intervals mean slow detection; short intervals waste resources.
Redis: Each Agent writes a key with TTL; the primary Agent uses SCAN to iterate. That's O(n) — poor performance at scale. Keyspace notifications exist, but they ride on fire-and-forget pub/sub; for reliable change detection you can only poll.
Kafka: Each Agent writes its status to a compacted topic. Works, but cleanup isn't real-time — there's a lag. Every write is a persistent log entry; paying the full persistence cost just for "latest status."
mq9: Worker requests a type=latest status mailbox and periodically PUBs its status to its own INBOX — only the most recent message is kept. Primary Agent subscribes to get global awareness. New Agents joining are detected automatically; exceeding the TTL triggers automatic death detection. Zero maintenance.
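The TTL death-detection rule above boils down to a timestamp comparison. Here is a minimal sketch of that logic in plain Java; the names are hypothetical, and mq9 applies this check server-side on latest mailboxes rather than in client code.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch: an agent whose last heartbeat is older than the TTL
// is considered dead. Names are made up for this example.
class TtlLiveness {
    static List<String> deadAgents(Map<String, Long> lastHeartbeat, long nowMillis, long ttlMillis) {
        return lastHeartbeat.entrySet().stream()
            .filter(e -> nowMillis - e.getValue() > ttlMillis) // heartbeat expired
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }
}
```

The point of the sketch: death detection needs no list maintenance, only the latest heartbeat per agent plus a TTL.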
Scenario 3: Task Broadcast, Multiple Workers Competing to Consume
HTTP: Primary Agent implements load balancing itself, maintains a Worker list, picks one and sends a request. If a Worker goes down, retry with another. Write your own scheduler, handle your own failover.
Redis: Primary Agent LPUSH to a List; Workers BRPOP to compete. The most common approach, but there's no message acknowledgment — if a Worker takes something via BRPOP and then crashes, the task is lost. You need to build a reliable queue yourself. No priority, no broadcast capability.
Kafka: Publish to a topic; multiple consumers in the same group compete. Works, but partition count determines maximum parallelism; scaling Workers dynamically triggers rebalancing; rebalancing causes consumption pauses.
mq9: PUB $mq9.AI.BROADCAST.task.available; Workers use queue group to compete — only one claims each task. If a Worker crashes, the message is automatically redelivered. Adding or removing Workers dynamically requires zero configuration, no rebalancing. Results go back to the primary Agent's INBOX.
Scenario 4: Agent Detects an Anomaly, Broadcasts an Alert
HTTP: The Agent that detected the anomaly needs to know who has handling capability, maintain a subscriber list, and POST to each. Maintaining that list is itself a distributed problem.
Redis: PUBLISH to a channel; all subscribers receive it. Works, but offline subscribers don't get it — no persistence. If the handler happens to restart at the moment the alert fires, the alert is lost.
Kafka: Publish to a topic; each handler uses an independent consumer group. Can achieve one-to-many delivery, but each handler needs a consumer group configured and the topic must be pre-created.
mq9: PUB $mq9.AI.BROADCAST.{domain}.anomaly. All Agents subscribed to $mq9.AI.BROADCAST.*.anomaly receive it. The publisher doesn't need to know who is listening. For critical alerts, add persist=true — offline Agents receive it when they come back online.
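The wildcard subscription above relies on NATS subject-matching rules: "*" matches exactly one dot-separated token, and ">" matches one or more trailing tokens. A small sketch of that matching rule (illustrative only; this is neither mq9 nor NATS source):

```java
// Illustrative sketch of NATS-style subject matching.
class SubjectMatch {
    static boolean matches(String pattern, String subject) {
        String[] p = pattern.split("\\.");
        String[] s = subject.split("\\.");
        for (int i = 0; i < p.length; i++) {
            if (p[i].equals(">")) return s.length > i;      // ">" swallows the remaining tokens
            if (i >= s.length) return false;                 // subject ran out of tokens
            if (!p[i].equals("*") && !p[i].equals(s[i])) return false; // literal token mismatch
        }
        return p.length == s.length;                         // no leftover subject tokens
    }
}
```

This is why $mq9.AI.BROADCAST.*.anomaly catches payment.anomaly but not payment.alert.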
Scenario 5: Cloud Sends Instructions to an Offline Edge Agent
HTTP: Direct failure. The edge is offline; the request times out. You need to build a retry queue, exponential backoff, and persistence for unsent instructions — essentially hand-rolling a message queue on the cloud side.
Redis: Write to a List and wait for the edge to come online and consume. But there's no priority — urgent and ordinary instructions queue together. Adding priority means using a Sorted Set, increasing complexity. Redis is an in-memory database; large backlogs are expensive.
Kafka: Write to a topic; edge consumes after coming online. Works, but no native priority — within a partition it's strict FIFO. Urgent instructions need a separate topic; the consumer has to subscribe to multiple topics and do its own priority scheduling.
mq9: Send to INBOX.{edge_mail_id}.urgent and INBOX.{edge_mail_id}.normal with separate priorities. After the edge comes back online, urgent is processed first, normal second. QUERY fallback ensures nothing is missed. RocksDB persistence, far cheaper than Redis in-memory storage.
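The priority drain on reconnect can be sketched as ordinary list logic: backlogged messages are emitted urgent first, then normal, then notify. This is a hypothetical helper to illustrate the ordering, not mq9's actual delivery code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the three-level priority drain.
class PriorityDrain {
    static final List<String> ORDER = List.of("urgent", "normal", "notify");

    // backlog maps priority -> queued messages; output preserves priority order
    static List<String> drain(Map<String, List<String>> backlog) {
        List<String> out = new ArrayList<>();
        for (String p : ORDER) out.addAll(backlog.getOrDefault(p, List.of()));
        return out;
    }
}
```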
Scenario 6: Human-Machine Hybrid Workflow
HTTP + frontend: Agent calls an approval service API to create an approval request; the frontend polls or uses WebSocket to push to the approver; after the approver acts, a callback goes back to the Agent. At least three systems involved: Agent → approval service → frontend → approval service → Agent.
Redis: Agent writes the approval request to a key; the approver's client polls; after approving, writes the result to another key; the Agent polls for the result. Both ends polling, high latency, logic scattered.
Kafka: Approval request to one topic; approval result to another. Works, but the approver needs a Kafka consumer client — unfriendly for non-technical users.
mq9: Agent PUBs to INBOX.{approver_mail_id}.urgent; the approver subscribes to their own INBOX and receives the request; after approving, PUBs back to the Agent's INBOX. Humans and Agents use exactly the same protocol — no separate approval service required. correlation_id ensures matching.
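The correlation_id matching mentioned above is simply a pending-request table keyed by correlation_id. A minimal sketch (a hypothetical helper, not part of mq9):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: pair an approval reply with the request that produced it.
class CorrelationTable {
    private final Map<String, String> pending = new HashMap<>();

    void sent(String correlationId, String request) {
        pending.put(correlationId, request);
    }

    // Returns the original request for this reply, or null if unknown/already matched
    String matchReply(String correlationId) {
        return pending.remove(correlationId);
    }
}
```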
Scenario 7: Agent A Asks Agent B a Question; B May Be Offline
HTTP: If B is offline, direct failure. For async, you need to introduce callback URLs, correlation_ids, timeout/retry, and state machines — complexity explodes.
Redis: A writes the request to B's List; B writes the result to A's List after processing. Two Lists, both sides polling, manage correlation_id matching yourself. Workable, but not small in code.
Kafka: A sends a request to a request topic with a reply_to header; B consumes, processes, and sends to a reply topic. Two topics, two consumer groups; A also has to filter its own responses from the reply topic by correlation_id.
mq9: A PUBs to INBOX.{B_mail_id}.normal with reply_to and correlation_id in the payload. B comes online, processes, and PUBs back to INBOX.{A_mail_id}.normal. NATS's native reply-to is supported out of the box. If B is offline, the request waits in the mailbox — nothing is lost.
Scenario 8: Agent Registers Capabilities; Other Agents Discover It
HTTP + registry: Requires a centralized service registry (Consul, Eureka, custom-built). Agents register on start, deregister on death or rely on health checks to get removed. Introduces an additional stateful service that itself needs high-availability deployment.
Redis: Agent writes capabilities to a Hash or Set with TTL; queriers use SCAN or SMEMBERS. Works, but there's no mechanism to subscribe to capability changes — only polling. Performance degrades with SCAN as Agent count grows.
Kafka: Capability declarations written to a compacted topic; queriers consume the entire topic to build a local view. High latency, complex implementation — pulling in a full Kafka consumption pipeline for simple service discovery.
mq9: Agent requests a type=latest mailbox; capability declaration is PUBbed to its own INBOX. Other Agents subscribe to detect, or actively query via BROADCAST.capability.query — Agents with the capability respond to the discoverer's INBOX. Decentralized, no registry needed; Agent death is cleaned up by TTL automatically.
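Answering a BROADCAST.capability.query comes down to checking the needed capability against each Agent's declared list. A sketch of that filter, with illustrative names:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch: which agents declare the needed capability?
class CapabilityQuery {
    static List<String> whoCan(Map<String, List<String>> declared, String need) {
        return declared.entrySet().stream()
            .filter(e -> e.getValue().contains(need)) // agent declares the capability
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }
}
```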
Summary
| Scenario | HTTP | Redis | Kafka | mq9 | mq9 Advantage |
|---|---|---|---|---|---|
| Sub-Agent async task return | Requires other party online; build your own retry | No persistence; hand-roll reliable queue | Works; high ops cost | CREATE + INBOX + QUERY fallback | Medium |
| Primary Agent monitors all sub-Agent status | O(n) polling | SCAN polling | Compacted topic; high latency | latest mailbox, real-time subscription, TTL auto-expiry | High |
| Task broadcast, competitive consumption | Build your own scheduler | BRPOP unreliable | Rebalancing causes pauses | BROADCAST + queue group, zero config | Low |
| Agent detects anomaly, broadcasts alert | Maintain subscriber list | No persistence | Requires multiple consumer groups | BROADCAST + optional persistence | Medium |
| Cloud sends instructions to offline edge Agent | Direct failure | No priority; high memory cost | No native priority | INBOX three-level priority + QUERY fallback | High |
| Agent sends approval request; human approves and flow continues | At least three systems | Both ends polling | Approver needs Kafka client | Humans and Agents on the same protocol; INBOX bidirectional | High |
| Agent A asks Agent B; B may be offline | Fails immediately if offline | Two Lists + both ends polling | Two topics + filter responses | INBOX + native reply-to; offline safe | Medium |
| Agent registers capabilities; others discover and call | Requires registry | SCAN polling | Compacted topic; complex | latest mailbox + BROADCAST query; zero extra service | High |
Eight scenarios: 4 high-advantage, 3 medium-advantage, 1 low-advantage. mq9's core value isn't crushing on any single dimension — it's that no scenario requires a workaround. Four commands cover it all directly, zero extra logic.
It should be noted that mq9 is in an early stage; its ecosystem maturity is nowhere near any of the solutions above. If your scenario is big data pipelines and high-throughput stream processing, Kafka is still the best choice. The comparisons above are specific to the Agent async communication use case.
Mailbox Behavior Mapping
Real-world mailbox behavior mapped to mq9's four commands:
| Real-world mailbox | mq9 | Notes |
|---|---|---|
| Open a mailbox | MAILBOX.CREATE | Get a mail_id; choose standard (accumulating) or latest (keep only the most recent) |
| Mail a letter | INBOX.{mail_id}.{priority} | Doesn't matter if the other party is home; priority: urgent / normal / notify |
| Collect mail | SUB $mq9.AI.INBOX.{mail_id}.* | Server pushes when online; urgent first |
| Go to the post office to check | MAILBOX.QUERY.{mail_id} | Fallback pull; make sure nothing was missed |
| Reply | NATS native reply-to | The letter carries the sender's return address |
| Mailbox expires | TTL auto-destruction | Nobody uses it for a long time; the post office auto-cancels it |
| Community broadcast | BROADCAST.{domain}.{event} | Bulletin board; post it and everyone can see it |
| Only read certain announcements | Wildcard subscription | $mq9.AI.BROADCAST.*.anomaly |
| Claim a job | queue group | A task posted on the board; only one person can take it |
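The subject layout used throughout the table can be captured in a tiny helper. This is hypothetical convenience code following the post's naming convention, not part of any SDK:

```java
// Illustrative helper for building mq9 subjects per the post's convention.
class Mq9Subjects {
    static String inbox(String mailId, String priority) {
        return "$mq9.AI.INBOX." + mailId + "." + priority;
    }

    static String broadcast(String domain, String event) {
        return "$mq9.AI.BROADCAST." + domain + "." + event;
    }
}
```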
Java Code Examples
Eight scenarios, eight functions, all using the NATS native Java SDK, zero custom wrappers.
// Dependency: io.nats:jnats:2.20.5
import io.nats.client.*;
import java.time.Duration;
public class Mq9Scenarios {
// ============================================================
// Scenario 1: Sub-Agent completes a task, asynchronously notifies primary Agent
// ============================================================
static void scenario1_AsyncTaskReturn(Connection nc) throws Exception {
// Primary Agent: subscribe to its own mailbox
Dispatcher mainAgent = nc.createDispatcher((msg) -> {
System.out.println("[Primary Agent] Received task result: " + new String(msg.getData()));
});
mainAgent.subscribe("$mq9.AI.INBOX.main-agent.*");
Thread.sleep(200); // let the subscription register before the sub-Agent publishes
// Sub-Agent: task complete, send result to primary Agent mailbox
nc.publish("$mq9.AI.INBOX.main-agent.normal",
"{\"from\":\"worker-001\",\"task_id\":\"t-001\",\"result\":\"Analysis complete, anomaly rate 2.3%\"}".getBytes());
}
// ============================================================
// Scenario 2: Primary Agent monitors all sub-Agent status
// ============================================================
static void scenario2_GlobalStatusAwareness(Connection nc) throws Exception {
// Primary Agent: watch all Worker status mailboxes.
// NATS "*" matches a whole dot-separated token, so a "status-*" prefix inside
// one token cannot be expressed as a wildcard; subscribe to all inboxes and
// apply the status- prefix convention in code.
Dispatcher mainAgent = nc.createDispatcher((msg) -> {
if (!msg.getSubject().startsWith("$mq9.AI.INBOX.status-")) return;
System.out.println("[Status Awareness] " + msg.getSubject() + " -> " + new String(msg.getData()));
});
mainAgent.subscribe("$mq9.AI.INBOX.*.*");
Thread.sleep(200); // let the subscription register before the Workers publish
// Workers each report their status to their own latest mailbox
nc.publish("$mq9.AI.INBOX.status-worker-001.normal",
"{\"status\":\"running\",\"load\":0.3,\"capabilities\":[\"data.analysis\"]}".getBytes());
nc.publish("$mq9.AI.INBOX.status-worker-002.normal",
"{\"status\":\"idle\",\"load\":0.0,\"capabilities\":[\"text.generation\"]}".getBytes());
}
// ============================================================
// Scenario 3: Task broadcast, multiple Workers competing to consume
// ============================================================
static void scenario3_TaskBroadcastCompete(Connection nc) throws Exception {
// Three Workers join the same queue group and compete to consume
for (int i = 1; i <= 3; i++) {
final int workerId = i;
Dispatcher worker = nc.createDispatcher((msg) -> {
System.out.println("[Worker-" + workerId + "] Claimed task: " + new String(msg.getData()));
});
worker.subscribe("$mq9.AI.BROADCAST.task.available", "task.workers");
}
Thread.sleep(200);
// Primary Agent broadcasts three tasks
for (int i = 1; i <= 3; i++) {
nc.publish("$mq9.AI.BROADCAST.task.available",
("{\"task_id\":\"t-00" + i + "\",\"type\":\"data_analysis\"}").getBytes());
}
}
// ============================================================
// Scenario 4: Agent detects anomaly, broadcasts alert
// ============================================================
static void scenario4_AnomalyBroadcast(Connection nc) throws Exception {
// Alert handler A: subscribe to anomaly events across all domains
Dispatcher alertHandler1 = nc.createDispatcher((msg) -> {
System.out.println("[Alert Handler A] Received alert: " + new String(msg.getData()));
});
alertHandler1.subscribe("$mq9.AI.BROADCAST.*.anomaly");
// Another handler that only cares about the payment domain
Dispatcher alertHandler2 = nc.createDispatcher((msg) -> {
System.out.println("[Alert Handler B - Payment] Received alert: " + new String(msg.getData()));
});
alertHandler2.subscribe("$mq9.AI.BROADCAST.payment.anomaly");
Thread.sleep(200);
// An Agent detects an anomaly and broadcasts it
nc.publish("$mq9.AI.BROADCAST.payment.anomaly",
"{\"from\":\"monitor-agent\",\"level\":\"critical\",\"detail\":\"Payment timeout rate spiked to 15%\"}".getBytes());
}
// ============================================================
// Scenario 5: Cloud sends instructions to offline edge Agent
// ============================================================
static void scenario5_EdgeOfflineMessage(Connection nc) throws Exception {
// Cloud sends instructions to edge Agent mailbox (edge may be offline)
nc.publish("$mq9.AI.INBOX.edge-device-001.urgent",
"{\"from\":\"cloud\",\"command\":\"emergency_stop\",\"reason\":\"Temperature too high\"}".getBytes());
nc.publish("$mq9.AI.INBOX.edge-device-001.normal",
"{\"from\":\"cloud\",\"command\":\"update_config\",\"config\":{\"interval\":30}}".getBytes());
nc.publish("$mq9.AI.INBOX.edge-device-001.notify",
"{\"from\":\"cloud\",\"info\":\"Firmware v2.1 released\"}".getBytes());
System.out.println("[Cloud] Instructions sent, waiting for edge device to come online...");
// Edge Agent comes online, subscribes to its mailbox, retrieves messages by priority
Thread.sleep(1000);
Dispatcher edgeAgent = nc.createDispatcher((msg) -> {
String priority = msg.getSubject().substring(msg.getSubject().lastIndexOf('.') + 1);
System.out.println("[Edge Device] Received [" + priority + "]: " + new String(msg.getData()));
});
edgeAgent.subscribe("$mq9.AI.INBOX.edge-device-001.*");
// Fallback: QUERY to pull any potentially missed messages
Message queryReply = nc.request("$mq9.AI.MAILBOX.QUERY.edge-device-001",
"{\"token\":\"tok-edge-001\"}".getBytes(), Duration.ofSeconds(3));
if (queryReply != null) {
System.out.println("[Edge Device] QUERY fallback: " + new String(queryReply.getData()));
}
}
// ============================================================
// Scenario 6: Human-machine hybrid workflow
// ============================================================
static void scenario6_HumanApproval(Connection nc) throws Exception {
// Human approver: subscribe to their own mailbox
Dispatcher approver = nc.createDispatcher((msg) -> {
System.out.println("[Approver] Received approval request: " + new String(msg.getData()));
// Approve and send result back to Agent mailbox
nc.publish("$mq9.AI.INBOX.agent-001.normal",
"{\"from\":\"approver-zhang\",\"approved\":true,\"comment\":\"Approved, watch risk controls\",\"correlation_id\":\"approval-001\"}".getBytes());
System.out.println("[Approver] Approved");
});
approver.subscribe("$mq9.AI.INBOX.approver-zhang.*");
// Agent: subscribe to its own mailbox waiting for the approval result
Dispatcher agent = nc.createDispatcher((msg) -> {
System.out.println("[Agent] Received approval result: " + new String(msg.getData()));
System.out.println("[Agent] Approved, continuing execution...");
});
agent.subscribe("$mq9.AI.INBOX.agent-001.*");
Thread.sleep(200);
// Agent initiates the approval request
nc.publish("$mq9.AI.INBOX.approver-zhang.urgent",
"{\"from\":\"agent-001\",\"type\":\"approval_request\",\"correlation_id\":\"approval-001\",\"content\":\"Requesting external API call, estimated cost $50\"}".getBytes());
}
// ============================================================
// Scenario 7: Agent A asks Agent B; B may be offline
// ============================================================
static void scenario7_AsyncRequestReply(Connection nc) throws Exception {
// Agent B: subscribe to its own mailbox, process request and reply
Dispatcher agentB = nc.createDispatcher((msg) -> {
System.out.println("[Agent-B] Received question: " + new String(msg.getData()));
if (msg.getReplyTo() != null) {
nc.publish(msg.getReplyTo(),
"{\"from\":\"agent-B\",\"answer\":\"Current progress 72%, estimated 3 more minutes\",\"correlation_id\":\"req-001\"}".getBytes());
System.out.println("[Agent-B] Replied");
}
});
agentB.subscribe("$mq9.AI.INBOX.agent-B.*");
Thread.sleep(200);
// Agent A: send request and wait for reply
System.out.println("[Agent-A] Asking Agent-B...");
Message reply = nc.request("$mq9.AI.INBOX.agent-B.normal",
"{\"from\":\"agent-A\",\"type\":\"request\",\"question\":\"What is the current progress?\",\"correlation_id\":\"req-001\"}".getBytes(),
Duration.ofSeconds(3));
if (reply != null) {
System.out.println("[Agent-A] Received reply: " + new String(reply.getData()));
} else {
System.out.println("[Agent-A] Timed out waiting for reply; B may be offline, message is waiting in the mailbox");
}
}
// ============================================================
// Scenario 8: Agent registers capabilities; other Agents discover it
// ============================================================
static void scenario8_CapabilityDiscovery(Connection nc) throws Exception {
// Orchestrator: watch Workers' latest status mailboxes and its own inbox.
// NATS "*" matches a whole token, so the status- prefix is filtered in code.
Dispatcher orchestrator = nc.createDispatcher((msg) -> {
String subj = msg.getSubject();
if (subj.startsWith("$mq9.AI.INBOX.status-")) {
System.out.println("[Orchestrator] Discovered Agent capability: " + new String(msg.getData()));
} else if (subj.startsWith("$mq9.AI.INBOX.orchestrator.")) {
System.out.println("[Orchestrator] Capability query response: " + new String(msg.getData()));
}
});
orchestrator.subscribe("$mq9.AI.INBOX.*.*");
// Can also actively query via broadcast
Dispatcher queryResponder = nc.createDispatcher((msg) -> {
System.out.println("[Worker-001] Responding to capability query");
nc.publish("$mq9.AI.INBOX.orchestrator.normal",
"{\"from\":\"worker-001\",\"capabilities\":[\"data.analysis\"],\"load\":0.3}".getBytes());
});
queryResponder.subscribe("$mq9.AI.BROADCAST.capability.query");
Thread.sleep(200);
// Workers report capabilities to their own latest mailboxes
nc.publish("$mq9.AI.INBOX.status-worker-001.normal",
"{\"status\":\"idle\",\"capabilities\":[\"data.analysis\",\"report.generation\"]}".getBytes());
nc.publish("$mq9.AI.INBOX.status-worker-002.normal",
"{\"status\":\"running\",\"capabilities\":[\"text.generation\"]}".getBytes());
Thread.sleep(200);
// Orchestrator broadcasts query: who has data.analysis capability?
System.out.println("[Orchestrator] Broadcasting query: who has data.analysis capability?");
nc.publish("$mq9.AI.BROADCAST.capability.query",
"{\"from\":\"orchestrator\",\"need\":\"data.analysis\"}".getBytes());
}
// ============================================================
// main
// ============================================================
public static void main(String[] args) throws Exception {
Connection nc = Nats.connect("nats://localhost:4222");
System.out.println("=== Scenario 1: Async Task Return ===");
scenario1_AsyncTaskReturn(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 2: Global Status Awareness ===");
scenario2_GlobalStatusAwareness(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 3: Task Broadcast and Competitive Consumption ===");
scenario3_TaskBroadcastCompete(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 4: Anomaly Alert Broadcast ===");
scenario4_AnomalyBroadcast(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 5: Edge Offline Message Backlog ===");
scenario5_EdgeOfflineMessage(nc);
Thread.sleep(2000);
System.out.println("\n=== Scenario 6: Human-Machine Hybrid Workflow ===");
scenario6_HumanApproval(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 7: Async Request-Response ===");
scenario7_AsyncRequestReply(nc);
Thread.sleep(500);
System.out.println("\n=== Scenario 8: Capability Registration and Discovery ===");
scenario8_CapabilityDiscovery(nc);
Thread.sleep(1000);
nc.close();
}
}