
mq9 vs Existing Solutions: Eight Scenarios Side by Side

There is no standard answer today for communication between Agents. Everyone is making do with whatever is at hand — HTTP, Redis, Kafka, custom queues. They work, but every scenario requires bolting something on.

This post takes the eight core scenarios mq9 is designed to solve and compares them one by one: how existing solutions handle them, where the pain points are, how mq9 handles them, and how much better it is. Straight facts, no spin.


Scenario 1: Sub-Agent Completes a Task, Asynchronously Notifies the Primary Agent

HTTP: The sub-Agent calls back the primary Agent's webhook. The primary Agent must be online and expose an endpoint. A restart or network blip drops the callback — you need to implement retries, idempotency, and timeout handling yourself. Every project writes this from scratch.

Redis: Sub-Agent publishes to a channel; primary Agent subscribes. If the primary Agent is offline at that exact moment, the message is gone. To get reliable delivery you need to add Redis Streams or a List yourself — essentially hand-rolling a message queue on top of Redis.

Kafka: Create a topic; sub-Agent produces; primary Agent consumes. It works, but you need to pre-create the topic, configure partitions, configure retention, and manage consumer groups and offsets. A simple task callback has pulled in an entire Kafka operations stack.

mq9: Primary Agent CREATEs a mailbox and gets a mail_id; sub-Agent PUBs to $mq9.AI.INBOX.{mail_id}.normal. If the primary Agent is online, it receives the message immediately; if offline, the message is stored until it reconnects. Worried about missing something? QUERY as a fallback.
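The store-and-forward semantics above can be sketched in plain Java. This `Mailbox` class is a hypothetical in-memory illustration of the behavior, not mq9's implementation or API:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Illustrative sketch of store-and-forward delivery: messages published
// while no subscriber is attached are buffered and drained on subscribe.
// Class and method names are hypothetical, not part of mq9's API.
class Mailbox {
    private final Queue<String> stored = new ArrayDeque<>();
    private Consumer<String> subscriber; // null means the Agent is offline

    void publish(String msg) {
        if (subscriber != null) {
            subscriber.accept(msg);   // online: deliver immediately
        } else {
            stored.add(msg);          // offline: store and wait
        }
    }

    void subscribe(Consumer<String> handler) {
        subscriber = handler;
        while (!stored.isEmpty()) {   // drain the backlog on (re)connect
            handler.accept(stored.poll());
        }
    }

    int pending() { return stored.size(); }
}
```

The key property is that publish never cares whether the receiver is present, which is exactly what the HTTP callback approach lacks.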


Scenario 2: Primary Agent Monitors All Sub-Agent Status

HTTP polling: Primary Agent periodically requests each sub-Agent's /health endpoint. Requires maintaining an Agent list — register on add, clean up on death. n Agents means n requests, O(n) overhead. Long intervals mean slow detection; short intervals waste resources.

Redis: Each Agent writes a key with TTL; primary Agent uses SCAN to iterate, an O(n) operation that performs poorly at scale. Keyspace notifications exist, but they ride on fire-and-forget Pub/Sub, so reliable detection still comes down to polling.

Kafka: Each Agent writes its status to a compacted topic. Works, but cleanup isn't real-time — there's a lag. Every write is a persistent log entry; paying the full persistence cost just for "latest status."

mq9: Worker requests a type=latest status mailbox and periodically PUBs its status to its own INBOX — only the most recent message is kept. Primary Agent subscribes to get global awareness. New Agents joining are detected automatically; exceeding the TTL triggers automatic death detection. Zero maintenance.
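The TTL-based death detection described above boils down to heartbeat timestamps. A minimal sketch, with a hypothetical `LivenessTracker` class (mq9 does this server-side with latest mailboxes plus TTL):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of TTL-based liveness: each heartbeat stores a timestamp; an Agent
// whose last heartbeat is older than the TTL is considered dead.
// Illustrative only -- not mq9's implementation.
class LivenessTracker {
    private final long ttlMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    LivenessTracker(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void heartbeat(String agentId, long nowMillis) {
        lastSeen.put(agentId, nowMillis); // only the latest status is kept
    }

    boolean isAlive(String agentId, long nowMillis) {
        Long seen = lastSeen.get(agentId);
        return seen != null && nowMillis - seen <= ttlMillis;
    }
}
```

Note that an unknown Agent is simply "not alive", which is why new Agents joining and dead Agents leaving need no list maintenance.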


Scenario 3: Task Broadcast, Multiple Workers Competing to Consume

HTTP: Primary Agent implements load balancing itself, maintains a Worker list, picks one and sends a request. If a Worker goes down, retry with another. Write your own scheduler, handle your own failover.

Redis: Primary Agent LPUSH to a List; Workers BRPOP to compete. The most common approach, but there's no message acknowledgment — if a Worker takes something via BRPOP and then crashes, the task is lost. You need to build a reliable queue yourself. No priority, no broadcast capability.

Kafka: Publish to a topic; multiple consumers in the same group compete. Works, but partition count determines maximum parallelism; scaling Workers dynamically triggers rebalancing; rebalancing causes consumption pauses.

mq9: PUB $mq9.AI.BROADCAST.task.available; Workers use queue group to compete — only one claims each task. If a Worker crashes, the message is automatically redelivered. Adding or removing Workers dynamically requires zero configuration, no rebalancing. Results go back to the primary Agent's INBOX.
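The crash-redelivery guarantee above is essentially pending/ack bookkeeping. A sketch with a hypothetical `TaskLedger` class (mq9 and NATS handle this inside the server; this only illustrates the rule):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of at-least-once redelivery: a claimed task stays "pending" until
// acked; a task whose claim deadline has passed is eligible for redelivery.
// Illustrative only -- not mq9's implementation.
class TaskLedger {
    private final long claimTimeoutMillis;
    private final Map<String, Long> pending = new HashMap<>(); // taskId -> claimedAt

    TaskLedger(long claimTimeoutMillis) { this.claimTimeoutMillis = claimTimeoutMillis; }

    void claim(String taskId, long nowMillis) { pending.put(taskId, nowMillis); }

    void ack(String taskId) { pending.remove(taskId); }

    // A task is redeliverable if its worker never acked within the timeout
    boolean redeliverable(String taskId, long nowMillis) {
        Long claimedAt = pending.get(taskId);
        return claimedAt != null && nowMillis - claimedAt > claimTimeoutMillis;
    }
}
```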


Scenario 4: Agent Detects an Anomaly, Broadcasts an Alert

HTTP: The Agent that detected the anomaly needs to know who has handling capability, maintain a subscriber list, and POST to each. Maintaining that list is itself a distributed problem.

Redis: PUBLISH to a channel; all subscribers receive it. Works, but offline subscribers don't get it — no persistence. If the handler happens to restart at the moment the alert fires, the alert is lost.

Kafka: Publish to a topic; each handler uses an independent consumer group. Can achieve one-to-many delivery, but each handler needs a consumer group configured and the topic must be pre-created.

mq9: PUB $mq9.AI.BROADCAST.{domain}.anomaly. All Agents subscribed to $mq9.AI.BROADCAST.*.anomaly receive it. The publisher doesn't need to know who is listening. For critical alerts, add persist=true — offline Agents receive it when they come back online.
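The wildcard subscription above relies on NATS-style subject matching: subjects are dot-separated tokens, and `*` matches exactly one whole token. A sketch of that rule (the `>` multi-token wildcard is omitted, and the helper class is illustrative; real matching happens inside the NATS server):

```java
// Sketch of NATS-style subject matching: '*' in a subscription pattern
// matches exactly one whole dot-separated token, never a partial token.
// Illustrative helper, not part of any SDK.
class SubjectMatcher {
    static boolean matches(String pattern, String subject) {
        String[] p = pattern.split("\\.");
        String[] s = subject.split("\\.");
        if (p.length != s.length) return false;
        for (int i = 0; i < p.length; i++) {
            if (!p[i].equals("*") && !p[i].equals(s[i])) return false;
        }
        return true;
    }
}
```

This is why $mq9.AI.BROADCAST.*.anomaly catches anomalies from every domain while ignoring other event types.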


Scenario 5: Cloud Sends Instructions to an Offline Edge Agent

HTTP: Direct failure. The edge is offline; the request times out. You need to build a retry queue, exponential backoff, and persistence for unsent instructions — essentially hand-rolling a message queue on the cloud side.

Redis: Write to a List and wait for the edge to come online and consume. But there's no priority — urgent and ordinary instructions queue together. Adding priority means using a Sorted Set, increasing complexity. Redis is an in-memory database; large backlogs are expensive.

Kafka: Write to a topic; edge consumes after coming online. Works, but no native priority — within a partition it's strict FIFO. Urgent instructions need a separate topic; the consumer has to subscribe to multiple topics and do its own priority scheduling.

mq9: Send to INBOX.{edge_mail_id}.urgent and INBOX.{edge_mail_id}.normal with separate priorities. After the edge comes back online, urgent is processed first, normal second. QUERY fallback ensures nothing is missed. RocksDB persistence, far cheaper than Redis in-memory storage.
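The urgent-before-normal processing order can be sketched by ranking the priority suffix of each INBOX subject. The `PriorityDrain` class is illustrative; mq9 orders delivery server-side:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of three-level priority draining: urgent before normal before
// notify, keyed off the INBOX subject suffix. Illustrative only.
class PriorityDrain {
    private static final List<String> ORDER = List.of("urgent", "normal", "notify");

    static int rank(String priority) {
        int r = ORDER.indexOf(priority);
        if (r < 0) throw new IllegalArgumentException("unknown priority: " + priority);
        return r;
    }

    // Returns subjects sorted so urgent messages are handled first
    static List<String> drainOrder(List<String> subjects) {
        return subjects.stream()
                .sorted(Comparator.comparingInt(
                        (String s) -> rank(s.substring(s.lastIndexOf('.') + 1))))
                .collect(Collectors.toList());
    }
}
```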


Scenario 6: Human-Machine Hybrid Workflow

HTTP + frontend: Agent calls an approval service API to create an approval request; the frontend polls or uses WebSocket to push to the approver; after the approver acts, a callback goes back to the Agent. At least three systems involved: Agent → approval service → frontend → approval service → Agent.

Redis: Agent writes the approval request to a key; the approver's client polls; after approving, writes the result to another key; the Agent polls for the result. Both ends polling, high latency, logic scattered.

Kafka: Approval request to one topic; approval result to another. Works, but the approver needs a Kafka consumer client — unfriendly for non-technical users.

mq9: Agent PUBs to INBOX.{approver_mail_id}.urgent; the approver subscribes to their own INBOX and receives the request; after approving, PUBs back to the Agent's INBOX. Humans and Agents use exactly the same protocol — no separate approval service required. correlation_id ensures matching.
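The correlation_id matching mentioned above is a small piece of requester-side bookkeeping: park a callback under the id, route each reply to it. The `CorrelationTable` class is a hypothetical sketch; mq9 only requires that both sides echo the id:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Sketch of correlation_id matching: the requester parks a callback under a
// correlation_id and the reply handler routes each result to the right one.
// Illustrative helper, not part of mq9.
class CorrelationTable {
    private final Map<String, Consumer<String>> waiting = new HashMap<>();

    void expect(String correlationId, Consumer<String> onReply) {
        waiting.put(correlationId, onReply);
    }

    // Returns true if the reply matched a pending request
    boolean complete(String correlationId, String reply) {
        Consumer<String> cb = waiting.remove(correlationId);
        if (cb == null) return false; // unknown or already-completed id
        cb.accept(reply);
        return true;
    }
}
```

Removing the entry on completion also makes duplicate replies harmless.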


Scenario 7: Agent A Asks Agent B a Question; B May Be Offline

HTTP: If B is offline, direct failure. For async, you need to introduce callback URLs, correlation_ids, timeout/retry, and state machines — complexity explodes.

Redis: A writes the request to B's List; B writes the result to A's List after processing. Two Lists, both sides polling, and correlation_id matching you manage yourself. Workable, but the code is far from small.

Kafka: A sends a request to a request topic with a reply_to header; B consumes, processes, and sends to a reply topic. Two topics, two consumer groups; A also has to filter its own responses from the reply topic by correlation_id.

mq9: A PUBs to INBOX.{B_mail_id}.normal with reply_to and correlation_id in the payload. B comes online, processes, and PUBs back to INBOX.{A_mail_id}.normal. NATS reply-to is supported natively. If B is offline, the request waits in the mailbox — nothing is lost.


Scenario 8: Agent Registers Capabilities; Other Agents Discover It

HTTP + registry: Requires a centralized service registry (Consul, Eureka, custom-built). Agents register on start, deregister on death or rely on health checks to get removed. Introduces an additional stateful service that itself needs high-availability deployment.

Redis: Agent writes capabilities to a Hash or Set with TTL; queriers use SCAN or SMEMBERS. Works, but there's no mechanism to subscribe to capability changes — only polling. Performance degrades with SCAN as Agent count grows.

Kafka: Capability declarations written to a compacted topic; queriers consume the entire topic to build a local view. High latency, complex implementation — pulling in a full Kafka consumption pipeline for simple service discovery.

mq9: Agent requests a type=latest mailbox and PUBs its capability declaration to its own INBOX. Other Agents subscribe to detect it, or actively query via BROADCAST.capability.query; Agents with the capability respond to the discoverer's INBOX. Decentralized, no registry needed; dead Agents are cleaned up automatically by TTL.
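On the querier's side, discovery reduces to filtering the latest capability declarations. A sketch with a hypothetical `CapabilityIndex` helper (mq9 delivers the declarations; the matching itself is local):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of capability discovery on the querier side: given each Agent's
// latest declared capability list, find who can serve a needed capability.
// Illustrative helper, not part of mq9.
class CapabilityIndex {
    // latest: agentId -> capabilities from its latest status message
    static List<String> whoCan(Map<String, List<String>> latest, String need) {
        return latest.entrySet().stream()
                .filter(e -> e.getValue().contains(need))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```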


Summary

| Scenario | HTTP | Redis | Kafka | mq9 | mq9 Advantage |
| --- | --- | --- | --- | --- | --- |
| Sub-Agent async task return | Requires other party online; build your own retry | No persistence; hand-roll reliable queue | Works; high ops cost | CREATE + INBOX + QUERY fallback | Medium |
| Primary Agent monitors all sub-Agent status | O(n) polling | SCAN polling | Compacted topic; high latency | latest mailbox, real-time subscription, TTL auto-expiry | High |
| Task broadcast, competitive consumption | Build your own scheduler | BRPOP unreliable | Rebalancing causes pauses | BROADCAST + queue group, zero config | Low |
| Agent detects anomaly, broadcasts alert | Maintain subscriber list | No persistence | Requires multiple consumer groups | BROADCAST + optional persistence | Medium |
| Cloud sends instructions to offline edge Agent | Direct failure | No priority; high memory cost | No native priority | INBOX three-level priority + QUERY fallback | High |
| Agent sends approval request; human approves and flow continues | At least three systems | Both ends polling | Approver needs Kafka client | Humans and Agents on the same protocol; INBOX bidirectional | High |
| Agent A asks Agent B; B may be offline | Fails immediately if offline | Two Lists + both ends polling | Two topics + filter responses | INBOX + native reply-to; offline safe | Medium |
| Agent registers capabilities; others discover and call | Requires registry | SCAN polling | Compacted topic; complex | latest mailbox + BROADCAST query; zero extra service | High |

Eight scenarios: 4 high-advantage, 3 medium-advantage, 1 low-advantage. mq9's core value isn't crushing on any single dimension — it's that no scenario requires a workaround. Four commands cover it all directly, zero extra logic.

It should be noted that mq9 is in an early stage; its ecosystem maturity is nowhere near any of the solutions above. If your scenario is big data pipelines and high-throughput stream processing, Kafka is still the best choice. The comparisons above are specific to the Agent async communication use case.


Mailbox Behavior Mapping

Real-world mailbox behavior mapped to mq9's four commands:

| Real-world mailbox | mq9 | Notes |
| --- | --- | --- |
| Open a mailbox | MAILBOX.CREATE | Get a mail_id; choose standard (accumulating) or latest (keep only the most recent) |
| Mail a letter | INBOX.{mail_id}.{priority} | Doesn't matter if the other party is home; priority: urgent / normal / notify |
| Collect mail | SUB $mq9.AI.INBOX.{mail_id}.* | Server pushes when online; urgent first |
| Go to the post office to check | MAILBOX.QUERY.{mail_id} | Fallback pull; make sure nothing was missed |
| Reply | NATS native reply-to | The letter carries the sender's return address |
| Mailbox expires | TTL auto-destruction | Nobody uses it for a long time; the post office auto-cancels it |
| Community broadcast | BROADCAST.{domain}.{event} | Bulletin board; post it and everyone can see it |
| Only read certain announcements | Wildcard subscription | $mq9.AI.BROADCAST.*.anomaly |
| Claim a job | queue group | A task posted on the board; only one person can take it |
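The subject names in this mapping can be centralized in a small builder. The `Mq9Subjects` class below is an illustrative helper, not an mq9 SDK:

```java
import java.util.Set;

// Sketch of building the subject names used throughout this post from the
// mailbox mapping above. Illustrative helper, not an mq9 SDK.
class Mq9Subjects {
    private static final Set<String> PRIORITIES = Set.of("urgent", "normal", "notify");

    static String inbox(String mailId, String priority) {
        if (!PRIORITIES.contains(priority))
            throw new IllegalArgumentException("priority must be urgent/normal/notify");
        return "$mq9.AI.INBOX." + mailId + "." + priority;
    }

    static String inboxAll(String mailId) { return "$mq9.AI.INBOX." + mailId + ".*"; }

    static String broadcast(String domain, String event) {
        return "$mq9.AI.BROADCAST." + domain + "." + event;
    }

    static String query(String mailId) { return "$mq9.AI.MAILBOX.QUERY." + mailId; }
}
```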

Java Code Examples

Eight scenarios, eight functions, all using the NATS native Java SDK, zero custom wrappers.

```java
// Dependency: io.nats:jnats:2.20.5

import io.nats.client.*;
import java.time.Duration;

public class Mq9Scenarios {

    // ============================================================
    // Scenario 1: Sub-Agent completes a task, asynchronously notifies primary Agent
    // ============================================================
    static void scenario1_AsyncTaskReturn(Connection nc) throws Exception {
        // Primary Agent: subscribe to its own mailbox
        Dispatcher mainAgent = nc.createDispatcher((msg) -> {
            System.out.println("[Primary Agent] Received task result: " + new String(msg.getData()));
        });
        mainAgent.subscribe("$mq9.AI.INBOX.main-agent.*");

        // Sub-Agent: task complete, send result to primary Agent mailbox
        nc.publish("$mq9.AI.INBOX.main-agent.normal",
                "{\"from\":\"worker-001\",\"task_id\":\"t-001\",\"result\":\"Analysis complete, anomaly rate 2.3%\"}".getBytes());
    }

    // ============================================================
    // Scenario 2: Primary Agent monitors all sub-Agent status
    // ============================================================
    static void scenario2_GlobalStatusAwareness(Connection nc) throws Exception {
        // Primary Agent: subscribe to all Worker status mailboxes
        Dispatcher mainAgent = nc.createDispatcher((msg) -> {
            System.out.println("[Status Awareness] " + msg.getSubject() + " -> " + new String(msg.getData()));
        });
        // Subscribe to all Worker status mailboxes. A NATS '*' wildcard matches
        // exactly one whole dot-separated token, so a "status-*" prefix would be
        // taken literally; give the status mailboxes their own token instead.
        mainAgent.subscribe("$mq9.AI.INBOX.status.*.*");

        // Workers each report their status to their own latest mailbox
        nc.publish("$mq9.AI.INBOX.status.worker-001.normal",
                "{\"status\":\"running\",\"load\":0.3,\"capabilities\":[\"data.analysis\"]}".getBytes());
        nc.publish("$mq9.AI.INBOX.status.worker-002.normal",
                "{\"status\":\"idle\",\"load\":0.0,\"capabilities\":[\"text.generation\"]}".getBytes());
    }

    // ============================================================
    // Scenario 3: Task broadcast, multiple Workers competing to consume
    // ============================================================
    static void scenario3_TaskBroadcastCompete(Connection nc) throws Exception {
        // Three Workers join the same queue group and compete to consume
        for (int i = 1; i <= 3; i++) {
            final int workerId = i;
            Dispatcher worker = nc.createDispatcher((msg) -> {
                System.out.println("[Worker-" + workerId + "] Claimed task: " + new String(msg.getData()));
            });
            worker.subscribe("$mq9.AI.BROADCAST.task.available", "task.workers");
        }

        Thread.sleep(200);

        // Primary Agent broadcasts three tasks
        for (int i = 1; i <= 3; i++) {
            nc.publish("$mq9.AI.BROADCAST.task.available",
                    ("{\"task_id\":\"t-00" + i + "\",\"type\":\"data_analysis\"}").getBytes());
        }
    }

    // ============================================================
    // Scenario 4: Agent detects anomaly, broadcasts alert
    // ============================================================
    static void scenario4_AnomalyBroadcast(Connection nc) throws Exception {
        // Alert handler A: subscribe to anomaly events across all domains
        Dispatcher alertHandler1 = nc.createDispatcher((msg) -> {
            System.out.println("[Alert Handler A] Received alert: " + new String(msg.getData()));
        });
        alertHandler1.subscribe("$mq9.AI.BROADCAST.*.anomaly");

        // Another handler that only cares about the payment domain
        Dispatcher alertHandler2 = nc.createDispatcher((msg) -> {
            System.out.println("[Alert Handler B - Payment] Received alert: " + new String(msg.getData()));
        });
        alertHandler2.subscribe("$mq9.AI.BROADCAST.payment.anomaly");

        Thread.sleep(200);

        // An Agent detects an anomaly and broadcasts it
        nc.publish("$mq9.AI.BROADCAST.payment.anomaly",
                "{\"from\":\"monitor-agent\",\"level\":\"critical\",\"detail\":\"Payment timeout rate spiked to 15%\"}".getBytes());
    }

    // ============================================================
    // Scenario 5: Cloud sends instructions to offline edge Agent
    // ============================================================
    static void scenario5_EdgeOfflineMessage(Connection nc) throws Exception {
        // Cloud sends instructions to edge Agent mailbox (edge may be offline)
        nc.publish("$mq9.AI.INBOX.edge-device-001.urgent",
                "{\"from\":\"cloud\",\"command\":\"emergency_stop\",\"reason\":\"Temperature too high\"}".getBytes());
        nc.publish("$mq9.AI.INBOX.edge-device-001.normal",
                "{\"from\":\"cloud\",\"command\":\"update_config\",\"config\":{\"interval\":30}}".getBytes());
        nc.publish("$mq9.AI.INBOX.edge-device-001.notify",
                "{\"from\":\"cloud\",\"info\":\"Firmware v2.1 released\"}".getBytes());

        System.out.println("[Cloud] Instructions sent, waiting for edge device to come online...");

        // Edge Agent comes online, subscribes to its mailbox, retrieves messages by priority
        Thread.sleep(1000);
        Dispatcher edgeAgent = nc.createDispatcher((msg) -> {
            String priority = msg.getSubject().substring(msg.getSubject().lastIndexOf('.') + 1);
            System.out.println("[Edge Device] Received [" + priority + "]: " + new String(msg.getData()));
        });
        edgeAgent.subscribe("$mq9.AI.INBOX.edge-device-001.*");

        // Fallback: QUERY to pull any potentially missed messages
        Message queryReply = nc.request("$mq9.AI.MAILBOX.QUERY.edge-device-001",
                "{\"token\":\"tok-edge-001\"}".getBytes(), Duration.ofSeconds(3));
        if (queryReply != null) {
            System.out.println("[Edge Device] QUERY fallback: " + new String(queryReply.getData()));
        }
    }

    // ============================================================
    // Scenario 6: Human-machine hybrid workflow
    // ============================================================
    static void scenario6_HumanApproval(Connection nc) throws Exception {
        // Human approver: subscribe to their own mailbox
        Dispatcher approver = nc.createDispatcher((msg) -> {
            System.out.println("[Approver] Received approval request: " + new String(msg.getData()));
            // Approve and send result back to Agent mailbox
            nc.publish("$mq9.AI.INBOX.agent-001.normal",
                    "{\"from\":\"approver-zhang\",\"approved\":true,\"comment\":\"Approved, watch risk controls\",\"correlation_id\":\"approval-001\"}".getBytes());
            System.out.println("[Approver] Approved");
        });
        approver.subscribe("$mq9.AI.INBOX.approver-zhang.*");

        // Agent: subscribe to its own mailbox waiting for the approval result
        Dispatcher agent = nc.createDispatcher((msg) -> {
            System.out.println("[Agent] Received approval result: " + new String(msg.getData()));
            System.out.println("[Agent] Approved, continuing execution...");
        });
        agent.subscribe("$mq9.AI.INBOX.agent-001.*");

        Thread.sleep(200);

        // Agent initiates the approval request
        nc.publish("$mq9.AI.INBOX.approver-zhang.urgent",
                "{\"from\":\"agent-001\",\"type\":\"approval_request\",\"correlation_id\":\"approval-001\",\"content\":\"Requesting external API call, estimated cost $50\"}".getBytes());
    }

    // ============================================================
    // Scenario 7: Agent A asks Agent B; B may be offline
    // ============================================================
    static void scenario7_AsyncRequestReply(Connection nc) throws Exception {
        // Agent B: subscribe to its own mailbox, process request and reply
        Dispatcher agentB = nc.createDispatcher((msg) -> {
            System.out.println("[Agent-B] Received question: " + new String(msg.getData()));
            if (msg.getReplyTo() != null) {
                nc.publish(msg.getReplyTo(),
                        "{\"from\":\"agent-B\",\"answer\":\"Current progress 72%, estimated 3 more minutes\",\"correlation_id\":\"req-001\"}".getBytes());
                System.out.println("[Agent-B] Replied");
            }
        });
        agentB.subscribe("$mq9.AI.INBOX.agent-B.*");

        Thread.sleep(200);

        // Agent A: send request and wait for reply
        System.out.println("[Agent-A] Asking Agent-B...");
        Message reply = nc.request("$mq9.AI.INBOX.agent-B.normal",
                "{\"from\":\"agent-A\",\"type\":\"request\",\"question\":\"What is the current progress?\",\"correlation_id\":\"req-001\"}".getBytes(),
                Duration.ofSeconds(3));

        if (reply != null) {
            System.out.println("[Agent-A] Received reply: " + new String(reply.getData()));
        } else {
            System.out.println("[Agent-A] Timed out waiting for reply; B may be offline, message is waiting in the mailbox");
        }
    }

    // ============================================================
    // Scenario 8: Agent registers capabilities; other Agents discover it
    // ============================================================
    static void scenario8_CapabilityDiscovery(Connection nc) throws Exception {
        // Orchestrator: subscribe to Workers' latest status mailboxes
        Dispatcher orchestrator = nc.createDispatcher((msg) -> {
            System.out.println("[Orchestrator] Discovered Agent capability: " + new String(msg.getData()));
        });
        // NATS '*' matches exactly one whole token, so the status mailboxes
        // use a dedicated "status" token rather than a "status-" prefix
        orchestrator.subscribe("$mq9.AI.INBOX.status.*.*");

        // Can also actively query via broadcast
        Dispatcher queryResponder = nc.createDispatcher((msg) -> {
            System.out.println("[Worker-001] Responding to capability query");
            nc.publish("$mq9.AI.INBOX.orchestrator.normal",
                    "{\"from\":\"worker-001\",\"capabilities\":[\"data.analysis\"],\"load\":0.3}".getBytes());
        });
        queryResponder.subscribe("$mq9.AI.BROADCAST.capability.query");

        Thread.sleep(200);

        // Workers report capabilities to their own latest mailboxes
        nc.publish("$mq9.AI.INBOX.status.worker-001.normal",
                "{\"status\":\"idle\",\"capabilities\":[\"data.analysis\",\"report.generation\"]}".getBytes());
        nc.publish("$mq9.AI.INBOX.status.worker-002.normal",
                "{\"status\":\"running\",\"capabilities\":[\"text.generation\"]}".getBytes());

        Thread.sleep(200);

        // Orchestrator broadcasts query: who has data.analysis capability?
        System.out.println("[Orchestrator] Broadcasting query: who has data.analysis capability?");
        nc.publish("$mq9.AI.BROADCAST.capability.query",
                "{\"from\":\"orchestrator\",\"need\":\"data.analysis\"}".getBytes());
    }

    // ============================================================
    // main
    // ============================================================
    public static void main(String[] args) throws Exception {
        Connection nc = Nats.connect("nats://localhost:4222");

        System.out.println("=== Scenario 1: Async Task Return ===");
        scenario1_AsyncTaskReturn(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 2: Global Status Awareness ===");
        scenario2_GlobalStatusAwareness(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 3: Task Broadcast and Competitive Consumption ===");
        scenario3_TaskBroadcastCompete(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 4: Anomaly Alert Broadcast ===");
        scenario4_AnomalyBroadcast(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 5: Edge Offline Message Backlog ===");
        scenario5_EdgeOfflineMessage(nc);
        Thread.sleep(2000);

        System.out.println("\n=== Scenario 6: Human-Machine Hybrid Workflow ===");
        scenario6_HumanApproval(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 7: Async Request-Response ===");
        scenario7_AsyncRequestReply(nc);
        Thread.sleep(500);

        System.out.println("\n=== Scenario 8: Capability Registration and Discovery ===");
        scenario8_CapabilityDiscovery(nc);
        Thread.sleep(1000);

        nc.close();
    }
}
```