What Should the Future of Messaging Look Like: Some Thoughts

I have been thinking about a question: what should the future of messaging infrastructure look like?

Kafka has dominated for a decade. RabbitMQ even longer. Will the next decade still belong to them? I remain skeptical — not because they are bad; on the contrary, they have been refined to near perfection in their respective domains. But precisely because they have been refined to near perfection, the right question is: what comes next?

If the answer is "keep optimizing on existing dimensions" (higher throughput, lower latency, cheaper storage), I think that road is approaching its end. Not that improvement is impossible, but the marginal returns are shrinking fast. A million TPS, millisecond latency: who cannot achieve that today? As long as the architecture avoids fundamental mistakes, anyone working seriously on this can get there. These are table stakes, not differentiators. Being 30% faster makes no meaningful difference for the vast majority of use cases.

So to see clearly what the future of messaging should look like, the angle of inquiry itself has to change. Not continuing to compete on existing dimensions, but asking: once all these dimensions have been optimized to their limits, where do we go? Especially now that AI has arrived and Agents have become a new variable — the landscape of communication is fundamentally different from before.

Below are some of my thoughts. They may not all be right, but they are things I have turned over many times.

On Traditional Metrics

Kafka won on throughput: TPS on the order of millions crushed every contemporary competitor, and performance was its core differentiator. RabbitMQ won on flexible routing and reliable delivery. EMQX won on massive connection scale. NATS won on low-latency microservice communication. Each had its hallmark.

But looking at these metrics today, I think their role has changed.

Hardware has changed enormously over the past decade. NVMe SSD random write performance, 100G network bandwidth, steadily falling memory prices — these advances mean that any new system with a sound architecture starts from a high performance baseline. Performance is no longer your advantage; it is something you must have. If you are building a new messaging system today and you are still emphasizing "I am 30% faster than Kafka," you are navigating with an old map.

Similarly, the cloud era has trained everyone to care about cost. Disaggregated storage and compute, tiered storage, pay-per-use — these are now consensus. Kafka's storage cost problems spawned Pulsar's tiered storage; Confluent built the Kora architecture; AWS built MSK Serverless. Cost optimization is no longer innovation; it is a baseline competency.

Performance, throughput, latency, cost, stability — in today's technical environment these should be defaults. Not selling points — entry requirements.

Yet I notice that when discussing message queues, including in my own conversations, we still open with throughput numbers, latency figures, and cost optimization. Not because these do not matter, but because we are so habituated to this framing. Messaging means Topic/Partition. Messaging means append-only. Messaging means Kafka, and somehow that is the end of the line.

Disaggregated storage was progress. Object storage for the persistence layer was progress. What comes after that? When the traditional metrics all become defaults, the real questions surface.

When performance and cost are no longer differentiators, what should the core focus become? I believe it should be reducing organizational system complexity.

What does enterprise messaging infrastructure actually look like today? Many organizations run Kafka for stream processing, RabbitMQ for business decoupling, EMQX for IoT ingestion, and NATS for microservice communication, all at the same time. Four systems, four clusters to operate, four monitoring and alerting setups, four learning curves. Each product is excellent in isolation, but organizations pile them up to cover all scenarios, and the piling itself is an enormous cost. It is also a hidden cost: compute bills are visible, human attention is not. Being paged at 3am to trace a missing message across three systems never appears on any report, but the cost is real, and it compounds as systems accumulate.

The industry is moving in this direction. Whether it is Tableflow, Topic integration with Iceberg, or various Lakehouse experiments, they all fundamentally want to make organizational infrastructure simpler. The direction is right. But something still feels missing — not technical capability, but a mental model that has not fully shifted. Many efforts still optimize within the existing frame, competing on performance and cost but with a different posture.

So in the non-traditional dimension, I believe "letting organizations run fewer systems" is the highest-value thing. Not building a faster Kafka, but building a unified infrastructure that covers Kafka + RabbitMQ + MQTT Broker scenarios. What you eliminate is not a performance gap — it is architectural complexity.

On Multi-Protocol

When the industry talks about multi-protocol support, the narrative is usually "compatible with your existing Kafka clients" or "supports MQTT 3.1.1 and 5.0." I think this framing is too narrow.

The real significance of multi-protocol is not compatibility — it is elimination. Protocol A in, protocol B out means that an architecture that once required a Kafka cluster plus an MQTT Broker plus a bridging middleware can now be a single system. IoT devices ingest data over MQTT; backends do stream processing over the Kafka protocol; microservices communicate over AMQP or NATS — all running on one underlying engine, sharing storage, sharing operations.

The difficulty is not "supporting multiple protocols." Protocol parsing is deterministic engineering work. The real challenge is this: different protocols have very different semantic models. Kafka is an offset-based log-append model; AMQP is an Exchange/Queue routing model; MQTT is a topic-based pub/sub model. Unifying these different semantic models on a single storage engine — while preserving semantic correctness for each protocol and not degrading performance through abstraction overhead — is the genuine architectural challenge.
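To make the idea of several semantic models over one storage engine a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any existing broker works: `SharedLog`, `fetch`, `subscribe_view`, and `mqtt_match` are hypothetical names, the log uses one global offset instead of per-partition offsets, and the wildcard matcher is a simplified version of MQTT filter matching (it ignores `$`-prefixed topics and does not validate that `#` is the last level).

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    offset: int     # position in the shared log (used by the Kafka-like view)
    topic: str      # slash-separated name (used by the MQTT-like view)
    payload: bytes


def mqtt_match(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT filter match: '+' is one level, '#' is all remaining levels."""
    f_parts = topic_filter.split("/")
    t_parts = topic.split("/")
    for i, f in enumerate(f_parts):
        if f == "#":
            return True
        if i >= len(t_parts):
            return False
        if f != "+" and f != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)


@dataclass
class SharedLog:
    """Hypothetical single storage layer: one append-only list of records."""
    records: list = field(default_factory=list)

    def append(self, topic: str, payload: bytes) -> int:
        offset = len(self.records)
        self.records.append(Record(offset, topic, payload))
        return offset

    def fetch(self, topic: str, from_offset: int, max_records: int = 100) -> list:
        """Kafka-like view: the consumer supplies an offset and replays one topic."""
        matching = [r for r in self.records[from_offset:] if r.topic == topic]
        return matching[:max_records]

    def subscribe_view(self, topic_filter: str, since_offset: int = 0) -> list:
        """MQTT-like view: select records by wildcard filter instead of exact name."""
        return [r for r in self.records[since_offset:]
                if mqtt_match(topic_filter, r.topic)]


log = SharedLog()
log.append("sensors/room1/temp", b"21.5")
log.append("sensors/room2/temp", b"19.0")
log.append("orders/created", b"{}")

# Same stored records, two different read semantics.
by_offset = log.fetch("sensors/room1/temp", from_offset=0)
by_filter = log.subscribe_view("sensors/+/temp")
```

The sketch only covers reads. The harder parts named above, such as AMQP-style exchange routing, per-protocol delivery guarantees, and avoiding abstraction overhead, are exactly what it leaves out.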

But it is worth doing, because once achieved, organizational messaging complexity undergoes a qualitative change. Not 10-20% optimization — from four systems to one.

On the Arrival of Agents

Everything above is still within the frame of "async communication between systems." If you only see multi-protocol unification, what you are building is a better message queue — better, but still fundamentally a message queue.

What truly made me feel the need to reconsider the entire space is that the participants in communication are changing.

Traditional message queues rest on a set of deeply embedded assumptions: producers and consumers are determinate, long-lived service instances. Topics or Queues are planned by architects during system design. Message flow is static — from A to B, from B to C, drawn on an architecture diagram, running in production, unchanged for years.

The Agent era breaks all of these assumptions.

Agents are dynamically created and destroyed. One Agent might exist for seconds to complete a task and then disappear; another might run for months on a long-horizon project. The communication relationship between two Agents is established on the fly — nobody planned this link in advance. An Agent is simultaneously sender and receiver; messages it receives may trigger new Agent creation, which triggers more communication. And crucially, messages are no longer equal: some are routine async notifications, some are emergency commands requiring immediate action, some are critical data that can be delayed but not lost.

Forcing Topic/Queue models onto this communication pattern works, but awkwardly, because the underlying abstraction is wrong.

The fundamental abstraction in traditional message queues is a "pipe" — messages flow in one end, out the other. The pipe is passive and stateless; it does not care who uses it. What Agent communication needs may be a "communication entity" abstraction: each participant owns its communication space; messages are not flowing through a pipe but delivered into an entity's space. The entity has a lifecycle — it can be created and destroyed. Messages have priority; urgent ones can jump the queue. Communication relationships are dynamically established and dissolved without advance planning.
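One way to picture the "communication entity" idea is a per-agent mailbox with its own lifecycle and a priority order. The following Python sketch is only an illustration: `Mailbox`, `Registry`, `Priority`, and `EntityClosed` are hypothetical names, everything lives in memory, and destroying an entity simply drops its pending messages, which a real system with "can be delayed but not lost" data would have to handle differently.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    URGENT = 0   # lower value is delivered first
    NORMAL = 1
    BULK = 2


class EntityClosed(Exception):
    """Raised when sending to an entity that does not exist or was destroyed."""


class Mailbox:
    """Hypothetical communication space owned by one agent."""

    def __init__(self, owner: str):
        self.owner = owner
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO order per priority
        self._open = True

    def deliver(self, payload, priority: Priority = Priority.NORMAL) -> None:
        if not self._open:
            raise EntityClosed(self.owner)
        heapq.heappush(self._heap, (priority, next(self._seq), payload))

    def receive(self):
        """Return the most urgent pending message, or None if the mailbox is empty."""
        if not self._heap:
            return None
        _, _, payload = heapq.heappop(self._heap)
        return payload

    def close(self) -> None:
        # Assumption for this sketch: pending messages are dropped on destroy.
        self._open = False
        self._heap.clear()


class Registry:
    """Creates and destroys mailboxes at runtime; no links are planned in advance."""

    def __init__(self):
        self._boxes = {}

    def create(self, agent_id: str) -> Mailbox:
        box = Mailbox(agent_id)
        self._boxes[agent_id] = box
        return box

    def destroy(self, agent_id: str) -> None:
        self._boxes.pop(agent_id).close()

    def send(self, to: str, payload, priority: Priority = Priority.NORMAL) -> None:
        box = self._boxes.get(to)
        if box is None:
            raise EntityClosed(to)
        box.deliver(payload, priority)


registry = Registry()
inbox = registry.create("agent-a")
registry.send("agent-a", "routine notification")
registry.send("agent-a", "stop immediately", Priority.URGENT)
first = inbox.receive()    # the urgent message jumps ahead of the routine one
second = inbox.receive()
registry.destroy("agent-a")
```

The contrast with a pipe is that the addressable thing here is the participant, not a pre-planned topic: a sender needs only the agent's identity, and the entity can appear or disappear at any time.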

This is not about replacing the pipe model. For traditional system-to-system decoupling — an order service sending messages to an inventory service, a log collector feeding a data pipeline — the pipe model remains the best solution and does not need to change. But Agent communication is an entirely new scenario, and using the old abstraction imposes unnecessary complexity.

A forward-looking messaging infrastructure should support both models simultaneously. A shared storage and operations layer underneath; different communication abstractions on top, chosen by scenario. Like a database that offers SQL for relational queries and a Document API for flexible document workloads — same underlying engine, different interfaces.

On Doing Hard Things

Traditional dimensions have been optimized close to their limits; marginal returns have been competed away; most of what can be optimized has been. What should come next? Hard things.

Easy things are what everyone is doing, and everyone has gotten to roughly the same place. What remains — what is worth doing — is what has been dismissed as "too hard, not worth it." Now it needs to be reexamined.

Take language choice. For the past decade and a half, the mainstream choices in the messaging middleware space have been managed-runtime languages, mostly Java and Go. Kafka is Java + Scala; RabbitMQ is Erlang; NATS is Go; Pulsar is Java. There were practical reasons: faster development, mature ecosystems, easier hiring. Performance was not as good as C/C++ or Rust, but "good enough."

"Good enough" is a judgment from the old era. When performance becomes a default entry requirement, you should pursue the extreme, not adequacy. The advantages of Rust and C++ in memory management, zero-copy data paths, and system call efficiency come largely from the language and runtime model, and JVM tuning can only partly compensate for them. The historical reason not to choose Rust was "development is slow and talent is hard to find." But in the era of AI-assisted coding this constraint is rapidly loosening. AI-assisted development largely compensates for Rust's development speed disadvantage. When language learning cost and development velocity are no longer hard bottlenecks, why not pursue more extreme baseline performance and memory safety?

Take multi-protocol semantic unification. Unifying Kafka's log model, AMQP's routing model, and MQTT's pub/sub model on a single storage engine: as far as I know, nobody has fully done this, not because nobody thought of it, but because it is too hard. But "hard" is not a reason not to do it. On the contrary, hard things are where moats live.

Take designing Agent-native communication abstractions from scratch. Not adding an SDK on top of an existing message queue and calling it done, but rethinking from the ground up: the lifecycle of communication entities, the priority model, dynamic routing. There is no mature reference architecture to borrow — you have to find the path yourself.

AI has changed the cost structure of many things. Paths that were not feasible before are feasible now. Judgments that were "too hard, not worth it" need to be re-evaluated.

Closing Thoughts

One deep feeling I have had throughout this thinking: if you keep looking at messaging infrastructure through the lens of the internet era and the cloud era, you cannot see the next step clearly.

Because the core variables in those frameworks — throughput, latency, cost, elasticity — have been optimized to their extremes. Continuing to compete on these dimensions, the most you can do is micro-adjust. Real breakthroughs require stepping back and re-examining: who is communicating, how are they communicating, what is the lifecycle of communication. The answers to these questions are different from what they were ten years ago.

I do not dismiss what existing products have achieved. We always stand on the shoulders of giants to move forward — becoming larger giants for the next person to stand on. That is infrastructure. Each generation pushes the whole field forward; none of them is permanent, but each one advances the state of the art significantly.

The next step is expanding the dimensions themselves. Horizontally: one infrastructure covering multiple communication scenarios, genuinely reducing organizational system complexity. Vertically: native abstraction support for the new communication participants of the Agent era.

These are hard things. But the harder the thing, the more worth doing it is. Easy things are what everyone is doing — the end state is price competition and operational efficiency. Hard things are what few do, and once done, the moat is natural.

Fixing yourself in the single perspective of "messaging" or "streaming" makes it impossible to see the future clearly. You need to step back and look at this space from the height of "communication infrastructure."

I think the most dangerous thing is not doing it wrong — it is not daring to think. Infrastructure always moves forward. The answers may not be fully clear yet, but the direction, I believe, is this one.
