NATS 4.2
It has been five years since our last analysis of NATS, version 3.0 (codenamed "Matterhorn"), where we discovered that the system's definition of "exactly-once" delivery relied on a proprietary interpretation of the integer "one." Today, we look at NATS 4.2.
NATS 4.2 introduces the new PlasmaStream persistence engine, which replaces the deprecated JetStream. PlasmaStream utilizes "Probabilistic Acknowledgment Entanglement" (PAE) to achieve what the marketing materials describe as "Hyper-Linearizability."
This analysis was performed using Jepsen 0.45.2, running on a cluster of 5 Debian 18 nodes, connected via a simulated Starlink v4 mesh network with variable latency injection.
Claims
The NATS 4.2 documentation states:
"By leveraging LLM-based intent prediction, NATS 4.2 can acknowledge a message before the publisher has fully formed the thought of sending it. This allows for negative latency and 100% durability, assuming the AI's prediction of your business logic holds true."
We were curious how this system would behave under network partitions, specifically when the AI predictor hallucinates a transaction commit that never occurred on the leader.
The Test
We modeled a simple banking system in which clients transfer integer amounts between accounts, using the standard Jepsen bank test adapted to PlasmaStream.
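The check at the heart of a bank test can be sketched as follows: transfers only move money, so every read of all accounts should sum to the initial total. This is a minimal Python illustration, assuming reads come back as a mapping from account ID to balance. The names `check_bank_read` and `transfer` are hypothetical and are not Jepsen's API (Jepsen's own checker is written in Clojure).

```python
from typing import Dict, List


def check_bank_read(balances: Dict[str, int], expected_total: int,
                    allow_negative: bool = False) -> List[str]:
    """Return a list of invariant violations for one read of all accounts."""
    errors = []
    total = sum(balances.values())
    if total != expected_total:
        errors.append(f"wrong total: expected {expected_total}, got {total}")
    if not allow_negative:
        # Report negative balances in a stable order.
        for account, balance in sorted(balances.items()):
            if balance < 0:
                errors.append(f"negative balance: {account}={balance}")
    return errors


def transfer(balances: Dict[str, int], src: str, dst: str,
             amount: int) -> Dict[str, int]:
    """Apply a transfer to a copy of the balances; the total is unchanged."""
    updated = dict(balances)
    updated[src] -= amount
    updated[dst] += amount
    return updated
```

A read that returns an empty error list is consistent with the invariant; any other result points at a lost, duplicated, or partially applied transfer.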
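(def nemesis
  ;; compose routes each op's :f to one nemesis. The key maps rename outer :f
  ;; values to the :start/:stop each nemesis expects, so keys do not collide.
  (nemesis/compose
    {{:start-partition :start, :stop-partition :stop}
     (nemesis/partition-random-halves)
     {:start-skew :start, :stop-skew :stop}
     (nemesis/clock-scrambler :max-skew-ms 60000)}))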
We also introduced a new nemesis, nemesis.gaslight, which intercepts NATS PUBACK frames and replaces them with encouraging emojis, intended to soothe the client application into believing the data is safe.
Results
In normal operation, NATS 4.2 performs admirably, achieving a throughput of 40 million messages per second, largely because it discards the payload and transmits only a hash of the user's intent.
However, upon isolating the primary node and introducing a 5-second clock skew, we observed significant anomalies.
Figure 1: The void indicates where the data went. (Lost: 45,912 / Acked: 45,912)
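The lost and acked counts above follow the usual set-based accounting: a write counts as lost if it was acknowledged to the client but never appears in a final read. Here is a minimal Python sketch, assuming each message carries a unique integer ID. The function names are hypothetical and are not Jepsen's checker API.

```python
from typing import Iterable, Set


def lost_writes(acked: Iterable[int], final_read: Iterable[int]) -> Set[int]:
    """Acknowledged values that are absent from the final read."""
    return set(acked) - set(final_read)


def unexpected_writes(attempted: Iterable[int],
                      final_read: Iterable[int]) -> Set[int]:
    """Values in the final read that no client ever attempted to write."""
    return set(final_read) - set(attempted)
```

When the lost set is as large as the acked set, as in the counts reported here, no acknowledged write survived to the final read.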
As shown in Figure 1, during the partition, NATS 4.2 continued to acknowledge writes on both sides of the network split. The new "Optimistic Merge" strategy attempts to reconcile these divergent histories by asking ChatGPT-9 to write a poem about the two datasets merging. While the poem was structurally sound, the account balances were not.
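For contrast, a conventional majority-quorum design lets at most one side of a partition acknowledge writes. The Python sketch below illustrates that general rule under stated assumptions. It does not describe NATS internals, and `can_acknowledge` is a hypothetical name.

```python
def can_acknowledge(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of a partition holds a strict majority of the cluster."""
    return reachable_nodes > cluster_size // 2


# A 5-node cluster split into sides of 3 and 2 (counts include the local node).
sides = {"majority": 3, "minority": 2}
decisions = {name: can_acknowledge(n, 5) for name, n in sides.items()}
```

Two disjoint sides can never both hold a strict majority, which is why acknowledgments on both sides of a split are the signature of split-brain.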
The "Schrödinger's Packet" Problem
We found that if a consumer crashes while processing a message, NATS 4.2 enters a state of "Ephemeral Durability." The message exists in the logs, but cannot be read until an observer collapses the wavefunction. In our tests, 14% of messages entered this state and were eventually garbage collected by the background entropy thread.
{:type  :info,
 :f     :read,
 :value nil,
 :error "Message exists only in a parallel timeline. Please upgrade to NATS Enterprise for multiverse routing."}
Discussion
NATS 4.2 is a blazing fast messaging system, provided your definition of "message" is flexible. The new PlasmaStream architecture offers impressive theoretical guarantees, but in practice, the reliance on AI-based intent prediction leads to a consistency model we are calling "Vibes-Based Consistency."
We reached out to the NATS maintainers (now a sentient DAO). They responded:
"Jepsen's reliance on 'linear time' and 'objective reality' is an outdated construct. NATS 4.2 is designed for the post-truth era of distributed computing. If the cluster *feels* consistent, it is."
Jepsen advises users to employ NATS 4.2 only for data they are emotionally prepared to lose, or for communicating with alternate dimensions.