We shipped a real-time messaging foundation in a day. WebSocket connections, distributed routing across nodes, one-to-one and one-to-many sends, system messages, REST surface, projections, unit tests. The reason it was a day and not a week is the part worth talking about — and it's not the code.
The premise is deceptively small: humans messaging humans, the system messaging humans, and a group send fan-out, all on the same channel, distributed so it doesn't matter which node holds the user's socket. The naive build is one handler, a payload, and a connection lookup. That gets you the first scenario. It does not survive the third — and "doesn't survive" here means the code stops being readable, not that it stops working.
So the day began with four design choices and a refusal to compromise on any of them.
One — the message kind is a first-class type, not a payload discriminator.
This is the most boring decision and the highest-leverage one. UserToUser, UserToUsers, the various system message kinds — each gets its own type, its own command, its own handler. Routing reads the static type, not a string inside a JSON envelope. No payload sniffing. No switch buried two layers deep in a dispatcher that everyone is afraid to touch.
The payoff isn't aesthetic. It's velocity. Adding a fourth kind — say, a tenant-wide broadcast tomorrow — becomes a 30-minute change because the model already has a slot shaped like it. There's a Marten projection, a Wolverine command, a route, and a test pattern, all keyed on the type. You add the type, you slot in. No invisible coupling to the existing kinds, no risk of nudging an unrelated case.
The biggest velocity multiplier on day one is the type system you laid down on day zero.
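To make decision one concrete, here is a minimal sketch of routing keyed on the message type. It is written in TypeScript with an in-memory router so it runs on its own; it is not the Wolverine code from the project. `TypedRouter`, `SystemNotice`, `TenantBroadcast`, and the string output format are all hypothetical names chosen for illustration.

```typescript
// Each message kind is its own class; there is no shared "kind" string to sniff.
class UserToUser {
  constructor(readonly from: string, readonly to: string, readonly body: string) {}
}
class UserToUsers {
  constructor(readonly from: string, readonly to: readonly string[], readonly body: string) {}
}
class SystemNotice {
  constructor(readonly to: string, readonly body: string) {}
}

type Ctor<T> = new (...args: any[]) => T;
type Handler<T> = (command: T) => string[];

// Hypothetical router: one handler per type, looked up by the command's class.
class TypedRouter {
  private readonly handlers = new Map<Function, Handler<any>>();

  register<T>(kind: Ctor<T>, handler: Handler<T>): this {
    if (this.handlers.has(kind)) throw new Error(`duplicate handler for ${kind.name}`);
    this.handlers.set(kind, handler);
    return this;
  }

  dispatch(command: object): string[] {
    const handler = this.handlers.get(command.constructor);
    if (!handler) throw new Error(`no handler for ${command.constructor.name}`);
    return handler(command);
  }
}

const router = new TypedRouter()
  .register(UserToUser, (c) => [`${c.to}: ${c.body}`])
  .register(UserToUsers, (c) => c.to.map((r) => `${r}: ${c.body}`))
  .register(SystemNotice, (c) => [`${c.to}: [system] ${c.body}`]);

// Adding a fourth kind touches nothing above: one class, one register call.
class TenantBroadcast {
  constructor(readonly members: readonly string[], readonly body: string) {}
}
router.register(TenantBroadcast, (c) => c.members.map((r) => `${r}: ${c.body}`));
```

In this sketch a plain object such as `{ kind: "UserToUser" }` is rejected by `dispatch`, because the lookup uses the class and never inspects the payload.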
Two — fan-out happens at compose time, not at receive time.
The temptation when you build "send to many" is to write the message once and let N readers pick themselves out on the read side. It feels efficient. It is wrong.
Per-recipient state — read, archived, dismissed, escalated — is genuinely per recipient. The moment one user archives a group message and another doesn't, a shared event with N filters becomes ambiguous: whose state lives where, who's authoritative, what does "the message" even mean now? You spend the next quarter writing read-side filters and apologizing for an inbox that flickers.
So the compose handler does the resolution. A UserToUsers command resolves recipients up front and emits per-recipient writes. The inbox projection is keyed (recipient, message), not (message, [recipients]). The read model never has to apologize for itself. Each user has a clean, plain, boring inbox — and a boring inbox is the user-facing miracle of the whole system.
This was the bug I caught midday and fixed: the inbox read model was initially keyed wrong, and the test that flushed it out cost more to write than the fix did. That's the right ratio.
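A minimal sketch of compose-time fan-out with an inbox keyed on (recipient, message), written in TypeScript with an in-memory map standing in for the Marten projection. The names `SendToUsers`, `MessageDelivered`, `MessageArchived`, and `InboxProjection` are assumptions for illustration, and de-duplicating recipients is my own addition, not something the project is known to do.

```typescript
// Hypothetical command and event shapes; none of these names come from the real codebase.
interface SendToUsers {
  messageId: string;
  from: string;
  recipients: string[];
  body: string;
}

class MessageDelivered {
  constructor(
    readonly messageId: string,
    readonly recipientId: string,
    readonly from: string,
    readonly body: string,
  ) {}
}

class MessageArchived {
  constructor(readonly messageId: string, readonly recipientId: string) {}
}

interface InboxEntry {
  recipientId: string;
  messageId: string;
  from: string;
  body: string;
  archived: boolean;
}

// Compose time: resolve recipients once and emit one write per recipient.
function composeSendToUsers(cmd: SendToUsers): MessageDelivered[] {
  const unique = Array.from(new Set(cmd.recipients)); // assumption: a repeated recipient gets one copy
  return unique.map((r) => new MessageDelivered(cmd.messageId, r, cmd.from, cmd.body));
}

class InboxProjection {
  private readonly rows = new Map<string, InboxEntry>();

  // Composite key (recipient, message): one row per recipient per message.
  private static key(recipientId: string, messageId: string): string {
    return JSON.stringify([recipientId, messageId]);
  }

  apply(event: MessageDelivered | MessageArchived): void {
    const key = InboxProjection.key(event.recipientId, event.messageId);
    if (event instanceof MessageDelivered) {
      this.rows.set(key, {
        recipientId: event.recipientId,
        messageId: event.messageId,
        from: event.from,
        body: event.body,
        archived: false,
      });
    } else {
      const row = this.rows.get(key);
      if (row) row.archived = true; // touches only this recipient's row
    }
  }

  // The read side is a plain filter on recipient; no per-message recipient lists to untangle.
  inboxFor(recipientId: string): InboxEntry[] {
    return Array.from(this.rows.values()).filter(
      (r) => r.recipientId === recipientId && !r.archived,
    );
  }
}
```

With this keying, one recipient archiving a group message leaves every other recipient's row untouched, which is the property the mis-keyed version could not give.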
Three — Redis is the routing fabric; Wolverine + Marten are the model.
A distributed connection registry lives in Redis, with pub/sub for cross-node delivery. If a user's socket is held by node A and the message originates on node B, node A gets notified and pushes the frame. Standard fan-out-over-pub/sub, nothing exotic — but the line matters.
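The registry-plus-pub/sub flow can be sketched without Redis at all. The TypeScript below uses in-memory stand-ins for both roles so that it runs on its own. The class names and the one-channel-per-node naming (`node:<id>`) are assumptions, not details from the project.

```typescript
// In-memory stand-ins for the two Redis roles: a connection registry and pub/sub.
// Assumption: the real system would back these with Redis keys and channels.
type OutboundFrame = { userId: string; payload: string };

class ConnectionRegistry {
  private readonly owners = new Map<string, string>(); // userId -> nodeId holding the socket
  register(userId: string, nodeId: string): void { this.owners.set(userId, nodeId); }
  unregister(userId: string): void { this.owners.delete(userId); }
  ownerOf(userId: string): string | undefined { return this.owners.get(userId); }
}

class PubSub {
  private readonly subscribers = new Map<string, Array<(frame: OutboundFrame) => void>>();

  subscribe(channel: string, handler: (frame: OutboundFrame) => void): void {
    const list = this.subscribers.get(channel) ?? [];
    list.push(handler);
    this.subscribers.set(channel, list);
  }

  // Returns how many subscribers received the frame.
  publish(channel: string, frame: OutboundFrame): number {
    const list = this.subscribers.get(channel) ?? [];
    list.forEach((handler) => handler(frame));
    return list.length;
  }
}

class GatewayNode {
  readonly pushed: OutboundFrame[] = []; // frames written to sockets held by this node

  constructor(
    readonly id: string,
    private readonly registry: ConnectionRegistry,
    private readonly bus: PubSub,
  ) {
    // Each node listens on its own channel (hypothetical naming scheme).
    bus.subscribe(`node:${id}`, (frame) => { this.pushed.push(frame); });
  }

  connect(userId: string): void { this.registry.register(userId, this.id); }

  // Any node can originate a send; the registry decides which node pushes the frame.
  send(frame: OutboundFrame): boolean {
    const owner = this.registry.ownerOf(frame.userId);
    if (owner === undefined) return false; // no socket registered for this user
    return this.bus.publish(`node:${owner}`, frame) > 0;
  }
}
```

If user `bob` connects to node A and node B calls `send`, the frame lands in node A's `pushed` list and node B's stays empty, which mirrors the node A / node B case described above.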
The model — what a message is, what events it raises, what the inbox projection looks like, how the daemon catches up after restart — lives in Wolverine command handlers and Marten projections. None of them talks to Redis. Redis is the transport detail; if we swap it for a different broker tomorrow, the domain doesn't notice.
This is the part that gets phrased as a principle in retrospectives but lost in practice: the infrastructure follows the model, not the reverse. Most "real-time messaging" systems I've seen are the inverse — the model gets shaped by the broker, and a year later nobody can find where the business logic actually lives. The discipline of keeping Redis out of the aggregate is what makes the model legible later.
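One way to picture "infrastructure follows the model" is a narrow interface owned by the domain, with the transport behind it. The TypeScript sketch below is an illustration under that assumption; `DeliveryPort`, `handleSystemNotice`, and both adapters are hypothetical names, and neither adapter is Redis.

```typescript
// The only thing the domain knows about delivery (hypothetical interface).
interface DeliveryPort {
  deliver(recipientId: string, body: string): void;
}

// Domain-side handler: decides what is delivered to whom. No broker types appear here.
function handleSystemNotice(recipientIds: string[], text: string, port: DeliveryPort): number {
  const body = `[system] ${text}`;
  recipientIds.forEach((id) => port.deliver(id, body));
  return recipientIds.length;
}

// Two interchangeable adapters. A Redis-backed one would be a third class with the same shape.
class InMemoryDelivery implements DeliveryPort {
  readonly sent: Array<[string, string]> = [];
  deliver(recipientId: string, body: string): void {
    this.sent.push([recipientId, body]);
  }
}

class BufferedDelivery implements DeliveryPort {
  private readonly pending = new Map<string, string[]>();

  deliver(recipientId: string, body: string): void {
    const queue = this.pending.get(recipientId) ?? [];
    queue.push(body);
    this.pending.set(recipientId, queue);
  }

  pendingFor(recipientId: string): string[] {
    return this.pending.get(recipientId) ?? [];
  }
}
```

Swapping `InMemoryDelivery` for `BufferedDelivery` requires no change to `handleSystemNotice`, which is the property the paragraph above is after when it says the domain should not notice a broker change.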
A small but related move: a typed RedisOptions config instead of strings-and-prayer. Boring on day one, load-bearing on day ninety.
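As a rough illustration of typed config versus strings-and-prayer, here is a TypeScript sketch that parses raw settings into a typed object once and fails fast on bad values. The field names, setting keys, and defaults (other than 6379, the standard Redis port) are assumptions; the project's actual `RedisOptions` is not shown in this post.

```typescript
// Hypothetical shape: field names and defaults are illustrative only.
interface RedisOptions {
  readonly host: string;
  readonly port: number;
  readonly channelPrefix: string;
  readonly connectTimeoutMs: number;
}

// Parse raw string settings once, at startup, and reject bad values there.
function parseRedisOptions(raw: Record<string, string | undefined>): RedisOptions {
  const host = raw["REDIS_HOST"];
  if (!host) throw new Error("REDIS_HOST is required");

  const port = Number(raw["REDIS_PORT"] ?? "6379");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`REDIS_PORT is not a valid port: ${raw["REDIS_PORT"]}`);
  }

  const connectTimeoutMs = Number(raw["REDIS_CONNECT_TIMEOUT_MS"] ?? "5000");
  if (!Number.isFinite(connectTimeoutMs) || connectTimeoutMs <= 0) {
    throw new Error("REDIS_CONNECT_TIMEOUT_MS must be a positive number");
  }

  return {
    host,
    port,
    channelPrefix: raw["REDIS_CHANNEL_PREFIX"] ?? "msg",
    connectTimeoutMs,
  };
}
```

Everything downstream then takes a `RedisOptions` value, so a typo in a setting name surfaces at startup instead of on the first cross-node send.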
Four — the projection daemon is a host-level concern, not a per-endpoint one.
This one's almost invisible if you don't know to look for it. Marten projections are wired into the host with a daemon that catches up on startup and stays online. The temptation is to scope projections to whichever endpoint is "the messaging one." That couples projection lifecycle to deployment topology — change the topology, lose the projection.
Wiring the daemon at the host level means the read model is always live, regardless of which API surface is up. Combined with the typed routing from decision one, it means each new message kind becomes "add a projection, register it with the host, done." No daemon-orchestration ceremony to learn.
This was the lowest-glamour commit of the day and probably the one that pays the most rent.
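To show what "host-level" buys, here is a toy model in TypeScript. It is not Marten's daemon: `Host`, `ProjectionDaemon`, `UnreadCountProjection`, and the in-memory checkpoints are hypothetical stand-ins, and the sketch assumes the event log is ordered by sequence number.

```typescript
interface StoredEvent { sequence: number; recipientId: string; }

interface Projection {
  readonly name: string;
  apply(event: StoredEvent): void;
}

// Example read model: messages received per recipient.
class UnreadCountProjection implements Projection {
  readonly name = "unread-count";
  readonly counts = new Map<string, number>();
  apply(event: StoredEvent): void {
    this.counts.set(event.recipientId, (this.counts.get(event.recipientId) ?? 0) + 1);
  }
}

class ProjectionDaemon {
  private readonly projections: Projection[] = [];
  private readonly checkpoints = new Map<string, number>(); // projection name -> last sequence applied

  register(projection: Projection): void { this.projections.push(projection); }

  // Applies every event past each projection's checkpoint; safe to call repeatedly.
  catchUp(log: readonly StoredEvent[]): void {
    for (const projection of this.projections) {
      const from = this.checkpoints.get(projection.name) ?? 0;
      for (const event of log) {
        if (event.sequence <= from) continue; // already applied
        projection.apply(event);
        this.checkpoints.set(projection.name, event.sequence);
      }
    }
  }
}

// The host owns the daemon. Endpoints come and go without touching it.
class Host {
  readonly daemon = new ProjectionDaemon();
  readonly endpoints = new Set<string>();
  private readonly log: StoredEvent[] = [];

  // Startup: load existing events and let every registered projection catch up.
  start(existing: readonly StoredEvent[]): void {
    this.log.push(...existing);
    this.daemon.catchUp(this.log);
  }

  // Steady state: each new event is projected as soon as it is appended.
  append(recipientId: string): void {
    this.log.push({ sequence: this.log.length + 1, recipientId });
    this.daemon.catchUp(this.log);
  }
}
```

In this model, adding a new message kind's read model is one `register` call on the host's daemon, and adding or removing entries in `endpoints` has no effect on whether projections keep up.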
The through-line: typed routing on a per-recipient projection, with infrastructure that follows the model and a projection lifecycle pinned to the host.
What surprised us wasn't any single decision — it was how cleanly they composed. When the kind is a type, fan-out at compose time becomes natural (you already have a typed command to dispatch). When fan-out happens at compose time, per-recipient projections become natural (the write boundary already knows the recipient). When projections are per-recipient and lifecycle-owned by the host, the read API becomes natural (you query the recipient's stream). And when the model owns its events, Redis stops being a model concern and becomes plumbing.
The decisions reinforce each other. That's what made a day's work cohesive instead of a hairball of accidental coupling.
A small footnote from the same day: auth now preserves your active tenant on a token refresh. Different domain, same instinct — don't make people pay for invisible architectural shortcuts. If the refresh path can know the active tenant, it should; making the user re-pick is the kind of UX bug that lives forever once it ships.
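The refresh behavior can be sketched as a pure function over the previous session's claims. This TypeScript is an assumption-heavy illustration: the claim names and the fallback rule (use the first remaining tenant if the user no longer belongs to the previously active one) are mine, not taken from the project.

```typescript
// Hypothetical claim set; the real token contents are not described in this post.
interface SessionClaims {
  userId: string;
  tenantIds: string[];
  activeTenantId: string;
  issuedAt: number;
}

// Refresh carries the active tenant forward instead of resetting it.
function refreshClaims(
  previous: SessionClaims,
  currentTenantIds: string[],
  now: number,
): SessionClaims {
  const [firstTenant] = currentTenantIds;
  if (firstTenant === undefined) throw new Error("user no longer belongs to any tenant");

  // Assumption: only fall back when membership in the active tenant was revoked.
  const stillMember = currentTenantIds.indexOf(previous.activeTenantId) !== -1;

  return {
    userId: previous.userId,
    tenantIds: currentTenantIds,
    activeTenantId: stillMember ? previous.activeTenantId : firstTenant,
    issuedAt: now,
  };
}
```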
A foundation built around the right primitives gets built quickly, because the primitives are the build. Everything else is just typing.
How do you draw the line between "the broker's job" and "the domain's job" in your real-time layers — and what's the moment you knew you'd drawn it in the wrong place?
Tech Used:
- Marten — A .NET library (from JasperFx) that turns PostgreSQL into a document database and an event store. You write strongly typed C# aggregates and projections; Marten stores them as JSONB documents in Postgres, and the event-sourcing side runs a "projection daemon" that builds read models from streams of events. It's the persistence + read-model layer in the foundation above.
- Wolverine — A .NET messaging / command-mediator framework (same JasperFx team). It handles commands, events, sagas, message routing, retries, transactional outbox, and durable inbox/outbox patterns. Roughly: MediatR + MassTransit, but tighter integration with Marten so commands and event streams share one transaction. In the post, it's where the command handlers and event publishing live.
- Redis — An in-memory key/value data store. Often used for caching, but its other classic uses are distributed coordination (locks, ephemeral registries) and pub/sub (a publisher writes to a channel, any subscribed node receives the message). In the post, Redis holds the connection registry (which node owns which user's WebSocket) and the pub/sub channels that let any node deliver a message to a socket held by any other node.
In short: Marten = state + events in Postgres. Wolverine = message handlers + sagas in .NET. Redis = distributed transport / lookup.
