Validating the Schema at the Edge of the Queue

A worker started throwing KeyError in production. The cause wasn’t its code — an upstream service had shipped a change that dropped a field from the event’s payload, and the queue had carried the malformed message across without a word. The bug wasn’t the bad message. It was that nothing checked it at the boundary.

The queue carries bytes, not types

In a request/response service, the body hits a typed handler, and the framework rejects it at the door if it’s the wrong shape. A queue gives you none of that. The payload is opaque JSON: the broker moves bytes and never looks inside. Whatever the producer put in, the consumer gets, valid or not. The type safety you have inside one service evaporates the moment the data crosses the wire.

So if you want a guarantee about a message’s shape, you have to add it yourself — at the edge, where data enters and where it leaves.
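To make the failure concrete, here is a minimal sketch of the opening incident. The event shape (`order_id`, `amount_cents`) and the `encode`/`handle` helpers are hypothetical, chosen only for illustration; the point is that serialization accepts any dict, so the missing field only surfaces deep inside the consumer.

```python
import json


def encode(event: dict) -> bytes:
    # What the broker carries: bytes. Any dict serializes, valid or not.
    return json.dumps(event).encode("utf-8")


def handle(raw: bytes) -> str:
    event = json.loads(raw)
    # Nothing has checked the shape, so a dropped field surfaces here,
    # in the consumer, as a KeyError.
    return f"charge {event['order_id']} for {event['amount_cents']}"


good = encode({"order_id": "o-1", "amount_cents": 500})
bad = encode({"order_id": "o-2"})  # upstream dropped amount_cents

print(handle(good))
try:
    handle(bad)
except KeyError as exc:
    print("KeyError:", exc)
```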

Validate where data enters, and where it leaves

There are two edges, and they catch different failures:

Producer-side, before publish — the primary one. Validate the payload as you produce it. Invalid data never enters the queue, the failure surfaces synchronously in the service that actually has the bug, and it’s caught once — before the message fans out to every consumer. This is the cheap place to fail.
Consumer-side, on receive — the safety net. Validate again as you consume. This catches what producer-side can’t: messages from producers you don’t control — another team, an older deployed version, a hand-crafted replay. A message that fails here is a poison message; route it to a dead-letter queue rather than letting it crash the handler in a loop.
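The two edges can be sketched with nothing but the standard library. Everything named here is an assumption for illustration: the `ORDER_PLACED` schema, the `validate` helper, and the in-memory `queue` and `dead_letters` that stand in for a real broker and dead-letter queue. A real system would more likely use a schema language than a dict of Python types.

```python
import json
from collections import deque

# Hypothetical schema: required field name -> expected Python type.
ORDER_PLACED = {"order_id": str, "amount_cents": int}


class SchemaError(ValueError):
    """The payload does not match the schema."""


def validate(payload, schema: dict) -> None:
    if not isinstance(payload, dict):
        raise SchemaError("payload is not an object")
    for field, expected in schema.items():
        if field not in payload:
            raise SchemaError(f"missing field: {field}")
        if not isinstance(payload[field], expected):
            raise SchemaError(f"{field}: expected {expected.__name__}")


queue: deque = deque()   # stands in for the broker
dead_letters: list = []  # stands in for a dead-letter queue


def publish(payload: dict) -> None:
    # Producer edge: fail synchronously, before anything enters the queue.
    validate(payload, ORDER_PLACED)
    queue.append(json.dumps(payload).encode("utf-8"))


def consume(raw: bytes, handler) -> bool:
    # Consumer edge: check again, because not every producer runs publish().
    try:
        payload = json.loads(raw)
        validate(payload, ORDER_PLACED)
    except ValueError as exc:  # covers JSON decode errors and SchemaError
        dead_letters.append((raw, str(exc)))
        return False
    handler(payload)
    return True
```

With this in place, `publish({"order_id": "o-2"})` raises in the producing service, while a hand-crafted `consume(b'{"order_id": "o-2"}', handler)` returns `False` and parks the message instead of crashing the handler.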

A subtlety worth knowing: most queue runtimes have no “reject immediately” hook, so a consumer-side rejection usually retries before it dead-letters — and invalid data never becomes valid on retry. That’s wasted work, which is exactly why producer-side is where the value is. Consumer-side is the seatbelt, not the steering.
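The cost of those retries shows up in a toy model. The `deliver` function below is a made-up stand-in for a queue runtime that redelivers on any exception; it is not any particular broker's behavior. One possible workaround, sketched as `parking_handler`, is for the handler to park the invalid message itself and return normally so the runtime acknowledges it on the first attempt. Whether that fits depends on the runtime you actually use.

```python
class SchemaError(ValueError):
    """The payload does not match the schema."""


def deliver(message, handler, max_attempts=3):
    """Toy runtime: redeliver on any exception, dead-letter after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return ("acked", attempt)
        except Exception:
            continue  # the runtime cannot tell a schema failure from a transient one
    return ("dead-lettered", max_attempts)


def strict_handler(message):
    if "order_id" not in message:
        raise SchemaError("missing field: order_id")


parked = []  # stands in for an explicit publish to a dead-letter queue


def parking_handler(message):
    try:
        strict_handler(message)
    except SchemaError as exc:
        parked.append((message, str(exc)))
        # Return normally so the runtime acks instead of redelivering.


print(deliver({}, strict_handler))   # ('dead-lettered', 3)
print(deliver({}, parking_handler))  # ('acked', 1)
```

The first call burns every attempt on a message that can never succeed; the second spends one.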

The schema lives with the message, not the envelope

The thing you validate against is a schema per message type, keyed by the event’s identity rather than bolted onto the transport envelope. Keep it in a registry that is versioned independently of the services that use it, and the schema you evolve carefully, so as not to break consumers, is the same one you enforce at runtime. One contract, checked at change-time (does this edit break anyone?) and at run-time (does this message obey it?).
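A rough sketch of one contract serving both checks follows. The `REGISTRY` layout, the `order.placed` type name, and the message fields `type`, `version` and `payload` are all assumptions made for the example, not the format of any real registry.

```python
# Hypothetical registry: (message type, version) -> {field: expected type}.
REGISTRY = {
    ("order.placed", 1): {"order_id": str, "amount_cents": int},
    ("order.placed", 2): {"order_id": str, "amount_cents": int, "currency": str},
}


def breaks_consumers(old: dict, new: dict) -> list:
    """Change-time check: fields existing consumers may read that the new schema drops or retypes."""
    problems = []
    for field, expected in old.items():
        if field not in new:
            problems.append(f"removed: {field}")
        elif new[field] is not expected:
            problems.append(f"retyped: {field}")
    return problems


def obeys(message: dict) -> bool:
    """Run-time check: look the schema up by the message's own type and version."""
    schema = REGISTRY.get((message.get("type"), message.get("version")))
    if schema is None:
        return False  # unknown type or version: nothing to validate against
    payload = message.get("payload")
    return isinstance(payload, dict) and all(
        isinstance(payload.get(field), expected)
        for field, expected in schema.items()
    )
```

Adding `currency` in version 2 passes the change-time check, since `breaks_consumers` reports nothing when fields are only added. Going the other way, from version 2 back to version 1, would report `removed: currency`.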

This is the structural cousin of making a duplicate delivery a no-op: both accept that you don’t control what arrives, and put the guarantee in your own boundary instead of trusting the sender.

When it’s not worth it

A single service’s internal queue — one producer, one consumer, deployed together — doesn’t need this. The type system already spans both ends; edge validation is ceremony. It starts paying the moment a second, independently-deployed producer or consumer exists — a different team, a different language, a different release cadence. That’s also the moment the untyped payload quietly becomes your most fragile contract.

See also: the BabelQueue schema-validation spec defines the per-URN schema and where it’s enforced, and the babelqueue-registry holds those schemas — its bqschema tool is what runs the check.
