When I hear “this monolith has gotten too big, let’s split it into microservices” in a planning meeting, my first question is always the same: which measurement is telling you that?
The answer is usually not a measurement. It’s a feeling (“the codebase is huge”), a blog post, or résumé anxiety. All three are real, but none of them is an architectural reason.
In the modular monolith piece I argued for drawing boundaries early and deploying separately late. This piece is about when that “late” arrives — and how to tell when it hasn’t.
A microservice is not a solution, it’s a trade
A microservice architecture changes one thing at its core: it turns a method call between modules into a network call. What you get is independent deployment and independent scaling. What you pay in return is a long list:
- Every call that was free in-process now means serialization, latency, and partial failure.
- Instead of a single DB transaction: sagas, outbox, eventual consistency.
- Instead of a single log stream: a request path you can’t follow without distributed tracing.
- Instead of a single deploy: service contracts that have to be versioned and kept backward-compatible.
- Instead of a refactor the compiler verifies, a network contract no one verifies.
None of these is “bad”; they’re all costs that real systems genuinely pay. The real question is this: are you paying that cost in exchange for something, or paying it up front for a benefit you don’t need yet?
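To make the first item on that list concrete, here is a minimal sketch of the same pricing logic as an in-process call and as a call across a network boundary. Everything in it is an assumption made for illustration: the `Invoice` type, the function names, and the `transport` callable, which stands in for a real HTTP or RPC client and is simulated here so the example runs on its own.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    order_id: int
    total_cents: int


def price_order_local(order_id: int, quantity: int, unit_cents: int) -> Invoice:
    """In-process call: no serialization, no timeout, no partial failure."""
    return Invoice(order_id, quantity * unit_cents)


class RemoteCallFailed(Exception):
    """Raised when the retry budget is spent; the caller cannot know
    whether the remote side did the work."""


def price_order_remote(transport, order_id: int, quantity: int,
                       unit_cents: int, retries: int = 2) -> Invoice:
    """Same logic behind a network boundary. `transport` is a hypothetical
    callable that takes a JSON string and returns a JSON string."""
    request = json.dumps(
        {"order_id": order_id, "quantity": quantity, "unit_cents": unit_cents}
    )
    last_error = None
    for _ in range(retries + 1):
        try:
            body = json.loads(transport(request))
            return Invoice(body["order_id"], body["total_cents"])
        except (TimeoutError, ConnectionError, ValueError, KeyError) as exc:
            last_error = exc  # timeout, dropped connection, or bad payload
    raise RemoteCallFailed(repr(last_error))


def make_flaky_transport(fail_times: int):
    """Stand-in for a real network: times out `fail_times` times, then answers."""
    calls = {"count": 0}

    def transport(request: str) -> str:
        calls["count"] += 1
        if calls["count"] <= fail_times:
            raise TimeoutError("simulated timeout")
        req = json.loads(request)
        total = req["quantity"] * req["unit_cents"]
        return json.dumps({"order_id": req["order_id"], "total_cents": total})

    return transport
```

The local version is one line of logic. The remote version needs a wire format, a retry budget, and a new failure outcome the caller has to handle, and blind retries are only safe here because pricing is idempotent. A call that writes data would need more care still.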
Reasons that don’t justify the move
The reasons that most often trigger the microservice decision are, unfortunately, the weakest ones.
“The codebase is too big.” The solution to a big codebase is module boundaries, honest file organization, and dead-code cleanup — not separate deployment. A 200k-line monolith and 200k lines split across 20 services are the same amount of code; the second just put a network in the middle.
“Deploys are slow and scary.” The fix for a slow pipeline is to speed up the pipeline. The fix for scary deploys is test coverage, staged rollout, and fast rollback. None of these requires splitting the system; splitting it actually increases the number of deploys.
“I need to scale independently.” A monolith also scales horizontally: running N copies of the same application behind a load balancer doesn’t require microservices. Independent scaling only matters when the resource profiles of the modules genuinely diverge — which is a separate point, below.
“This is the modern architecture.” Modernity is not an architectural reason. What makes a system good is not how many pieces it’s split into, but whether a failure at 2 a.m. is observable.
“Netflix does it this way.” Netflix has a scale that funds microservices and hundreds of engineers. When a three-person team copies that architecture, it copies not the problem Netflix solved but only the operational burden. This is exactly the innovation-token logic from the boring architecture piece: spending the budget in the wrong place.
The common failure pattern: these are all real discomforts, but none of them is cured by microservices. The right disease, the wrong medicine.
The measured signals that justify the move
In the modular monolith piece I listed four signals for extracting a module into a service. At the system scale the threshold is the same — at least one of these signals must be measured:
- The scaling profile is diverging. One piece consumes resources completely differently from the rest: a report/PDF engine that constantly saturates the CPU, or a notification module taking tens of thousands of lightweight calls per minute. Keeping the two in a single deployment unit condemns one to the other’s capacity plan. If you can show this with metrics, separating that piece comes cheaper than scaling the whole monolith horizontally.
- Ownership has diverged organizationally. The number of teams has grown, and release contention on the same deploy pipeline is now a measurable delay. If three teams wait on each other’s deploys every day, the boundary is no longer technical but organizational: Conway’s law is sending the bill.
- A different runtime is genuinely required. One workload is expensive in PHP and noticeably cheap in Go or Rust, and this is not a “would be faster” hypothesis but a measured bottleneck. Running that piece as a separate service in a different language is justified.
- Fault isolation is mandatory and can’t be guaranteed in-process. One module crashing really must not bring down the rest of the system, and you can’t ensure that within a single process.
What the four have in common: they are all measured. No sentence beginning with “I think,” “later,” or “maybe” is on this list. If a signal isn’t visible on a graph, it isn’t a signal yet.
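The “measured or it doesn’t count” rule can be written down as a small check. This is only a sketch: the field names and every threshold value below are hypothetical placeholders, not recommendations, and each team would have to choose its own numbers from its own graphs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModuleMeasurements:
    # None means "not measured": an opinion, not a signal.
    cpu_share_of_deployment: Optional[float] = None   # fraction, 0.0 to 1.0
    deploy_wait_hours_per_week: Optional[float] = None
    measured_speedup_other_runtime: Optional[float] = None  # benchmark ratio
    system_outages_caused_per_quarter: Optional[int] = None


# Hypothetical thresholds, chosen only to make the example runnable.
MIN_CPU_SHARE = 0.5
MIN_DEPLOY_WAIT_HOURS = 5.0
MIN_SPEEDUP = 3.0
MIN_OUTAGES = 2


def extraction_signals(m: ModuleMeasurements) -> List[str]:
    """Return the signals that are both measured and over their threshold."""
    signals = []
    if m.cpu_share_of_deployment is not None and m.cpu_share_of_deployment >= MIN_CPU_SHARE:
        signals.append("scaling profile diverging")
    if m.deploy_wait_hours_per_week is not None and m.deploy_wait_hours_per_week >= MIN_DEPLOY_WAIT_HOURS:
        signals.append("ownership diverged")
    if m.measured_speedup_other_runtime is not None and m.measured_speedup_other_runtime >= MIN_SPEEDUP:
        signals.append("different runtime required")
    if m.system_outages_caused_per_quarter is not None and m.system_outages_caused_per_quarter >= MIN_OUTAGES:
        signals.append("fault isolation required")
    return signals


def should_extract(m: ModuleMeasurements) -> bool:
    """At least one measured signal justifies extracting this module."""
    return len(extraction_signals(m)) > 0
```

A module with no measurements at all yields an empty list no matter how strongly anyone feels about it, which is the point of the rule.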
Not all of it, one module
Even when the signal arrives, the right move isn’t “split the monolith into microservices.” The right move is to extract the single module that produced the signal into a service.
If you started with a modular monolith, this is already mechanical work: the boundary is clear, the public surface is a single class. You pull that module out with the strangler pattern — the new service comes up, calls are gradually routed to it, the old code is deleted. The rest of the system stays a happy monolith.
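The “calls are gradually routed to it” step might look like the sketch below. It assumes two hypothetical callables, `legacy` (the module still inside the monolith) and `new_service` (a client for the extracted service), and a rollout percentage someone raises over time. Hashing a request key with `zlib.crc32` is one way to keep the same caller on the same side between requests; it is an illustrative choice, not the only one.

```python
import zlib
from typing import Any, Callable


def make_strangler_router(
    legacy: Callable[[Any], Any],
    new_service: Callable[[Any], Any],
    rollout_percent: int,
) -> Callable[[str, Any], Any]:
    """Route a fixed share of traffic to the new service, the rest to legacy."""

    def route(request_key: str, payload: Any) -> Any:
        # Stable bucket in [0, 100): the same key always lands in the same bucket.
        bucket = zlib.crc32(request_key.encode("utf-8")) % 100
        if bucket < rollout_percent:
            try:
                return new_service(payload)
            except Exception:
                # Falling back is only safe for idempotent or read-only calls;
                # that is an assumption of this sketch.
                return legacy(payload)
        return legacy(payload)

    return route
```

At `rollout_percent=0` everything stays in the monolith; at `100` everything goes to the new service and the old code path can be deleted once the fallback stops firing.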
A big-bang rewrite, splitting everything into services at once, is among the most expensive and most failure-prone forms of migration. Separate a system piece by piece as it signals, not all at once.
When does the monolith turn into the wrong answer?
The monolith is the right answer as long as none of the signals above is measured. The moment it turns into the wrong answer is also clear: when multiple teams are jammed into the same deploy unit, when one piece’s resource profile holds the rest of the system hostage, or when one component’s collapse regularly takes down the whole system.
When that day comes, moving to microservices is not a defeat — it’s a planned step. And that is the whole value of having drawn the boundaries from the start: the move becomes not a frightening migration but a decision whose turn has come.
A microservice is not a goal, it’s a bill. You should have a reason to pay that bill — and that reason should show up on a graph, not in a blog post.
Wait for the signal; let the decision come from measurement, not fashion.