I’m talking about user interactions, not deployments.
In a monolith with a transactional data store, you can have a nice and clean atomic state transition from one complete, valid state to the next in a single request/response.
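For contrast, here's roughly what that clean atomic transition looks like, sketched with in-process SQLite (the table and amounts are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])

# One atomic state transition: both updates commit together, or neither does.
with conn:
    conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 10 WHERE id = 2")

print(conn.execute("SELECT id, balance FROM accounts").fetchall())
```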
With a distributed system, you’ll often have scenarios where the component which receives the initial request can’t guarantee the final state of the system by the time it needs to produce a response.
If it did, it would spend most of its effort orchestrating other components. That would couple them together and be no more useful than a monolith, just with new and exciting failure modes. So really the best it can do is tell the client “Here’s a token you can use to check back on the state of this operation later”.
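A minimal sketch of that "token to check back later" pattern, assuming an in-memory registry and illustrative names (a real service would persist the registry and expose these as endpoints):

```python
import threading
import time
import uuid

# Illustrative in-memory registry; a real service would persist this
# so the token survives restarts.
operations = {}

def start_operation(work_fn, *args):
    """Accept the request, kick off the work, and hand back a token
    immediately instead of guaranteeing the final state."""
    op_id = str(uuid.uuid4())
    operations[op_id] = {"status": "pending"}
    threading.Thread(target=_run, args=(op_id, work_fn, args)).start()
    return op_id  # the client polls with this

def _run(op_id, work_fn, args):
    try:
        operations[op_id] = {"status": "done", "result": work_fn(*args)}
    except Exception as exc:
        operations[op_id] = {"status": "failed", "error": str(exc)}

def check_operation(op_id):
    """The 'check back on the state of this operation later' endpoint."""
    return operations.get(op_id, {"status": "unknown"})

# The work function stands in for calls out to downstream services.
token = start_operation(lambda: time.sleep(0.1) or "done downstream")
print(check_operation(token))  # {'status': 'pending'} at first
```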
And because data is often partitioned between different services, you can end up having partially-applied state changes. This leaves the data in an otherwise-invalid state, which must be accounted for – simply because of an implementation detail, not because it’s semantically meaningful to the client.
In operations that have irreversible or non-idempotent external side-effects, this can be especially difficult to manage. You may want to allow the client to resume from immediately before or after the side-effect if there is a failure later on. Or you may want to schedule the side-effect, from the perspective of an earlier component in the chain, so that it happens even if a middle component fails (like the equivalent of a catch or finally block).
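One common way to make the "resume immediately before or after the side-effect" part workable is to durably record intent around the call and lean on an idempotency key. A sketch, with both the store and the payment provider stubbed out as assumptions:

```python
# Durably record intent around a non-idempotent external call, so a
# retry can tell whether the failure happened before or after the side
# effect. 'store' stands in for a durable DB; 'charge' for an external
# provider that honors idempotency keys. Both are made up for the sketch.
store = {}

def charge(amount, idempotency_key):
    # Pretend provider: same key, same charge, exactly once.
    return f"charge-{idempotency_key}"

def charge_order(order_id, amount):
    record = store.get(order_id)
    if record and record["status"] == "charged":
        return record["charge_id"]            # resume *after* the side effect
    store[order_id] = {"status": "charging"}  # durable intent, *before* it
    charge_id = charge(amount, idempotency_key=order_id)
    store[order_id] = {"status": "charged", "charge_id": charge_id}
    return charge_id

print(charge_order("order-42", 999))  # safe to call again on retry
```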
If you try to cut corners by representing these things as special cases where the later components send data back to earlier ones, you end up introducing cycles in the data flow of your microservices. And then you’re in for a world of hurt. It’s better if you can represent it as a finite state machine, from the perspective of some coordinator component that’s not part of the data flow itself. But that’s a ton of work.
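The core of that coordinator is usually something this small, even though everything around it isn't. A sketch with invented states:

```python
# The coordinator's view of one operation as a finite state machine,
# instead of implicit state scattered across services. States and
# transitions here are invented for the sketch.
TRANSITIONS = {
    "received": {"reserved", "rejected"},
    "reserved": {"charged", "released"},
    "charged":  {"shipped", "refunded"},
    # terminal states
    "rejected": set(), "released": set(),
    "shipped":  set(), "refunded": set(),
}

def advance(current, nxt):
    """Refuse any transition the machine doesn't allow."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {nxt}")
    return nxt

state = advance("received", "reserved")  # ok
state = advance(state, "charged")        # ok
# advance(state, "received")             # would raise: no cycling back
```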
It complicates every service that deals with it, and even just managing the data stores that track the state gets really messy. And if you have queues and batching and throttling and everything else, along with granular permissions… Things can break. And they can break in really horrible ways, like infinitely sending the same data to an external service because the components keep tossing an event back to each other.
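A cheap partial defense against that particular ping-pong failure is deduplicating on event ID before doing anything else. Sketch only; in practice `seen` would be a durable store with a TTL, and `transform` is a stand-in for whatever the service actually does:

```python
# Drop anything you've already handled instead of re-emitting it.
seen = set()

def handle(event, emit):
    if event["id"] in seen:
        return  # duplicate: swallow it, don't bounce it back
    seen.add(event["id"])
    emit({"id": event["id"], "payload": transform(event["payload"])})

def transform(payload):
    return payload  # placeholder for the service's real work

handle({"id": "evt-1", "payload": 42}, emit=print)  # processed
handle({"id": "evt-1", "payload": 42}, emit=print)  # dropped
```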
There are general patterns – like sagas, distributed transactions, and event-sourcing – which can… kind of ease this problem. But they’re fundamentally limited by the CAP Theorem. And there isn’t a universally-accepted clean way to implement them, so you’re pretty much doing it from scratch each time.
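For reference, the skeleton of a saga is tiny; all the hard parts live in making the compensations safe and the progress durable. A bare-bones sketch:

```python
# Run the steps in order; if one fails, run the compensations for
# everything that already succeeded, in reverse. Real implementations
# persist progress so a crashed coordinator can resume; this keeps it
# in memory, and the steps here are invented.
def run_saga(steps):
    compensations = []
    try:
        for action, compensate in steps:
            action()
            compensations.append(compensate)
    except Exception:
        for compensate in reversed(compensations):
            compensate()  # best-effort undo; these must be idempotent
        raise

run_saga([
    (lambda: print("reserve stock"),   lambda: print("release stock")),
    (lambda: print("charge card"),     lambda: print("refund card")),
    (lambda: print("create shipment"), lambda: print("cancel shipment")),
])
```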
Don’t get me wrong. Sometimes “Here’s a token to check back later” and modeling interactions as a finite state machine rather than an all-or-nothing is the right call. Some interactions should work that way. But you should build them that way on purpose, not to work around the downsides of a cool buzzword you decided to play around with.
I struggle to think of any code with real-world applications that doesn’t have to make CAP tradeoffs. The reason that often doesn’t seem to be the case is willful ignorance.
The only way you can have ACID in a monolith is if the data store lives in the same memory as the running code and the code isn't interacting with any external system.
Typically the data store is external, though, and yes, you can have atomic state transitions in the data store, but that doesn't translate to your code. That's CAP right there: your code and your data running on two distinct systems.
I mean, I may be a novice on this, but a multi-state service in general sounds like a bad time. Isn't the accepted best practice for a microservice to keep its operations stateless, with at most one piece of state per service?
I mean, it's true for anything beyond a single-threaded monolith, right? Otherwise you just get apps that pretend to be asynchronous while waiting on locks, so they act totally synchronously.