Splitting an app into microservices can allow teams to iterate quickly on features without blocking each other. But microservices bring new challenges, such as committing consistent updates across multiple services.
In a distributed world, *preventing* failure is not an option. Instead, we need strategies for *mitigating* failure. In today’s episode, guest chef Andrzej Krzywda joins us to demonstrate one such strategy: the *Saga* pattern. Enjoy!
Video transcript & code
It's very typical for any application to start with a simple model layer. Just some basic create, read, update, delete actions. For instance, here’s a controller action to create a new holiday directly from some request parameters.
def create
  Holiday.create(params)
  redirect_to holidays_path
end
Later, some user requests may create more than one database record, so we use some kind of service object to wrap such scenarios. By “service object”, we mean an object that exists to encapsulate a sequential procedure.
class HolidayService
  def call
    Holiday.create(params)
    Calendar.update(params)
  end
end
For instance, here we have a HolidayService that packages holiday creation and a calendar update into a single call method.
It's usually at this level that we wrap the code in a transaction.
class HolidayService
  def call
    DB.Transaction.new do
      Holiday.create(params)
      Calendar.update(params)
    end
  end
end
Problems with distributed transactions
Over time, more requirements arrive. Sometimes it's an integration with a third-party service, using API requests.
In recent years, it's become popular to extract responsibility for some of our business entities into their own microservices. This means we need to call the newly extracted microservices.
What was previously only a local DB transaction now becomes a distributed transaction.
You can't deal with a distributed transaction the same way as a local transaction: there's no "magic flag" to make distributed transactions just work.
class HolidayService
  # one big transaction
  def call
    DB.Transaction.new do
      Http.post(ENV['HOTEL_URL'], hotel_params)
      Http.post(ENV['FLIGHT_URL'], flight_params)
      Http.post(ENV['CAR_URL'], car_params)
      Holiday.create(hotel_params, flight_params, car_params)
    end
  end
end
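To make the problem concrete, here is a minimal in-memory sketch of why wrapping remote calls in a local transaction does not help. FakeDB, FakeRemoteService, and book_holiday are hypothetical stand-ins invented for this illustration, not part of any real library. The local write is rolled back when the flight booking fails, but the hotel booking made over the "network" stays in place.

```ruby
# Hypothetical stand-in for a remote service reached over HTTP.
class FakeRemoteService
  attr_reader :bookings

  def initialize(fail_on_book: false)
    @bookings = []
    @fail_on_book = fail_on_book
  end

  # Simulates a POST that has a lasting effect on another system.
  def book(params)
    raise "remote booking failed" if @fail_on_book
    @bookings << params
  end
end

# Hypothetical stand-in for a local database with transactions.
class FakeDB
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Restores the rows to their earlier state if the block raises.
  def transaction
    snapshot = @rows.dup
    yield
  rescue StandardError
    @rows = snapshot
    raise
  end
end

def book_holiday(db, hotel, flight)
  db.transaction do
    db.rows << :holiday_draft  # local write, rolled back on failure
    hotel.book(nights: 3)      # remote side effect, outside the DB's control
    flight.book(seat: "12A")   # may raise, triggering the local rollback
  end
  :ok
rescue StandardError
  :failed
end
```

If the flight service is created with fail_on_book: true, book_holiday returns :failed and the database is empty again, yet the hotel service still holds a booking that nothing has cancelled.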
Inconsistencies - lack of transactions
Because of these complexities, some teams opt to simply avoid transactions. They decide it's acceptable to have some inconsistencies… and then "fix" them later with nightly or hourly scripts that bring the consistency back. It's rare, though, that the business can accept this as normal behaviour.
class HolidayService_2
  # no transaction
  def call
    Http.post(ENV['HOTEL_URL'], hotel_params)
    Http.post(ENV['FLIGHT_URL'], flight_params)
    Http.post(ENV['CAR_URL'], car_params)
    Holiday.create(hotel_params, flight_params, car_params)
  end
end
The saga pattern - introduction
The saga pattern aims to solve those problems.
The concept was first published in 1987 by Hector Garcia-Molina and Kenneth Salem in the "Sagas" paper.
In this paper, they define the concept of the LLT, or long-lived transaction. In short, they conclude that there is no general solution to the problems of LLTs. However, they suggest that:
"for specific applications it may be possible to alleviate the problems by relaxing the requirement that an LLT be executed as an atomic action. In other words, without sacrificing the consistency of the database, it may be possible for certain LLTs to release their resources before they complete, thus permitting other waiting transactions to proceed."
We can define the term saga as an LLT that can be broken up into a collection of subtransactions that can be interleaved in any way with other transactions. Each saga transaction should be provided with a compensating transaction. The compensating transaction undoes, from a semantic point of view, any of the actions performed by the actual transaction.
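One way to picture this definition is a small runner that pairs each subtransaction with its compensating transaction. The sketch below is only an illustration under assumptions: the Saga class, its step and run methods, and the use of lambdas are all hypothetical names chosen here, not an API from the paper or from any library. When a step raises, the compensations of the steps that already completed are run in reverse order.

```ruby
# A hypothetical, minimal saga runner; all names are illustrative.
class Saga
  Step = Struct.new(:name, :action, :compensation)

  def initialize
    @steps = []
  end

  # Registers a subtransaction together with its compensating transaction.
  # Returns self so that calls can be chained.
  def step(name, action:, compensation:)
    @steps << Step.new(name, action, compensation)
    self
  end

  # Runs the steps in order. If one raises, compensates the steps that
  # already completed, most recent first, and then re-raises the error.
  def run
    completed = []
    @steps.each do |current|
      current.action.call
      completed << current
    end
    true
  rescue StandardError
    completed.reverse_each { |done| done.compensation.call }
    raise
  end
end
```

A step that fails is not itself compensated in this sketch, on the assumption that a failed subtransaction left nothing behind to undo.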
Let’s look at an example. Here we have a new Holiday class. It is initialized with a hotel, a flight, and a rental car.
The book method is where the action is. In order to book a holiday, the hotel, flight, and car each need to be booked. All of these invoke external services, and the requests might fail.
If the hotel booking fails, then we don’t need to perform any cleanup at all.
If the flight booking fails, we need to compensate for the inconsistency. We go back and cancel any previous steps, which at this point is just the hotel. Each of the cancel methods is another small transaction plus an API call.
In the pessimistic scenario, we book the hotel, we book the flight, but we fail at booking the car. Now we need to compensate for two steps: the hotel and flight bookings. We have to cancel them both.
class Holiday
  def initialize(hotel, flight, car)
    @hotel  = hotel
    @flight = flight
    @car    = car
  end

  def book
    @hotel.book
    @flight.book
    @car.book
  rescue HotelBookingFailed
    # nothing was booked yet, so there is nothing to clean up
  rescue FlightBookingFailed
    @hotel.cancel
  rescue CarNotAvailable
    @flight.cancel
    @hotel.cancel
  end
end
This is the Saga pattern in a nutshell.
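To watch the compensations fire, the Holiday class can be exercised with stub collaborators. In this sketch the three exception classes are defined by assumption, since the transcript references them without showing their definitions, and StubService is a hypothetical test double that records the calls it receives. The Holiday class is repeated so that the example runs on its own.

```ruby
# Assumed definitions; the transcript names these errors but does not define them.
HotelBookingFailed  = Class.new(StandardError)
FlightBookingFailed = Class.new(StandardError)
CarNotAvailable     = Class.new(StandardError)

# Hypothetical stub for a client of an external booking service.
class StubService
  attr_reader :calls

  # Pass an exception class to make #book fail with it.
  def initialize(failure = nil)
    @failure = failure
    @calls = []
  end

  def book
    raise @failure if @failure
    @calls << :book
  end

  def cancel
    @calls << :cancel
  end
end

class Holiday
  def initialize(hotel, flight, car)
    @hotel  = hotel
    @flight = flight
    @car    = car
  end

  def book
    @hotel.book
    @flight.book
    @car.book
  rescue HotelBookingFailed
    # nothing was booked yet, so there is nothing to clean up
  rescue FlightBookingFailed
    @hotel.cancel
  rescue CarNotAvailable
    @flight.cancel
    @hotel.cancel
  end
end

# The pessimistic scenario: hotel and flight succeed, the car fails.
hotel  = StubService.new
flight = StubService.new
car    = StubService.new(CarNotAvailable)
Holiday.new(hotel, flight, car).book
```

After this run, both hotel.calls and flight.calls contain a :book followed by a :cancel, while car.calls stays empty because the car was never booked.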
In conclusion: There is no such thing as a distributed transaction. But sometimes we find a need to make our code act as if such transactions exist. One way to do this is with Sagas: a way to package a series of steps along with the compensating actions that will be needed if one of the steps fails.
Sometimes the compensation might be as straightforward as canceling or rolling-back a transaction. Other actions, like sending an email, can’t be “taken back”, and we need a different sort of compensation… like sending a new email that says “please ignore that last email”! The point is that we embrace the possibility of failures, and make plans for how to address them.
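As a rough illustration of a compensation that cannot literally undo its action, here is a sketch of an email step. FakeMailer and ConfirmationEmailStep are hypothetical names made up for this example, and the subject lines are placeholders. A real application would call its own mail library where the fake mailer appends to an in-memory outbox.

```ruby
# Hypothetical in-memory mailer used only for this illustration.
class FakeMailer
  attr_reader :outbox

  def initialize
    @outbox = []
  end

  def deliver(to:, subject:)
    @outbox << { to: to, subject: subject }
  end
end

# A saga step whose compensation is a follow-up message, because a
# delivered email cannot be taken back.
class ConfirmationEmailStep
  def initialize(mailer, recipient)
    @mailer = mailer
    @recipient = recipient
  end

  def call
    @mailer.deliver(to: @recipient, subject: "Your holiday is booked")
  end

  def compensate
    @mailer.deliver(to: @recipient, subject: "Please ignore our previous email")
  end
end
```

Compensating here adds a second message to the outbox instead of removing the first, which is what "undoing from a semantic point of view" means for an action with no rollback.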