Self Save Part 1

Back in episode #331, as we were discussing the concept of "process objects", I showed this code:

post "/purchase_monkeys" do
  card_info = params[:card_info]
  quantity  = params[:quantity].to_i
  user      = current_user
  purchase = MonkeyPurchase.new(
    user:      user,
    card_info: card_info,
    quantity:  quantity)
  purchase.submitted
  purchase.save!
  "Your purchase is pending approval"
end

As you might recall, we were writing code for an online store called Instant Monkeys Online. Here, a MonkeyPurchase object is constructed to represent the process of purchasing monkeys.

Then, the object is informed that the purchase has officially been submitted, triggering some domain logic.

Finally, we save the new state of the MonkeyPurchase.

I received one question over and over again about this code: why did I have the controller action save the domain object? Why didn't I have it save itself, as part of the submitted method?

As it happens, there is a reason for this choice. Having an object save itself in response to business domain activities can lead to a variety of undesirable and difficult-to-diagnose problems down the road.

In order to demonstrate the issues with domain model self-saving, I searched my own memory and gathered the experiences of a number of other developers. From that research, I've boiled down seven scenarios that illustrate the kinds of bugs that can crop up as a result of this practice.

A few caveats before we begin:

First, we have a lot of scenarios to get through, and despite my best efforts to simplify, they are all fairly involved. As a result, this is going to be the start of a miniseries, so you don't have to sit through a twenty-minute episode.

I felt that it was important to show a wide array of examples, in order to give you the clearest possible picture of the variety of ways self-saving can introduce problems. By the end of this miniseries, hopefully you'll be able to see how it's not just a matter of one or two "gotchas" to avoid. Rather, self-saving is a sign of a fundamental flaw in object design.

Second: while some of the examples I'm going to show you may seem contrived, they are all based on actual events. They are derived either from my own experiences or from the experiences of other developers. The following examples are intended to represent the world of code as it really exists, not as it theoretically "ought" to be.

Before we get into the scenarios proper, let's quickly talk about an objection to self-saving that doesn't require any special demo code. It's this: if we let the object save itself, that implicitly means we are tying this business model to the persistence library. In all the following examples, that library will be ActiveRecord. So right from the outset, by making this choice we're forcing ourselves to load ActiveRecord and set up a database just to be able to run our unit tests.

A lot has already been written about the value of code that can be isolated from database and framework dependencies, especially in the context of testing. I'm not going to belabor that point now. I only want to point out that simply by having these objects save themselves, we're choosing to forgo the advantages of minimal dependencies and ultra-fast tests.
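To make the dependency point concrete, here is a minimal sketch, not taken from the episode, of what a persistence-free version of a process object could look like. The class name PlainMonkeyPurchase, its constructor keywords, and the sample customer are assumptions made up for illustration; the state names mirror the ones used later in this episode's examples.

```ruby
# Hypothetical plain-Ruby process object: no ActiveRecord, no database.
# (Illustrative only; this class does not appear in the episode.)
class PlainMonkeyPurchase
  attr_reader :quantity, :customer, :state

  def initialize(quantity: 0, customer: nil)
    @quantity = quantity
    @customer = customer
    @state    = nil
  end

  # Each event only updates in-memory state.
  # Deciding when to persist is left to the caller.
  def submitted
    @state = "awaiting_waiver"
  end

  def waived
    @state = "awaiting_approval"
  end
end

purchase = PlainMonkeyPurchase.new(quantity: 3, customer: "Alice")
purchase.submitted
purchase.waived
puts purchase.state
```

Nothing here requires loading ActiveRecord or creating a table, so a unit test for these transitions needs only plain Ruby.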

Now let's move on and talk about some of the less obvious objections. We'll start with another testing-related scenario.

Here's a new version of the MonkeyPurchase process object. It derives from ActiveRecord::Base.

I've overridden the save! method to make it possible to enable or disable saves with an environment switch, without changing the code. This is only there for the purpose of making it easier to quickly switch between self-saving and non-self-saving versions in these examples. It's not something you would find in real-world code.

The class contains methods representing a series of events that may happen over the lifetime of a monkey purchase process. For the sake of example, I haven't included any domain logic in any of these event handlers. Instead, each one just updates the object's state, and then saves itself.

require "active_record"

class MonkeyPurchase < ActiveRecord::Base
  establish_connection :adapter  => "sqlite3",
                       :database => ENV.fetch("DB") { ":memory:" }

  connection.create_table( :monkey_purchases ) do |t|
    t.integer    :quantity
    t.string     :customer
    t.string     :state
    t.timestamps null: false
  end

  def save!
    if ENV["SELFSAVE"] == "YES"
      super
    else
      # NOOP
    end
  end

  def submitted
    self.state = "awaiting_waiver"
    save!
  end

  def waived
    self.state = "awaiting_approval"
    save!
  end

  def approved
    self.state = "shipping"
    save!
  end

  def shipped
    self.state = "complete"
    save!
  end
end

Here's an RSpec test scenario for the MonkeyPurchase class.

In order to simulate a large, established test suite without actually writing one, I've enclosed one example inside a loop that repeats a thousand times.

Like many real-world unit tests, there's some common setup that happens in all of the examples. In this case, it consists of making a new MonkeyPurchase and then stepping it through several events.

require "./monkey_purchase"
RSpec.describe MonkeyPurchase do
  1000.times do |n|
    it "does thing ##{n}" do
      mp = MonkeyPurchase.new
      mp.submitted
      mp.waived
      mp.approved
      mp.shipped
    end
  end
end

Let's run this spec file.

rspec perf_spec.rb

1,000 examples in about half a second. Not too bad!

Now let's run it again, only this time allowing the object to save itself internally after each state transition.

SELFSAVE=YES rspec perf_spec.rb

Five seconds: by running normally, instead of with database saving artificially disabled, our test run time has increased by an order of magnitude.

But so far, we've just been running with a SQLite in-memory database. In most Rails apps, the tests are set up to use a local database file rather than an in-memory store. Let's try it with a database file instead.

DB=test.db SELFSAVE=YES rspec perf_spec.rb
rm test.db

This time it takes 25 seconds to finish the test suite, all because each test had multiple database saves as a side effect. And these tests didn't even need saved data in order to verify the behavior of the business logic.
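To get a feel for how many saves that is, here is a rough sketch that tallies them without touching a database. CountingPurchase and its save_count accessor are hypothetical stand-ins invented for this illustration; the four event methods copy the shape of the MonkeyPurchase events above, but save! only increments a counter.

```ruby
# Hypothetical stand-in for the self-saving MonkeyPurchase.
# save! increments a class-level counter instead of writing to a database.
class CountingPurchase
  class << self
    attr_accessor :save_count
  end
  self.save_count = 0

  attr_reader :state

  def save!
    self.class.save_count += 1
  end

  def submitted
    @state = "awaiting_waiver"
    save!
  end

  def waived
    @state = "awaiting_approval"
    save!
  end

  def approved
    @state = "shipping"
    save!
  end

  def shipped
    @state = "complete"
    save!
  end
end

# Same shape as the spec: 1,000 examples, four events each.
1000.times do
  mp = CountingPurchase.new
  mp.submitted
  mp.waived
  mp.approved
  mp.shipped
end

puts CountingPurchase.save_count  # prints 4000
```

Four events per example across a thousand examples comes to 4,000 save calls, none of which the examples asked for.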

This is the kind of gradually accumulated test slowdown which afflicts a lot of Rails application test suites. Left unchecked, the almost inevitable result is a test suite that is rarely run, and which as a result may not be much of a trusted asset for the team.

So now we've seen our first example of how domain models which save themselves can cause trouble. By deciding they know better than client objects when they should be persisted, they've complicated our lives when we set out to test them. We have to either come up with a way of nullifying the database in the context of these tests, or we have to deal with the long-running test suites.

Over the next few episodes, we'll go through a half-dozen more scenarios, all illustrating different and unexpected ways that a self-saving object can come back to bite us. Until the next installment, happy hacking!
