Mocking Smells 1
Video transcript & code
If you've watched any of my episodes in which test-driven development plays a role, you've probably gathered that I tend toward the "mockist" school of unit testing. I like to design my classes with the help of tests and test double objects that play the role of collaborators. Often, they stand in for objects that I haven't even created yet.
Mockist testing is the source of some controversy in the Ruby ecosystem, and not without reason. Stubs and mocks are one of those tools that, once learned, are very easy to over-use. I've definitely been guilty of over-using them in the past. It's all too easy to use stubs and mocks to construct a towering edifice of object fantasies, one which verifies nothing of interest about the system under test and yet still breaks every time the system is modified.
I thought today we might go through some stub and mock smells, and talk about how to avoid them. Remember as we do so that a "code smell" is not a guarantee of something wrong; rather, it's a hint based on past experience that something might be wrong.
Here's a test that checks the result of a calculation. It is testing a Task object. As part of the test setup, it stubs out a method on the object being tested.
RSpec.describe Task do
  # ...
  it "can determine urgency" do
    t = Task.new(priority: 1)
    allow(t).to receive(:urgency_threshold).and_return(0)
    expect(t).to be_urgent
    allow(t).to receive(:urgency_threshold).and_return(1)
    expect(t).to be_urgent
    allow(t).to receive(:urgency_threshold).and_return(2)
    expect(t).to_not be_urgent
  end
  # ...
end
This test now has "secret knowledge" about the implementation of the object it is testing. Instead of just checking outcomes, it is also interfering with the inner workings of the object.
Even in the best case, this test is at risk of falling out of step with the actual implementation of the object. If the name of the method being stubbed out changes, or if it is inlined, or if its job is factored out to another object, the test will fail for reasons that have nothing to do with the object violating its contract.
More insidiously, stubbing methods on the object under test risks turning the tests into tautologies—tests that pass only because we've forced them to pass.
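The urgent? implementation itself never appears in the transcript. Judging from the assertions, it presumably compares the task's priority against the threshold, along these lines (a sketch; the Settings module shown here and the exact comparison are assumptions):

```ruby
# Hypothetical stand-in for the app's global configuration source.
module Settings
  def self.urgency_threshold
    2
  end
end

class Task
  def initialize(priority: 1)
    @priority = priority
  end

  # Reaches out to global configuration -- the reason the test above
  # resorted to stubbing a method on the object under test.
  def urgency_threshold
    Settings.urgency_threshold
  end

  # A task is urgent when its priority meets or exceeds the threshold.
  def urgent?
    @priority >= urgency_threshold
  end
end
```

With that implementation, stubbing urgency_threshold on the task means the test is exercising little more than the `>=` operator.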
In this case, it looks like someone has stubbed out a method on the object under test because it interacts with a global source of configuration. There are a number of ways we could change this test to avoid stubbing the object under test. One way is to inject a settings object at instantiation.
We make a Task's settings injectable by adding an initializer keyword and an instance variable. The default for the keyword is the Settings module. In the urgency_threshold method, we substitute the instance variable for the direct reference to the module.
Then in the test we add a new test double to stand in for the Settings module. We pass it in to the Task, and set all of our stubs on the double instead of the actual Settings module. By making this change, we've decoupled the test from the implementation of the Task class.

class Task
  def initialize(project: nil, priority: 1, settings: Settings)
    @project = project
    @priority = priority
    @settings = settings
  end

  # ...

  def urgency_threshold
    @settings.urgency_threshold
  end

  # ...
end

RSpec.describe Task do
  # ...
  it "can determine urgency" do
    settings = double
    t = Task.new(priority: 1, settings: settings)
    allow(settings).to receive(:urgency_threshold).and_return(0)
    expect(t).to be_urgent
    allow(settings).to receive(:urgency_threshold).and_return(1)
    expect(t).to be_urgent
    allow(settings).to receive(:urgency_threshold).and_return(2)
    expect(t).to_not be_urgent
  end
  # ...
end
Moving on, here's another test. This one doesn't stub a method on the object under test. Instead, it sets a message expectation on the object being tested.
# ...
it "notifies its project of completion" do
  t = Task.new
  expect(t).to receive(:notify_observers)
  t.mark_completed
end
# ...
This is another kind of smell. The only reason to set a message expectation in a test is to verify that the object being tested interacts with the world around it in the way the designer intended. Showing that an object interacts with itself establishes nothing of interest; but it does couple the test tightly to the internals of the object.
Looking at the source code for the object being tested, it looks like the interaction that we actually care about is one where the task's project is sent a notification of the task being completed.
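That source code isn't reproduced in the transcript; a minimal mark_completed consistent with the narration might look like this (the notify_task_completed name matches the corrected test later in the episode, but the nil guard is an assumption):

```ruby
class Task
  def initialize(project: nil)
    @project = project
  end

  def completed?
    @completed
  end

  # On completion, notify the project collaborator, passing the task
  # itself along so the project knows which task finished.
  def mark_completed
    @completed = true
    @project.notify_task_completed(self) if @project
  end
end
```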
The solution here is to set expectations, not on the object being tested, but on the objects it collaborates with. We add a new test double to stand in for the Project, pass it in to the Task constructor, and assert that it is notified when the task is marked complete.
# ...
it "notifies its project of completion" do
  project = double
  t = Task.new(project: project)
  expect(project).to receive(:notify_task_completed).with(t)
  t.mark_completed
end
# ...
There's a bonus here, too: because we are making an assertion on the project collaborator, we are able to also verify that the task correctly passes itself to the project object as part of the notification. Previously we weren't testing that logic at all.
Our last example today is a test that sets up a stub for a collaborator object. Unfortunately, the collaborators being simulated here are from a third-party library.
# ...
it "posts message to social media on completion" do
  t = Task.new
  expect(Faraday).to receive(:post).with("http://example.com/", message: "Task completed!")
  t.mark_completed
end
# ...
Newcomers to mocking and stubbing frameworks often seize upon them as a way to decouple tests from slow external services. Unfortunately, this often turns out to be a trap in the long run. The first problem is that it is notoriously hard to perfectly simulate a third-party API. Here, we place an expectation on a call to a Faraday module method. But we don't bother simulating the response object this would normally return. If our code one day starts caring about the response, we may have to go back and change a whole bunch of mocking code.
One common way to use Faraday responses is to chain .on_complete callbacks onto the response object. Allowing for this possibility makes for some elaborate, and hard to follow, stubbing.
Here's a version of the code that uses an .on_complete handler to set a @bragged flag to true if the response is a success. The stubs we set up to simulate this are complicated and fragile.
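Neither that version of the code nor its stubs appear in the transcript. Here is a self-contained sketch of the shape they might take, using a hand-rolled fake in place of RSpec doubles so it can run on its own; the URL, the flag name, and the callback protocol are all assumptions:

```ruby
# Hand-rolled stand-in for Faraday's module-level API. Note how much
# of the third-party surface the test must reproduce: the .post
# method, the response object, its #success? flag, and the
# #on_complete callback protocol. Every one of these is a guess about
# Faraday's behavior that can silently drift out of date.
class FakeResponse
  def initialize(success)
    @success = success
  end

  def success?
    @success
  end

  def on_complete
    yield self
  end
end

module Faraday
  def self.post(_url, _params)
    FakeResponse.new(true)
  end
end

class Task
  def mark_completed
    @completed = true
    response = Faraday.post("http://example.com/", message: "Task completed!")
    # Only record the brag once the request completes successfully.
    response.on_complete do
      @bragged = true if response.success?
    end
  end

  def bragged?
    !!@bragged
  end
end
```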
We also have to update the global stubbing of Faraday in order to keep all other tests passing. This binds our tests even more tightly to a very specific implementation.
before do
  allow(Faraday).to receive(:post).and_return(double.as_null_object)
end
Then, once we have all of our tests passing again, we upgrade to a new version of Faraday and discover that the API has changed and we have to rewrite all of our stubs and mocks.
When the originators of mock objects wrote up their learnings after five years of experience, in a paper called "Mock Roles, Not Objects", they put one lesson front and center: only mock types you own.
To accomplish this, we introduce a new role: an object to represent a single social media brag. We make it an optional parameter to the mark_completed method. Inside that method, we use a simplified bragging interface that we control to interact with the outside world.
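The episode's exact code isn't in the transcript; a sketch of the shape it describes might look like this (the Brag class name, the task_completed! method, and the default URL are invented for illustration):

```ruby
# A thin adapter we own, wrapping the third-party Faraday call.
# This is the only place in the codebase that knows Faraday's API.
# Assumes the faraday gem is loaded when a real brag is performed.
class Brag
  def initialize(url: "http://example.com/")
    @url = url
  end

  def task_completed!
    Faraday.post(@url, message: "Task completed!")
  end
end

class Task
  # The brag collaborator is optional and injectable, so tests can
  # pass a double (or a null object) instead of touching the network.
  def mark_completed(brag: Brag.new)
    @completed = true
    brag.task_completed!
  end

  def completed?
    @completed
  end
end
```

In the tests, a double standing in for Brag replaces the global Faraday stub; a test that doesn't care about bragging can pass `brag: double.as_null_object`.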
We are able to substantially simplify our test doubles after introducing this new collaborator.
We are also able to get rid of the global stubbing of Faraday. In order to get all tests passing again, we just need to update the tests that actually exercise the #mark_completed method to pass in a null brag so that it doesn't try to hit the network.
The key practice here is to separate our business logic from thin "adapter" or "gateway" objects which encapsulate third-party libraries. The business logic is then easy to test with the adapters mocked out. Meanwhile the adapters should not be tested using test doubles at all; instead, integration tests should verify the adapters against the actual third-party libraries and external services.
We've looked at just a few mock object smells today. There are many more worth understanding, but we'll save those for future episodes.
If any of the smells we looked at today rang a bell, but the instances of them in your own projects don't seem as straightforward to fix as the ones in today's examples, feel free to get in touch. I'd be happy to try and tackle some more complex examples in a future installment.
That's it for now. Happy hacking!