More Coincidental Duplication
When should you extract extract out apparently duplicated logic? Do you factor out every bit of apparent repetition as soon as you see it? Or do you adhere to the old adage, “if it’s not broken, don’t fix it”? In this, our third foray into the concept of “coincidental duplication”, you’ll learn a simple and effective guideline for when to extract duplicated code.
Video transcript & code
The topic of coincidental duplication is one we've tackled a couple of times in the past. But I've never been completely satisfied with the examples I found to demonstrate it. Recently Bryce Senz heard about me talking about this subject on Ruby Rogues, and he sent in a perfect example. Today I'd like to share it with you.
Say we are writing code for a business that accepts payments in various forms. We have a method called
process_payment that does exactly what its name suggests.
It receives a payment object as a parameter. It first saves a record of the payment. Then it checks to see if the payment is using a credit card. If so, it triggers code to interact with a credit card processor service. Likewise, it also has special processing for bank charges. Cash charges, on the other hand, have no external processor to be invoked.
Once the charge is processed, the method checks to see if it was approved. If so, it runs through a checklist of actions: send a receipt, do accounting, do reporting, and send any internal notifications to other interested objects.
def process_payment(payment) save_payment_record(payment) if payment.card_payment? process_payment_with_card_processor(payment) elsif payment.bank_payment? process_payment_with_bank_process(payment) else # no need to process anything for cash, money order, etc. end if payment.approved? send_receipt(payment) do_accounting(payment) do_reporting(payment) do_internal_notifications(payment) end end
The shape of this code, and particularly the way it switches on the type of the transaction, suggests a classic refactoring. We decide to replace the conditional statement with polymorphism, by pushing functionality down into the payment object, and breaking payments into three types: card payments, bank payments, and cash payments.
Each payment type has its own process method that performs the appropriate actions for that type of transaction.
class CardPayment def process save_record process_with_card_processor send_receipt if approved? do_accounting do_reporting do_internal_notifications end end end class BankPayment def process save_record process_with_bank_processor send_receipt if approved? do_accounting do_reporting do_internal_notifications end end end class CashPayment def process save_record send_receipt if approved? do_accounting do_reporting do_internal_notifications end end end
As we are completing this refactoring, we notice something else: the last several lines of each of our
#process methods are identical.
We decide to do something about that. We add a superclass for all of the payment types called
Payment. We give this class responsibility for receiving the initial
#process message. We make it into a template method: it delegates to a specialized
#do_process method for the type-specific processing. Then it performs the post-processing steps that are common to all payments.
This version appeals to our sense separation of concerns, because we've pushed out just the parts that differ into subclasses, and kept the parts that stay the same in one place.
class Payment def process save_record do_process if approved? send_receipt do_accounting do_reporting do_internal_notifications end end end class CardPayment < Payment def do_process process_with_card_processor end end class BankPayment def do_process process_with_bank_processor end end class CashPayment def do_process # NOOP end end
But as time goes by, we start to run into some trouble with this design. First our lawyer comes along and tells us that we have to send different types of receipts for credit, bank, and cash transactions. This is part of the payment post-processing that we standardized in the superclass, so we add a new conditional to that section.
Then we get a requirement that for bank transactions, we need to queue up a job for five days later, confirming that the check cleared.
Next we realize that if the customer has recurring payments set up and a payment fails, we need to cancel the recurring payments. This isn't applicable for cash payments, so we have to add a check for that as well.
class Payment def process save_record do_process if approved? if card_payment? send_card_receipt elsif bank_payment? send_bank_receipt? queue_job_to_confirm_check_not_bounced_in_five_days else send_cash_receipt? end do_accounting do_reporting do_internal_notifications elsif declined? && !cash_payment? && recurring? disable_recurring_payments end end end class CardPayment < Payment def do_process process_with_card_processor end end class BankPayment def do_process process_with_bank_processor end end class CashPayment def do_process # NOOP end end
This code is no longer nearly as satisfactory. We could tackle this proliferation of conditionals by breaking the top-level template method down into ever finer-grained polymorphic steps. But at that point we have to ask ourselves if we're still communicating business processes clearly, or if we're just endlessly factoring to avoid conditionals.
class Payment def process save_record do_process if approved? do_send_receipt do_job_queueing do_accounting do_reporting do_internal_notifications elsif declined? && recurring? do_decline_handling end end end
Instead, let's back up to when we first broke payments apart into separate types. At that point, we had what looked like a lot of duplication from one
#process method to another. What we didn't realize at the time was that this was coincidental duplication. The fact that the latter half of payment processing happened to be the same for all payment types at that point in time was a coincidence. It was not an essential feature of the problem domain. As we saw later on, this duplication gradually broke down as we improved our understanding of the business needs.
class CardPayment def process save_record process_with_card_processor if approved? send_receipt do_accounting do_reporting do_internal_notifications end end end class BankPayment def process save_record process_with_bank_processor if approved? send_receipt do_accounting do_reporting do_internal_notifications end end end class CashPayment def process save_record if approved? send_receipt do_accounting do_reporting do_internal_notifications end end end
So what should we have done here? Is there some crystal ball we should have consulted in order to find out whether this was coincidental duplication?
The answer is no, there is no crystal ball. What caused our trouble was that we were over-eager to refactor when we saw what appeared to be duplication.
How could we have avoided this? The key is in what circumstances we used as a cue for refactoring. Instead of saying, "this code looks the same, let's factor it out", we could have held off and allowed the apparent duplication.
But then when do we go ahead and factor out duplication? The trick is to wait until we are about to make a change, and we find ourselves making a mental note: "if we make this change in class A, we need to remember to also make it in class B and C".
As soon as we catch ourselves making this mental note, we should stop, factor out the duplication, and then make the change in one place. But no sooner. The "don't repeat yourself" principle is all about duplicated knowledge, not duplicated code. Until we find ourselves wanting to make the same change to two different parts of the code, we shouldn't assume that similar code represents the same piece of knowledge. The rule set for cash payment is a different piece of knowledge than the rule set for credit card payment, even if they happen to look the same at one point in development.
When we distribute our latter-day understanding across our various payment classes, we get code that looks like this. Parts of it are repeated, and parts of it differ. We still might find that we want to extract similar segments of it out into common, shared methods at some point down the road. But we should wait to those refactorings until the need to make repetitive changes drives us. We needn't jump the gun and factor out every bit of apparent repetition as soon as we see it.
class CardPayment def process save_record process_with_card_processor if approved? send_card_receipt do_accounting do_reporting do_internal_notifications elsif payment.declined? && payment.recurring? disable_recurring_payments end end end class BankPayment def process save_record process_with_bank_processor if approved? send_bank_receipt queue_job_to_confirm_check_not_bounced_in_five_days do_accounting do_reporting do_internal_notifications elsif payment.declined? && payment.recurring? disable_recurring_payments end end end class CashPayment def process save_record if approved? send_cash_receipt do_accounting do_reporting do_internal_notifications end end end
And that's it for today. Thanks again to Bryce Senz for supplying the inspiration for this example. Happy hacking!