In Progress
Unit 1, Lesson 21
In Progress

More Coincidental Duplication

When should you extract extract out apparently duplicated logic? Do you factor out every bit of apparent repetition as soon as you see it? Or do you adhere to the old adage, “if it’s not broken, don’t fix it”? In this, our third foray into the concept of “coincidental duplication”, you’ll learn a simple and effective guideline for when to extract duplicated code.

Video transcript & code

The topic of coincidental duplication is one we've tackled a couple of times in the past. But I've never been completely satisfied with the examples I found to demonstrate it. Recently Bryce Senz heard about me talking about this subject on Ruby Rogues, and he sent in a perfect example. Today I'd like to share it with you.

Say we are writing code for a business that accepts payments in various forms. We have a method called process_payment that does exactly what its name suggests.

It receives a payment object as a parameter. It first saves a record of the payment. Then it checks to see if the payment is using a credit card. If so, it triggers code to interact with a credit card processor service. Likewise, it also has special processing for bank charges. Cash charges, on the other hand, have no external processor to be invoked.

Once the charge is processed, the method checks to see if it was approved. If so, it runs through a checklist of actions: send a receipt, do accounting, do reporting, and send any internal notifications to other interested objects.

def process_payment(payment)
  save_payment_record(payment)
  if payment.card_payment?
    process_payment_with_card_processor(payment)
  elsif payment.bank_payment?
    process_payment_with_bank_process(payment)
  else
    # no need to process anything for cash, money order, etc.
  end

  if payment.approved?
    send_receipt(payment)
    do_accounting(payment)
    do_reporting(payment)
    do_internal_notifications(payment)
  end
end

The shape of this code, and particularly the way it switches on the type of the transaction, suggests a classic refactoring. We decide to replace the conditional statement with polymorphism, by pushing functionality down into the payment object, and breaking payments into three types: card payments, bank payments, and cash payments.

Each payment type has its own process method that performs the appropriate actions for that type of transaction.

class CardPayment
  def process
    save_record
    process_with_card_processor
    send_receipt
    if approved?
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

class BankPayment
  def process
    save_record
    process_with_bank_processor
    send_receipt
    if approved?
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

class CashPayment
  def process
    save_record
    send_receipt
    if approved?
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

As we are completing this refactoring, we notice something else: the last several lines of each of our #process methods are identical.

We decide to do something about that. We add a superclass for all of the payment types called Payment. We give this class responsibility for receiving the initial #process message. We make it into a template method: it delegates to a specialized #do_process method for the type-specific processing. Then it performs the post-processing steps that are common to all payments.

This version appeals to our sense separation of concerns, because we've pushed out just the parts that differ into subclasses, and kept the parts that stay the same in one place.

class Payment
  def process
    save_record
    do_process
    if approved?
      send_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

class CardPayment < Payment
  def do_process
    process_with_card_processor
  end
end

class BankPayment
  def do_process
    process_with_bank_processor
  end
end

class CashPayment
  def do_process
    # NOOP
  end
end

But as time goes by, we start to run into some trouble with this design. First our lawyer comes along and tells us that we have to send different types of receipts for credit, bank, and cash transactions. This is part of the payment post-processing that we standardized in the superclass, so we add a new conditional to that section.

Then we get a requirement that for bank transactions, we need to queue up a job for five days later, confirming that the check cleared.

Next we realize that if the customer has recurring payments set up and a payment fails, we need to cancel the recurring payments. This isn't applicable for cash payments, so we have to add a check for that as well.

class Payment
  def process
    save_record
    do_process
    if approved?
      if card_payment?
        send_card_receipt
      elsif bank_payment?
        send_bank_receipt?
        queue_job_to_confirm_check_not_bounced_in_five_days
      else
        send_cash_receipt?
      end
      do_accounting
      do_reporting
      do_internal_notifications
    elsif declined? && !cash_payment? && recurring?
      disable_recurring_payments
    end
  end
end

class CardPayment < Payment
  def do_process
    process_with_card_processor
  end
end

class BankPayment
  def do_process
    process_with_bank_processor
  end
end

class CashPayment
  def do_process
    # NOOP
  end
end

This code is no longer nearly as satisfactory. We could tackle this proliferation of conditionals by breaking the top-level template method down into ever finer-grained polymorphic steps. But at that point we have to ask ourselves if we're still communicating business processes clearly, or if we're just endlessly factoring to avoid conditionals.

class Payment
  def process
    save_record
    do_process
    if approved?
      do_send_receipt
      do_job_queueing
      do_accounting
      do_reporting
      do_internal_notifications
    elsif declined? && recurring?
      do_decline_handling
    end
  end
end

Instead, let's back up to when we first broke payments apart into separate types. At that point, we had what looked like a lot of duplication from one #process method to another. What we didn't realize at the time was that this was coincidental duplication. The fact that the latter half of payment processing happened to be the same for all payment types at that point in time was a coincidence. It was not an essential feature of the problem domain. As we saw later on, this duplication gradually broke down as we improved our understanding of the business needs.

class CardPayment
  def process
    save_record
    process_with_card_processor
    if approved?
      send_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

class BankPayment
  def process
    save_record
    process_with_bank_processor
    if approved?
      send_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

class CashPayment
  def process
    save_record
    if approved?
      send_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

So what should we have done here? Is there some crystal ball we should have consulted in order to find out whether this was coincidental duplication?

The answer is no, there is no crystal ball. What caused our trouble was that we were over-eager to refactor when we saw what appeared to be duplication.

How could we have avoided this? The key is in what circumstances we used as a cue for refactoring. Instead of saying, "this code looks the same, let's factor it out", we could have held off and allowed the apparent duplication.

But then when do we go ahead and factor out duplication? The trick is to wait until we are about to make a change, and we find ourselves making a mental note: "if we make this change in class A, we need to remember to also make it in class B and C".

As soon as we catch ourselves making this mental note, we should stop, factor out the duplication, and then make the change in one place. But no sooner. The "don't repeat yourself" principle is all about duplicated knowledge, not duplicated code. Until we find ourselves wanting to make the same change to two different parts of the code, we shouldn't assume that similar code represents the same piece of knowledge. The rule set for cash payment is a different piece of knowledge than the rule set for credit card payment, even if they happen to look the same at one point in development.

When we distribute our latter-day understanding across our various payment classes, we get code that looks like this. Parts of it are repeated, and parts of it differ. We still might find that we want to extract similar segments of it out into common, shared methods at some point down the road. But we should wait to those refactorings until the need to make repetitive changes drives us. We needn't jump the gun and factor out every bit of apparent repetition as soon as we see it.

class CardPayment
  def process
    save_record
    process_with_card_processor
    if approved?
      send_card_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    elsif payment.declined? && payment.recurring?
      disable_recurring_payments
    end
  end
end

class BankPayment
  def process
    save_record
    process_with_bank_processor
    if approved?
      send_bank_receipt
      queue_job_to_confirm_check_not_bounced_in_five_days
      do_accounting
      do_reporting
      do_internal_notifications
    elsif payment.declined? && payment.recurring?
      disable_recurring_payments
    end
  end
end

class CashPayment
  def process
    save_record
    if approved?
      send_cash_receipt
      do_accounting
      do_reporting
      do_internal_notifications
    end
  end
end

And that's it for today. Thanks again to Bryce Senz for supplying the inspiration for this example. Happy hacking!

Responses