In Progress
Unit 1, Lesson 1
In Progress

Intention Revealing Message

Video transcript & code

In episode 100 I wrote a method called #content_post_list which scraped the DPD website for a list of RubyTapas episodes. The final implementation of this method turned out to be quite large, and I'd like to refactor it a bit for that reason alone. But there's one line of the method that bothers me even more than the overall size of it.

module DPD
  class ContentPostGateway
    def initialize(login, password)
      @login    = login
      @password = password
    end

    def content_post_list
      agent         = Mechanize.new
      login_page    = agent.get('https://getdpd.com/login')
      form          = login_page.form_with(action: '/login')
      form.username = @login
      form.password = @password
      home_page     = agent.submit(form)
      unless home_page.title =~ /^Dashboard/
        raise "DPD admin session login failed for user #{login}"
      end
      list_page = agent.get('https://getdpd.com/plan')
      content_post_table = list_page.search('table').detect { |t|
        headings = t.search('th').map(&:text)
        headings == ['Name', 'Release Date']
      }
      content_post_table
      content_post_rows  = content_post_table.search('tbody tr')
      content_post_rows[0..-2].reverse_each.map { |row|
        columns      = row.search('td')
        title        = columns[0].text.strip
        published_at = columns[1].text.strip
        show_path    = columns[0].at('a')['href']
        show_url     = URI.join('https://getdpd.com', show_path)
        id           = show_url.query[/post_id=(\d+)/, 1]
        {
            title:        title,
            published_at: Time.parse(published_at),
            show_url:     show_url.to_s,
            id:           id.to_i
        }
      }
    end
  end
end

It's this line here that bugs me:

content_post_rows[0..-2].reverse_each do.map { |row|
  # ...

Encoded in this line is some very special, detailed knowledge of the structure of the DPD subscription content post listing. There are two obscure pieces of knowledge in particular:

  1. That the last row of the table contains buttons, not an episode summary, and so it should be excluded.
  2. …and that the list is in reverse-chronological order.

I gathered these facts by careful trial and error. But rather than revealing what I learned in the code, I've hidden the knowledge. No one coming along after the fact and reading this code could be expected to understand why the rows are sliced and reversed.

Of course, I could add some comments explaining the "why" of this code. But comments are notoriously fragile things; because it's so easy to update the code while neglecting to update the comments, they are often lies waiting to happen.

Instead, I'm going to apply the "Intention Revealing Message" pattern to this code. This pattern is described by Kent Beck in Smalltalk best Practice Patterns.

I start with the array slicing. I write a new method, #select_content_post_data_rows, which receives a rows parameter. Inside, it simply performs the slice, selecting all but the last row.

def select_content_post_data_rows(rows)
  rows[0..-2]
end

I add a call to this method above the original code, assign the result to a variable, and then replace the old array slice with the new variable.

data_rows = select_content_post_data_rows(content_post_rows)
data_rows.reverse_each do.map { |row|
  # ...

I then repeat this procedure for the array reversal. I write a new method, #order_content_post_rows_chronologically, which receives a rows parameter and returns a reversed enumerator for it.

def order_content_post_rows_chronologically(rows)
  rows.reverse_each
end

Once again, I call this method, assign the result to a variable, and then substitute the new variable into the original code.

data_rows = select_content_post_data_rows(content_post_rows)
chronological_rows = order_content_post_rows_chronologically(data_rows)
chronological_rows.reverse_each do.map { |row|
  # ...

I've made my original method longer, not shorter with this change. And from the point of view of the computer, this new version does the exact same thing as the original.

But from the reader's perspective it's a very different story. It is now very clear what's going on here: first, we select only the rows with post data in them. Then, we order the rows chronologically.

This may seem like a lot of work for a small gain, but I promise you that future maintainers of your code will thank you for it. And you can substantially reduce the effort involved by either using a refactoring IDE like RubyMine, which has the "extract method" refactoring built-in; or by writing a macro in your editor of choice.

That's all for today. Happy hacking!

Responses