Unit 1, Lesson 1

Rake Custom Task

Video transcript & code

If you've done much with the Rake utility, you know that it's great for automating file transformations. If we have a markdown file that needs to be turned into an HTML file, or a C file that needs to be compiled to a binary, Rake is great.

But sometimes the product we want to create isn't a file at all. For instance, consider the case where we have local content files which need to be posted to a blogging service.

Let's create a Rakefile for such a case.

We'll use the example of a WordPress blog, and so we'll bring in the rubypress gem to handle the details.

For simplicity, we'll just define a global WordPress client object.

require "rubypress"

WPCLIENT = Rubypress::Client.new(
  host:     ENV.fetch("WP_HOST"),
  username: ENV.fetch("WP_USERNAME"),
  password: ENV.fetch("WP_PASSWORD"),
  use_ssl:  true)

I'm not going to go into any detail on how this library is used. All you really need to know is that it exposes methods for listing, creating, and updating blog posts on a WordPress host.

Now, it would be straightforward enough to define a Rake task that uploads a new blog post based on the contents of a local HTML file.

task "blogpost" do
  WPCLIENT.newPost(
    content: {
      post_status:  "publish",
      post_content: File.read("post_content.html"),
      post_title:   "Hello World" })
end

This is pretty straightforward, even if you've never used the rubypress gem before. The only noteworthy point is that we're using the contents of a file called post_content.html as the body of the new post.

The thing is, making a rake task for this doesn't really buy us anything. We might as well have written a raw Ruby script.

The real power of Rake lies in its smart handling of dependencies. When we're working with local files, Rake is able to determine whether a file needs to be rebuilt or not, by tracing the dependency graph and working out if a prerequisite has been updated.
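As a quick refresher, here is that freshness check in miniature. Treat it as a sketch under a few assumptions: the file names are made up, it calls Rake::FileTask.define_task directly instead of using the file method so that it runs as an ordinary script, and it fakes modification times with File.utime rather than waiting for real edits.

```ruby
require "rake"
require "tmpdir"

# Builds a throwaway source/target pair and asks Rake whether the target
# needs rebuilding, first with an older source and then with a newer one.
def file_task_freshness_demo
  Dir.mktmpdir do |dir|
    source = File.join(dir, "post.md")    # hypothetical prerequisite
    target = File.join(dir, "post.html")  # hypothetical build product
    File.write(source, "# Hello\n")
    File.write(target, "<h1>Hello</h1>\n")

    html = Rake::FileTask.define_task(target => source)

    now = Time.now
    File.utime(now, now, target)
    File.utime(now, now - 60, source)     # source is older than the target
    when_source_older = !!html.needed?    # needed? may return nil, so coerce

    File.utime(now, now + 60, source)     # source "touched" after the build
    when_source_newer = !!html.needed?

    { source_older: when_source_older, source_newer: when_source_newer }
  end
end
```

Calling file_task_freshness_demo should report that the target is not needed while the source is older, and needed once the source is newer.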

Wouldn't it be nice if we could have the same power here? What if Rake submitted the blog post only when it doesn't already exist, or when the local file has been updated?

Let's make this happen.

The technical term for the product of a build task is a "target". The key to integrating Rake with a new kind of target is creating a new task class.

We'll call our new class PostTask, and we'll inherit it from the Rake::Task superclass.

class PostTask < Rake::Task
  # ...
end

Before we add the methods we need to make this into a fully-functional customized task, we're first going to add some helpers that will enable us to query the status of the remote blog server.

Working from bottom to top: first we have a method which fetches and caches a list of blog posts, requesting just a small set of metadata for each post.

Next we have a post_info method which looks up info for the specific post this task pertains to, if it exists.

WordPress posts have an internal "name" string, and for simplicity we'll be matching up the name of a Rake task with the name of the blog post. If you're not sure what I mean by the "name of the task", hang in there. It'll become more clear shortly.

Building on this info method, we have a convenience method for pulling out the blog post's ID.

And another helper for getting the blog post's modification timestamp.

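def timestamp
  # Assumption: the client returns an XMLRPC::DateTime for this field,
  # which we convert to a Time for comparison.
  post_info && post_info["post_modified_gmt"].to_time
end

def post_id
  post_info && post_info["post_id"]
end

def post_info
  post_list.detect{|post| post["post_name"] == name}
end

def post_list
  @post_list ||= WPCLIENT.getPosts(
    fields: %w[post_id post_name post_modified_gmt])
end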

This gives us a solid base to build on. Now we need to make this into a full-fledged Rake task.

It turns out that to customize the dependency calculation for our rake task, we only need to provide a single method.

The #needed? method is what Rake uses to determine if a task needs to be invoked or not.

How do we determine if the blog post needs to be created or updated? We need to check if it doesn't exist, or if it is out of date compared to its dependencies.

def needed?
  !post_exists? || out_of_date?
end

These two methods don't actually exist yet. Let's define them.

The post_exists? predicate is simple enough. We'll just check to see if we are able to find any post info.

def post_exists?
  !!post_info
end

Calculating if the post is out of date with regard to prerequisites is a little more tricky. Fortunately, we can cheat.

Let's check out the implementation of this method in Rake's FileTask class.

def out_of_date?(stamp)
  @prerequisites.any? { |n| application[n, @scope].timestamp > stamp }
end

This code goes through the task's list of prerequisites. That is, the list of other tasks that it depends on. For each prerequisite, it looks up the corresponding task object, and then checks the timestamp to see whether it is newer than this file's timestamp.

Let's just swipe this code.

Then we'll simplify it by removing the parameter and using the timestamp method we defined earlier.

Remember, our timestamp method uses the blog post timestamp as reported by the remote blog server.

The only other change we make is to ensure that the prerequisite timestamps are converted to UTC before the comparison.

def out_of_date?
  @prerequisites.any? { |n|
    application[n, @scope].timestamp.utc > timestamp
  }
end

Our custom task class is now finished. It has everything it needs to determine whether it should be invoked or skipped, based on the status of the associated blog post and the age of its dependencies. We could have done this with fewer methods, but this implementation keeps our methods small and focused.
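To see how the pieces fit together without a real blog, here is a rough, self-contained sketch of the class. Some of it is assumption rather than the code from the episode: FakeClient is a hypothetical in-memory stand-in for the rubypress client, the client is injected through a PostTask.client accessor instead of the global WPCLIENT, the post names are invented, and the fake returns plain Time objects so no timestamp conversion is needed.

```ruby
require "rake"
require "tmpdir"

# Hypothetical stand-in for the WordPress client. It answers getPosts with
# canned hashes shaped like the fields our helpers ask for.
class FakeClient
  def initialize(posts)
    @posts = posts
  end

  def getPosts(fields: [])
    @posts
  end
end

class PostTask < Rake::Task
  class << self
    attr_accessor :client  # assumption: injected here instead of a global
  end

  # Invoke the task if the post is missing or older than a prerequisite.
  def needed?
    !post_exists? || out_of_date?
  end

  def post_exists?
    !!post_info
  end

  def out_of_date?
    @prerequisites.any? { |n|
      application[n, @scope].timestamp.utc > timestamp
    }
  end

  # The remote modification time, or nil when the post does not exist.
  def timestamp
    post_info && post_info["post_modified_gmt"]
  end

  # The task name doubles as the WordPress post name.
  def post_info
    post_list.detect { |post| post["post_name"] == name }
  end

  def post_list
    @post_list ||= self.class.client.getPosts(
      fields: %w[post_id post_name post_modified_gmt])
  end
end

# Defines three tasks against one local file and reports which are needed.
def post_task_demo
  Dir.mktmpdir do |dir|
    content = File.join(dir, "post_content.html")
    File.write(content, "<p>Hello</p>\n")
    now = Time.now
    File.utime(now, now, content)

    PostTask.client = FakeClient.new([
      { "post_id" => "1", "post_name" => "fresh-post",
        "post_modified_gmt" => (now + 60).utc },
      { "post_id" => "2", "post_name" => "stale-post",
        "post_modified_gmt" => (now - 60).utc }
    ])

    %w[missing-post fresh-post stale-post].map { |task_name|
      [task_name, PostTask.define_task(task_name => content).needed?]
    }.to_h
  end
end
```

With this setup, post_task_demo should flag missing-post and stale-post as needed and leave fresh-post alone, which is the behavior we're after.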

The last thing we need before we can put this new class to use is a short method for declaring post tasks.

We'll define it exactly the same way Rake's built-in task and file methods are written, forwarding the arguments and block to the define_task class method.

def post(*args, &block)
  PostTask.define_task(*args, &block)

Let's test this out. Going into a terminal, we can run rake -T and see our task listed.

$ rake -T
rake my-blog-post  # Create or update a blog post

If we invoke this task, we see that it thinks for a moment and then creates a new blog post.

$ rake my-blog-post
Create new post...

If we then run the same task again, we see that nothing happens.

$ rake my-blog-post

This is because Rake found the associated blog post, compared its modification time to the timestamp of the local content file, and decided nothing needed to be done.

But, if we update the timestamp on the local file by touching it, and then re-run the rake task, we see that it updates the blog post this time.

$ touch post_content.html
$ rake my-blog-post
Update post ID 33...

This is exactly what we want to see. We've now brought the dependency-tracing power of Rake from the world of local files, into the world of the web.
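The body of the my-blog-post task, the part that prints those messages, isn't shown here. One plausible shape for its create-or-update logic is sketched below, pulled out into a plain method so it can run without a blog. RecordingClient and sync_post are hypothetical names, and the sketch assumes the client's newPost and editPost calls accept keyword arguments the way rubypress does.

```ruby
# Hypothetical stand-in that records calls instead of talking to a blog.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def newPost(content:)
    @calls << [:newPost, content]
  end

  def editPost(post_id:, content:)
    @calls << [:editPost, post_id, content]
  end
end

# Creates the post when no ID is known yet, otherwise updates it in place.
# post_name is set to the task name so that later runs can find the post.
def sync_post(client, name:, post_id:, body:)
  content = { post_status: "publish", post_name: name, post_content: body }
  if post_id
    puts "Update post ID #{post_id}..."
    client.editPost(post_id: post_id, content: content)
    :updated
  else
    puts "Create new post..."
    client.newPost(content: content)
    :created
  end
end
```

Inside the Rakefile, the block passed to post could then call something like sync_post(WPCLIENT, name: t.name, post_id: t.post_id, body: File.read(t.prerequisites.first)), where t is the task object Rake yields to the block.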

The approach we used here can be extended to just about any kind of writable online resource. As long as there's a way to track the remote resource's modification time, we can teach Rake how to include it in the dependency graph just by defining a new kind of task.
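To make that concrete, here is a stripped-down sketch of the same idea for an arbitrary remote resource. RemoteTask and its remote_mtime_source accessor are hypothetical names, not part of Rake. The only thing the task asks of the outside world is a callable that returns the resource's modification Time, or nil if the resource doesn't exist yet.

```ruby
require "rake"
require "tmpdir"

# Hypothetical generic task for any remote resource with a modification time.
class RemoteTask < Rake::Task
  # A callable returning a Time, or nil when the resource is absent.
  attr_accessor :remote_mtime_source

  def timestamp
    remote_mtime_source && remote_mtime_source.call
  end

  # Needed when the resource is missing or older than any prerequisite.
  def needed?
    remote = timestamp
    return true if remote.nil?
    @prerequisites.any? { |n| application[n, @scope].timestamp > remote }
  end
end

# Checks one task against three simulated remote states.
def remote_task_demo
  Dir.mktmpdir do |dir|
    local = File.join(dir, "data.json")  # hypothetical local source file
    File.write(local, "{}")
    now = Time.now
    File.utime(now, now, local)

    task = RemoteTask.define_task("remote-demo" => local)
    results = {}

    task.remote_mtime_source = -> { nil }       # nothing uploaded yet
    results[:absent] = task.needed?

    task.remote_mtime_source = -> { now + 60 }  # remote copy is newer
    results[:remote_newer] = task.needed?

    task.remote_mtime_source = -> { now - 60 }  # remote copy is older
    results[:remote_older] = task.needed?

    results
  end
end
```

In real use the callable might wrap an HTTP request or an API call. Here it is just a lambda, so the three states can be checked locally.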

Happy hacking!