
Function Pipelining in Ruby

Have you ever looked at function “pipelines” in FP languages like Elixir and F# and wished Ruby had them? Let’s explore the equivalents Ruby provides out of the box, or with a little tweaking.

Video transcript & code

If you’ve ever played with a functional programming language, you might have seen how many of those languages make it easy to chain a series of functions together into a kind of pipeline.

For instance, here’s an example in the Elixir programming language. This code converts an article title into a URL-compatible “slug” by piping it through a series of String transformation functions.

input = "  10 Weird Tricks to Trick Weirdos!!! - Happy Fun Clickbait Co. "
slug = input
  |> String.downcase
  |> String.replace(~r/[^\w]+/, " ")
  |> String.trim
  |> String.replace(" ", "-")
  |> String.slice(0, 32)
IO.inspect slug
# >> "10-weird-tricks-to-trick-weirdos"

As more and more Ruby developers experience this kind of elegant functional pipelining in other languages, there has been growing interest in bringing it to Ruby. In fact, as of 2019, Matz and the Ruby core team have been debating how to incorporate this feature into the language.

But today, let’s talk about how we might approximate it with the tools we have already.

When we look at code like this, the first thought we might have about pipelining in Ruby is that… we already have it!

After all, we could trivially transform this snippet into a Ruby method call chain.

input = "  10 Weird Tricks to Trick Weirdos!!! - Happy Fun Clickbait Co. "
slug = input
  .downcase
  .gsub(/[^\w]+/, " ")
  .strip
  .tr(" ", "-")
  .slice(0, 32)
p slug
# >> "10-weird-tricks-to-trick-weirdos"

Not only is this equivalent to the Elixir version, it’s actually more concise. Since the message receiver is implicitly a String, we don’t have to keep reiterating the String namespace.

So… is that it? Are we done? Do objects in Ruby give us this functionality for free?

Let’s take a look at a different example. Let’s say we have a set of callable objects which implement a workflow for adding a new user to a system.

There’s a receive_request callable that returns a hash of information about the new user.

There’s a validate_request callable that takes in a hash in this form, and returns it unchanged after making sure it has the right fields.

There’s a canonicalize_email callable that takes the hash and returns it with the email field cleaned up.

Next up comes an update_db_from_request callable that has a side-effect of creating the new user in the database, and returns the hash updated with a newly generated user_id.

A send_email callable kicks off a welcome email to the new user, and adds a welcome_email_sent_at timestamp to the data bundle.

Finally, a return_message callable generates some response fields suitable for being rendered by a web framework controller layer. It also attaches the original request fields.

result = receive_request.call
# => {:type=>"register_user",
#     :username=>"arthur.dent",
#     :email=>"Arthur Dent <dent@example.org"}
result = validate_request.call(result)
# => {:type=>"register_user",
#     :username=>"arthur.dent",
#     :email=>"Arthur Dent <dent@example.org"}
result = canonicalize_email.call(result)
# => {:type=>"register_user",
#     :username=>"arthur.dent",
#     :email=>"dent@example.org"}
result = update_db_from_request.call(result)
# => {:type=>"register_user",
#     :username=>"arthur.dent",
#     :email=>"dent@example.org",
#     :user_id=>"12345678"}
result = send_email.call(result)
# => {:type=>"register_user",
#     :username=>"arthur.dent",
#     :email=>"dent@example.org",
#     :user_id=>"12345678",
#     :welcome_email_sent_at=>2020-03-14 14:22:08.68545719 +0000}
result = return_message.call(result)
# => {:status=>201,
#     :message=>"New user account created for dent@example.org",
#     :request=>
#      {:type=>"register_user",
#       :username=>"arthur.dent",
#       :email=>"dent@example.org",
#       :user_id=>"12345678",
#       :welcome_email_sent_at=>2020-03-14 14:22:08.68545719 +0000}}

I shamelessly stole this domain-action example from Scott Wlaschin’s talk about Railway-Oriented Programming, which despite the name has nothing to do with Ruby on Rails. I’ll put a link in the notes.

I want to make a few notes about this code.

First, what are these “callable” objects? Well, they could be implemented a lot of ways. These things could be classes, or instances of classes, or Ruby Proc objects, Method objects, or a mix of the above. It doesn’t matter. What matters is that they are call-able: that is, they consistently respond to the call message. And in this example, they consistently consume a hash as input, and produce an updated hash as output.
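As a minimal sketch of that point, here are three interchangeable kinds of callable, each taking a hash and returning an updated hash. The names Tagging, Tagger, and the :source field are hypothetical and exist only for illustration; they are not part of this episode’s code.

```ruby
# Hypothetical examples: three kinds of object that all respond to `call`.

# 1. A lambda (a Proc object)
tag_with_lambda = ->(request) { request.merge(source: "lambda") }

# 2. A Method object, obtained from a module-level method
module Tagging
  def self.tag(request)
    request.merge(source: "method")
  end
end
tag_with_method = Tagging.method(:tag)

# 3. An instance of a class that defines `call`
class Tagger
  def call(request)
    request.merge(source: "instance")
  end
end
tag_with_instance = Tagger.new

request = { username: "arthur.dent" }
callables = [tag_with_lambda, tag_with_method, tag_with_instance]

# The calling code treats all three identically.
sources = callables.map { |callable| callable.call(request)[:source] }
p sources # prints ["lambda", "method", "instance"]
```

Because each one uses merge, the input hash is left untouched and a new hash is returned, which keeps the steps independent of each other.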

Some Ruby and Rails developers have taken to using callable objects like this to organize and encapsulate logic in their applications. They are often referred to as “service objects” in that context.

Callable objects are the closest Ruby equivalent to functions in the “functional programming” sense, because they represent some action which can be passed around as a first-class entity and invoked in a consistent way. Any time we want to emulate functional programming language behavior in Ruby, it usually makes sense to start with callable objects. From here on out, I’m going to use the terms “callable object” and “function” interchangeably.

The second thing to observe here is that we can’t chain these things together the way we did with a series of String transformations.

If we try, we get an exception.

receive_request.call
  .validate_request # ~> NoMethodError: undefined method `validate_request' for #<Hash:0x0000555a1a83aaf8>

# ~> NoMethodError
# ~> undefined method `validate_request' for #<Hash:0x0000555a1a83aaf8>
# ~>
# ~> 02-domain-actions.rb:102:in `<main>'

Each callable object produces a Hash object as its return value, and we can’t very well send the validate_request message to a Hash. It has no idea what we’re talking about.

This illustrates a basic difference between our String example and this new one. Before, we were doing data transformation on one of the language’s built-in types. This time, we’re performing domain actions and using a core type, Hash, to convey state from one to the next. This is also much closer to the functional programming paradigm, where functions and data are defined separately and orthogonally from each other.

Now, what about pipelines?

Well, we’ve already effectively established a pipeline here, with our idiom of repeatedly assigning the result variable and then passing it to the next function. But what if we want a more concise and elegant notation?

As a matter of fact, since version 2.5 Ruby gives us a way to pass data “through” an arbitrary function, in the form of the yield_self method available on all objects.

receive_request.call()
  .yield_self{|result| validate_request.call(result)}
  .yield_self{|result| canonicalize_email.call(result)}

Here we’re passing the result of receive_request through validate_request.

All we want to do in this block is invoke a callable object on the block argument.

Ruby gives us a much more concise notation for this common case: the & operator turns a callable object into the block by calling to_proc on it, which Proc and Method objects support out of the box.

receive_request.call()
  .yield_self(&validate_request)
  .yield_self(&canonicalize_email)
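One caveat is worth sketching here. The & operator works by calling to_proc on its operand, so an object that only defines call raises a TypeError when passed this way. Assuming a hypothetical service object class named CanonicalizeEmail (the class and its cleanup rule are illustrative, not this episode’s implementation), one way to make it usable with & is to define to_proc in terms of call.

```ruby
# Hypothetical service object; the class name and cleanup rule are illustrative only.
class CanonicalizeEmail
  def call(request)
    request.merge(email: request[:email].strip.downcase)
  end

  # `&object` calls `to_proc`, so expose `call` as a Proc.
  def to_proc
    method(:call).to_proc
  end
end

request = { email: "  Dent@Example.org " }
result = request.yield_self(&CanonicalizeEmail.new)
p result[:email] # prints "dent@example.org"
```

Lambdas, procs, and Method objects need no such help, since they already respond to to_proc.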

And if we’re using Ruby 2.6 or later, we can use the alias then instead of yield_self.

receive_request.call()
  .then(&validate_request)
  .then(&canonicalize_email)

Let’s go ahead and complete our domain action pipeline. Update the database… send an email… compose a return message…

receive_request.call()
  .then(&validate_request)
  .then(&canonicalize_email)
  .then(&update_db_from_request)
  .then(&send_email)
  .then(&return_message)
# => {:status=>201,
#     :message=>"New user account created for dent@example.org",
#     :request=>
#      {:type=>"register_user",
#       :username=>"arthur.dent",
#       :email=>"dent@example.org",
#       :user_id=>"12345678",
#       :welcome_email_sent_at=>2020-03-14 14:47:23.145470007 +0000}}
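To try this pipeline out, we need something behind each of those names. Here is a minimal sketch using lambdas as stand-ins. Every implementation below is a hypothetical placeholder: the validation rule, the email cleanup, the hard-coded user ID, and the fake email delivery are assumptions made so the example runs on its own, not the real domain actions.

```ruby
# Stand-in callables (hypothetical implementations) so the pipeline can run as-is.
receive_request = -> {
  { type: "register_user",
    username: "arthur.dent",
    email: "Arthur Dent <dent@example.org>" }
}

validate_request = ->(request) {
  missing = [:type, :username, :email] - request.keys
  raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?
  request
}

canonicalize_email = ->(request) {
  # Keep only the address inside angle brackets, if there is one.
  address = request[:email][/<([^>]+)>/, 1] || request[:email].strip
  request.merge(email: address.downcase)
}

update_db_from_request = ->(request) {
  # A real version would insert a row; here we just fake a generated ID.
  request.merge(user_id: "12345678")
}

send_email = ->(request) {
  # A real version would enqueue or deliver mail.
  request.merge(welcome_email_sent_at: Time.now)
}

return_message = ->(request) {
  { status: 201,
    message: "New user account created for #{request[:email]}",
    request: request }
}

response = receive_request.call
  .then(&validate_request)
  .then(&canonicalize_email)
  .then(&update_db_from_request)
  .then(&send_email)
  .then(&return_message)

puts response[:message] # prints New user account created for dent@example.org
```

Since every step after the first has the same shape (hash in, hash out), any of these lambdas could be swapped for a Method object or a service object without touching the pipeline itself.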

This is the modern, idiomatic Ruby equivalent of the pipeline operator in many functional languages. But… what if we want it to be an actual operator?

Let’s use refinements to make a bounded change to the Object class. We’ll define the right-shift operator to have a default behavior of invoking yield_self, treating the right-hand argument as the block.

module Pipelines
  refine Object do
    def >>(callable)
      yield_self(&callable)
    end
  end
end

Then we’ll put this refinement in effect locally.

using Pipelines

Now we can recreate our domain action pipeline in a way that actually looks like a pipeline, with the output of each function being piped into the next.

receive_request.call() >>
  validate_request >>
  canonicalize_email >>
  update_db_from_request >>
  send_email >>
  return_message
# => {:status=>201,
#     :message=>"New user account created for dent@example.org",
#     :request=>
#      {:type=>"register_user",
#       :username=>"arthur.dent",
#       :email=>"dent@example.org",
#       :user_id=>"12345678",
#       :welcome_email_sent_at=>2020-03-14 14:47:23.146741206 +0000}}
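Here is a self-contained sketch of the same refinement that shows how bounded the change is. It assumes a hypothetical SlugDemo module and three throwaway lambdas. Activating the refinement inside a module body limits it to that body, and classes that already define their own >> (Integer, for one) keep their own behavior, because a class’s own method takes precedence over a refinement of its superclass.

```ruby
module Pipelines
  refine Object do
    def >>(callable)
      yield_self(&callable)
    end
  end
end

# Hypothetical demo module; `using` here is active only until the matching `end`.
module SlugDemo
  using Pipelines

  STRIP    = ->(text) { text.strip }
  DOWNCASE = ->(text) { text.downcase }
  DASHES   = ->(text) { text.tr(" ", "-") }

  # String has no >> of its own, so the Object refinement applies.
  SLUG = "  Happy Fun Clickbait  " >> STRIP >> DOWNCASE >> DASHES

  # Integer defines its own >> (bit shift), which wins over the refinement.
  SHIFTED = 8 >> 1
end

p SlugDemo::SLUG    # prints "happy-fun-clickbait"
p SlugDemo::SHIFTED # prints 4
```

Outside SlugDemo the refinement is not active, so sending >> to a String there raises NoMethodError as usual.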

One limitation to be aware of is that Ruby’s parsing rules for the right-shift operator preclude us from writing the operator at the start of a continuation line. A newline ends the expression unless the line ends with the operator, so Ruby flags this as a syntax error.

receive_request.call()
  >> validate_request

But other than that little gotcha, this approach lets us imitate the functional pipeline style, composing workflows out of self-contained actions.

The question is: is this worth it? Is it a good idea? In future episodes, we’ll talk about why this approach to composing actions may not be the most idiomatic or fruitful in Ruby. We’ll also talk about how naïve, surface-level adaptations of functional patterns can deny us the true power of function composition.

But for now… happy hacking!
