Predicate Return Value Part 2

In the last episode we looked at how it's possible to write a working predicate method that returns values other than literal true and false. And we saw that Ruby actually defines some methods like this in its core libraries. Today, as promised, we're going to dig into some of the not-so-obvious gotchas that can result from using methods such as these.

One way that things can get confusing is when we factor the XOR operator into the mix. We introduced this operator in episode #43.

Consider that File.size? returns an integer for any non-zero-length file. What if we try to do a logical XOR between the return values of two File.size? calls? If both files exist and have content, we would expect a logical exclusive OR operation to return false, or at least a falsy value, since XOR by definition returns true only when exactly one side of the comparison is truthy.

But that's not what we get. Instead, we get another integer. Why? Because the caret operator means something different to an integer. In that context, it applies binary XOR, which XORs the individual bits in each number together, and returns the resulting number.

File.write("foo", "hello")
File.write("bar", "world!")

File.size?("foo")               # => 5
File.size?("bar")               # => 6

File.size?("foo") ^ File.size?("bar") # => 3
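
If we do want a logical XOR here, one option is to coerce each side to a literal boolean before applying the caret. The sketch below is an illustration rather than part of the original example. It writes the two files into a temporary directory (an assumption made so it can run anywhere) and uses !! to force each size to true or false.

```ruby
require "tmpdir"

bitwise, logical = Dir.mktmpdir do |dir|
  foo = File.join(dir, "foo")
  bar = File.join(dir, "bar")
  File.write(foo, "hello")   # 5 bytes
  File.write(bar, "world!")  # 6 bytes

  foo_size = File.size?(foo)
  bar_size = File.size?(bar)

  [
    foo_size ^ bar_size,     # Integer#^ works bit by bit: 0b101 ^ 0b110 is 0b011
    !!foo_size ^ !!bar_size  # TrueClass#^ is a logical XOR: true ^ true
  ]
end

bitwise # => 3
logical # => false
```

With both sides coerced, the caret dispatches to the boolean version of the operator, and two truthy inputs give false as expected.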

Another issue—in fact, the issue most people think of first—with non-boolean predicate methods is that client code may come to depend on the specific return value of a predicate when it's really an implementation detail. For instance, here's some code that assigns the result of the method to a local variable, and then uses it in output.

require "./coffee3"

c = Coffee.new
c.sweetener = "white sugar"

if sweetener = c.sweetened?
  puts "Coffee sweetened with #{sweetener}"
else
  puts "Coffee"
end

# >> Coffee sweetened with white sugar

If we were to then change the method to return literal true or false values, code like this would no longer work as expected.
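
To make the breakage concrete, here is a minimal sketch. The Coffee class below is a stand-in defined inline and assumed for illustration, since the contents of the coffee3 file aren't shown here. Its sweetened? method has been changed to return a literal boolean, and the client code is unchanged apart from collecting the message in a variable instead of printing it.

```ruby
# Hypothetical stand-in for Coffee after sweetened? is changed
# to return a literal true or false.
class Coffee
  attr_accessor :sweetener

  def sweetened?
    !!sweetener
  end
end

c = Coffee.new
c.sweetener = "white sugar"

message =
  if (sweetener = c.sweetened?)
    "Coffee sweetened with #{sweetener}"
  else
    "Coffee"
  end

message # => "Coffee sweetened with true"
```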

This may seem like a pathological example, and you could be forgiven for replying to it by saying: clients shouldn't write code like that. But there are less obvious gotchas that can arise from returning arbitrary values from a predicate.

Here's one: when we capture the results of a predicate, we may cause unexpected memory growth. For instance, let's say we have a whole bunch of coffees, some of which are sweetened and some of which aren't. What if we then map over those coffees and capture their sweetness state? In this example, we introduce a new Sweetener class, instead of just using strings.

The resulting array contains nils and Sweetener objects instead of trues and falses. When we use the new ObjectSpace capabilities in Ruby 2.1 to see how many objects are reachable from the array we've generated, we see that the count is quite high. So long as we keep this array of results around, the Sweetener objects it accidentally contains cannot be garbage collected, even if the Coffee objects they belong to are.

require "./coffee3"

Sweetener = Struct.new(:type)

coffees = Array.new(20) {
  c = Coffee.new
  c.sweetener = [nil, Sweetener.new("stevia"), Sweetener.new("sugar")].sample
  c
}
sweet_states = coffees.map(&:sweetened?)
# => [#<struct Sweetener type="stevia">, #<struct Sweetener type="stevia">, #...
#     nil,
#     nil,
#     nil,
#     nil,
#     nil,
#     #<struct Sweetener type="stevia">,
#     #<struct Sweetener type="stevia">,
#     nil,
#     #<struct Sweetener type="sugar">,
#     #<struct Sweetener type="sugar">,
#     nil,
#     #<struct Sweetener type="stevia">,
#     nil,
#     #<struct Sweetener type="stevia">,
#     nil,
#     #<struct Sweetener type="sugar">,
#     #<struct Sweetener type="stevia">,
#     #<struct Sweetener type="sugar">,
#     nil]

require "objspace"
ObjectSpace.reachable_objects_from(sweet_states).count
# => 17

In this example, these are relatively small objects. But if we were working with, for instance, ActiveRecord objects, which in turn have their own subsidiary attributes, we could easily keep a large number of objects alive in memory without meaning to.
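
One way to avoid holding on to those objects is to coerce each result to a literal boolean as it is captured. The following is a sketch under assumptions: Coffee is a stand-in defined inline whose sweetened? returns the sweetener itself, and sweeteners are assigned by index rather than at random so the result is repeatable. Note that ObjectSpace.reachable_objects_from is specific to MRI.

```ruby
require "objspace"

Sweetener = Struct.new(:type)

# Hypothetical stand-in: sweetened? returns the sweetener object itself (or nil).
class Coffee
  attr_accessor :sweetener

  def sweetened?
    sweetener
  end
end

coffees = Array.new(6) { |i|
  c = Coffee.new
  c.sweetener = Sweetener.new("sugar") if i.even?
  c
}

# Capturing the raw return values keeps references to the Sweetener objects.
raw_states = coffees.map(&:sweetened?)
raw_held = ObjectSpace.reachable_objects_from(raw_states).grep(Sweetener).size
raw_held # => 3

# Coercing with !! stores only true and false in the array.
sweet_states = coffees.map { |c| !!c.sweetened? }
sweet_states # => [true, false, true, false, true, false]
held = ObjectSpace.reachable_objects_from(sweet_states).grep(Sweetener).size
held # => 0
```

The coerced array no longer references any Sweetener, so keeping it around does not stop those objects from being collected once their coffees go away.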

Now let's look at another problem that might sneak up on us. Let's say we are writing some code to render Coffee objects to JSON for transmission to web clients. Not being familiar with the details of how Coffee objects are implemented, we think it's enough to simply use the return value of the sweetened? predicate unmodified.

Unfortunately, the resulting JSON is not what we'd intended. The chances of this JSON being what the client-side code expects are slim, especially if the client is written in a language other than Ruby.

require "./coffee3"
require "json"

def coffee_to_json(coffee)
  { "is_sweetened" => coffee.sweetened? }.to_json
end

c = Coffee.new
coffee_to_json(c)
# => "{\"is_sweetened\":null}"

c.sweetener = "sweet & low"
coffee_to_json(c)
# => "{\"is_sweetened\":\"sweet & low\"}"

This is just one example of a larger issue: Ruby's notions of "truthy" and "falsy" are language-specific, and we have to be careful we don't encode Ruby assumptions when interfacing with non-Ruby languages.
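
As a small illustration of that point: in Ruby only nil and false are falsy, so values such as 0 and the empty string count as truthy, whereas JavaScript treats both as falsy. The truthy? helper and the field names below are hypothetical, introduced only to make the difference visible.

```ruby
require "json"

# Hypothetical helper: reports how Ruby treats a value in a condition.
def truthy?(value)
  value ? true : false
end

truthy?(0)    # => true
truthy?("")   # => true
truthy?(nil)  # => false

# Serialized unmodified, these values reach the client as 0 and "",
# both of which a JavaScript `if` treats as false.
payload = { "has_items" => 0, "has_name" => "" }.to_json
payload # => "{\"has_items\":0,\"has_name\":\"\"}"
```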

So here's the situation we are left with: some predicates may return values other than literal true and false. And using these predicates in perfectly reasonable ways may result in unintended consequences. We talk about the "Principle of Least Surprise" in Ruby, but in this case it's not clear what's more surprising: a predicate that returns non-boolean values, or a predicate client that uses return values for something other than just their truthiness.

So what do we do? How do we mitigate these issues?

I think this is a case that calls for application of the Robustness Principle. Sometimes called Postel's Law, the Robustness Principle states that we should be conservative in what we do, and be liberal in what we accept from others.

On the one hand, this means that in order to be kind to our clients, when writing predicates it's a good idea to ensure that they return only literal true and false values. This is normally as simple as preceding the return value of a predicate method with a !!, which we learned about in episode #94.

class Coffee
  attr_accessor :sweetener

  def sweetened?
    !!sweetener
  end
end

The double-bang converts truthy and falsy values to their literal boolean equivalent. Thus, nil becomes false, and any sweetener becomes true.

require "./coffee4"

c = Coffee.new
c.sweetened?                    # => false

c.sweetener = "sugar"
c.sweetened?                    # => true

On the flip side of the equation, when using predicates we need to be aware that they may not always return literal boolean values. This means that we should never compare them to a literal true or false. Instead, we should rely on the truthiness or falsiness of a return value in order to make decisions.

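# Fragile: only works if the predicate returns literal true
if coffee.sweetened? == true
  # ...
end

# Better: relies only on the truthiness of the return value
if coffee.sweetened?
  # ...
end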
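
The reason the comparison is fragile shows up as soon as a predicate returns a truthy value that isn't literally true. Here is a minimal sketch, using an inline stand-in for Coffee (assumed for illustration) whose sweetened? returns the sweetener itself.

```ruby
# Hypothetical stand-in: sweetened? returns the sweetener string or nil.
class Coffee
  attr_accessor :sweetener

  def sweetened?
    sweetener
  end
end

coffee = Coffee.new
coffee.sweetener = "sugar"

# "sugar" == true is false, so this branch would be skipped.
compared = (coffee.sweetened? == true)
compared # => false

# Relying on truthiness takes the branch as intended.
relied = coffee.sweetened? ? true : false
relied # => true
```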

When capturing and using a predicate return value for something more than just making a decision, we need to be more careful. We should investigate the actual return values of the methods we are calling. And we may want to explicitly force the results to literal true or false values before using them. For instance, in our JSON example, we could have introduced a !! to ensure that the boolean field in our JSON data will always be either true or false.

require "./coffee3"
require "json"

def coffee_to_json(coffee)
  { "is_sweetened" => !!coffee.sweetened? }.to_json
end

c = Coffee.new
coffee_to_json(c)
# => "{\"is_sweetened\":false}"

c.sweetener = "sweet & low"
coffee_to_json(c)
# => "{\"is_sweetened\":true}"

As we've seen in the examples above, predicate return values aren't as clear-cut as they would at first seem. We need to be aware that predicates may return values other than true and false. At the same time, we should be considerate of other programmers who aren't expecting non-boolean values to come out of a predicate method. A little thought and care both when writing predicate methods and when using them can prevent confusion.

Before I go, I want to extend a special thank you to Myron Marston, who clued me in to several of the more obscure gotchas stemming from non-boolean predicate return values.

Happy hacking!
