Lazy Zip

Spicy protons! Infinite streams! Producer blocks! Lazy enumerators! In this video about using a functional style to work with potentially infinite collections, we’ll get into some truly weird and wonderful corners of Ruby.

Video transcript & code

So, Ruby enumerable objects have this zip method. This video is not an introduction to zip, but just as a review, zip gives us a kind of "interleaving" across multiple collections. Here's an example, where we have daily temperature samples from multiple cities. We can use zip to average the "columns" across the arrays, one for each day of the week.

#      S   M   T   W   T   F   S
STL = [85, 58, 65, 70, 63, 59, 66]
TYS = [76, 77, 63, 64, 73, 77, 79]
BNA = [83, 75, 64, 68, 74, 79, 76]
SFO = [86, 86, 94, 84, 76, 71, 71]

"SMTWTFS".chars.zip(STL, TYS, BNA, SFO).map { |day, *temps|
  [day, temps.sum / temps.size]
}
# => [["S", 82],
#     ["M", 74],
#     ["T", 71],
#     ["W", 71],
#     ["T", 71],
#     ["F", 71],
#     ["S", 73]]

This example uses arrays as inputs. And the output is immediately and completely generated.
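
If it helps to see the raw interleaving before any averaging happens, here's a minimal sketch with made-up numbers (the variable names are just for illustration). It also shows that zip pads with nil when an argument is shorter than the receiver.

```ruby
# Hypothetical sample data, not from the episode.
days  = ["S", "M", "T"]
highs = [85, 58, 65]
lows  = [60, 41]          # deliberately one element short

rows = days.zip(highs, lows)
# => [["S", 85, 60], ["M", 58, 41], ["T", 65, nil]]
```

The result always has as many rows as the receiver, which is part of why zip wants to walk the whole receiver before it returns.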

But what about other kinds of collections? What about infinite streams? What if we just want to pick off a few values instead of this "eager" iteration?

Let's explore these questions! But first, we're going to require the "spicy-proton" gem.

require "spicy-proton"

spicy-proton is a nifty library for generating random adjectives, verbs, and nouns.

Let's play with it a little bit. We'll instantiate a generator...

words = Spicy::Proton.new

And then we can ask this generator for a random adjective.

words.adjective
  # => "crummy"

A random verb

words.verb
  # => "deciding"

Or a random noun.

words.noun
  # => "litter"

It can also generate combinations!

words.pair
  # => "lovelorn-inquisitor"

But... let's pretend for now that we didn't see that. Let's say we'd like to create our own generator of an infinite stream of adjective/noun combinations.

We'll create two streams, one for random adjectives and one for random nouns. We construct the streams using Ruby Enumerator objects that will call the given block over and over again.

We can see that these objects are not arrays.

adjectives = Enumerator.produce { words.adjective }
# => #<Enumerator: ...>
nouns = Enumerator.produce { words.noun }
# => #<Enumerator: ...>
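
As a small aside, here's a sketch of how Enumerator.produce (available since Ruby 2.7) behaves with predictable values instead of random words. The doubling stream and the counter are our own illustration, not part of the episode. When we pass an initial value, each block call receives the previous element; when we don't, the block is simply called again for every element.

```ruby
# With an initial value: the block receives the previous element.
powers = Enumerator.produce(1) { |n| n * 2 }
first_powers = powers.take(5)
# => [1, 2, 4, 8, 16]

# Without one: the block is just called once per requested element.
calls = 0
ticks = Enumerator.produce { calls += 1 }
first_ticks = ticks.take(3)
# => [1, 2, 3]
```

Our word streams use the second form, since each random word is independent of the one before it.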

But like arrays, we can take the first few objects from them. Here are the first few random adjectives, and the first few random nouns.

adjectives.take(3)
  # => ["sneaky", "stark", "worldwide"]
nouns.take(3)
  # => ["remembrance", "ketchup", "pledge"]

We can also ask for the next object.

adjectives.next  # => "unprofitable"
nouns.next  # => "version"

Or to make it look indistinguishable from working with an array, we could use first.

adjectives.first  # => "anthropological"
nouns.first  # => "glitter"
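
One subtlety is easier to see with a predictable stream than with random words: take and first restart the iteration from the top every time, while next keeps an internal cursor that only next, peek, and rewind touch. A quick sketch, using a counting stream of our own invention:

```ruby
counter = Enumerator.produce(0) { |n| n + 1 }

a = counter.take(3)  # => [0, 1, 2]
b = counter.next     # => 0
c = counter.next     # => 1
d = counter.first    # => 0  (first starts over; it ignores the cursor)
e = counter.peek     # => 2  (the cursor is still where next left it)

counter.rewind       # move the cursor back to the start
f = counter.next     # => 0
```

With a random word generator every call produces a fresh value anyway, so this difference is invisible there.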

This is great. We have some enumerable generators now that call spicy-proton methods as needed.

And since enumerables always have a zip method, we can make a phrase generator by zipping these two streams together!

adjectives.zip(nouns)

...except not quite. Because while methods like take, next, and first only run the producer block as needed, zip is still written to be "eager", meaning that it will keep running until it produces a complete result array.

We can see this if we create a counter variable, and add a producer guard clause that increments the variable and raises an exception if it exceeds 100.

When we run this, we see that we hit our guard condition immediately. zip was eagerly pulling values from an infinite generator, and would have kept it up forever if given a chance.

require "spicy-proton"

words = Spicy::Proton.new

c = 0
adjectives = Enumerator.produce {
  fail "whoah there" if (c += 1) > 100 # ~> RuntimeError: whoah there
  words.adjective
}
  # => #<Enumerator: ...>
nouns = Enumerator.produce { words.noun }
  # => #<Enumerator: ...>

adjectives.zip(nouns)

# ~> RuntimeError
# ~> whoah there
# ~>
# ~> tapas.rb:7:in `block in <main>'
# ~> tapas.rb:14:in `each'
# ~> tapas.rb:14:in `each'
# ~> tapas.rb:14:in `zip'
# ~> tapas.rb:14:in `<main>'
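
For anyone following along without the gem installed, the same runaway behavior can be sketched with plain counters. The stream names here are made up, and we rescue the error so that we can inspect how far zip got before the guard stopped it.

```ruby
calls = 0
numbers = Enumerator.produce {
  calls += 1
  fail "whoah there" if calls > 100   # guard against running forever
  calls
}
letters = Enumerator.produce("a") { |s| s.succ }

begin
  numbers.zip(letters)   # eager: tries to build the whole result array
rescue RuntimeError => error
  message = error.message
end

message  # => "whoah there"
calls    # => 101
```

We never asked for more than a single zip call, yet the producer block ran 101 times before the guard kicked in.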

This is a reminder that not every Enumerable method was written to play nicely with infinite, generated streams of elements.

Fortunately, that's exactly why Ruby also has a concept of lazy enumerators.

We use the lazy method to get a lazy enumerator from the adjectives stream.

And then call the lazy version of zip, with nouns as the argument. This returns another lazy enumerator.

We'll assign the result of this call to the variable "phrases".

Now we can pull adjective/noun combinations off the top of our infinite random phrase stream, one after another!

require "spicy-proton"

words = Spicy::Proton.new

adjectives = Enumerator.produce { words.adjective }
# => #<Enumerator: ...>
nouns = Enumerator.produce { words.noun }
# => #<Enumerator: ...>

phrases = adjectives.lazy.zip(nouns)
# => #<Enumerator::Lazy: ...>

phrases.next.join(" ")  # => "unblemished fire"
phrases.next.join(" ")  # => "stale sailboat"
phrases.next.join(" ")  # => "embryonic personae"
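
Since random output is hard to check, here's a sketch of the same lazy zip with repeatable stand-ins. The word lists and cycle-based streams are our own assumption, substituting for spicy-proton. We also chain a lazy map, so each pair is already joined into a phrase by the time we ask for it.

```ruby
# Stand-in infinite streams: cycle repeats an array forever.
adjectives = %w[sneaky stark worldwide].cycle
nouns      = %w[ketchup pledge].cycle

# Nothing is generated yet; this only describes the pipeline.
phrases = adjectives.lazy.zip(nouns).map { |pair| pair.join(" ") }

first_phrases = phrases.first(4)
# => ["sneaky ketchup", "stark pledge", "worldwide ketchup", "sneaky pledge"]
```

Only four pairs are ever built here, even though both inputs are endless.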

This is probably not the most practical approach to this particular problem, especially considering that the spicy-proton library already has explicit methods for generating these kinds of word combos. But it does demonstrate that we can use zip with collections other than arrays, and that we can use it in a lazy, streaming style. More generally, it's an illustration that pretty much any functional-style programming we can do with fixed-size collections, we can also do with generated streams of values.
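
To make that last point concrete, here's one more sketch: the same select/map pipeline applied first to a fixed-size array and then to an infinite stream. The numbers are arbitrary and the variable names are ours. The only difference is the lazy call, plus asking for a finite number of results at the end.

```ruby
# Eager, on a fixed-size collection.
finite = (1..8).to_a
eager_squares = finite.select(&:even?).map { |n| n * n }
# => [4, 16, 36, 64]

# Lazy, on an infinite generated stream.
naturals = Enumerator.produce(1) { |n| n + 1 }
lazy_squares = naturals.lazy.select(&:even?).map { |n| n * n }.first(4)
# => [4, 16, 36, 64]
```

Without the lazy call, select would try to scan the whole infinite stream and never return.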

Happy hacking!
