In Progress
Unit 1, Lesson 21
In Progress

Pop

Video transcript & code

Let's say we run a subscription screencasting service. Some of our episodes are members-only, and some of the episodes are free for anyone to watch. On our website's home page, we'd like to feature a random selection of the free videos so that visitors can get a sense of the content.

But we don't want the selection to be completely random. In particular, we don't want to accidentally show just videos from a particular miniseries. We want to ensure that the featured samples are picked from a broad range of the show's history.

Here's one way we might handle this. We define the number of sample episodes to show on the front page - let's say 2. Then we calculate a slice size from the total number of free episodes divided by the number of samples to feature. With this number in hand, we can use #each_slice to split the free episode list into equal-sized bins. Each bin represents a different period of time in the show's history.

We can map over those bins, and pick a single random episode out of each one with #sample. After flattening the result, we have our selection of episodes. It is random, but random in such a way that it is impossible for two brand-new or two very old episodes to be selected.

eps = [7, 11, 17, 20, 24, 27, 35, 42, 45, 46]

slice_count = 2
slice_size  = eps.size / slice_count # => 5

slices = eps.each_slice(slice_size).to_a
# => [[7, 11, 17, 20, 24], [27, 35, 42, 45, 46]]

picks = slices.map { |s|
  s.sample(1)                   # => [11], [27]
}.flatten
# => [11, 27]

Of course, it would still be possible for two episodes to be selected from right in the middle of the show's run. In order to ensure a better spread, and because we like the idea of having more featured videos on the homepage, we increase the slice count 3.

Unfortunately, this messes everything up. Because we currently have 10 free episodes to work with, the set doesn't divide evenly into three parts. As a result, when we tell the array to slice itself into 3-element arrays, it also generates a fourth one-element array with the remainder.

eps = [7, 11, 17, 20, 24, 27, 35, 42, 45, 46]

slice_count = 3
slice_size  = eps.size / slice_count # => 3

slices = eps.each_slice(slice_size).to_a
# => [[7, 11, 17], [20, 24, 27], [35, 42, 45], [46]]

picks = slices.map { |s|
  s.sample(1)                   # => [7], [20], [45], [46]
}.flatten
# => [7, 20, 45, 46]

We decide to introduce a special case for this eventuality. If the actual number of slices is greater than the intended slice count, we will do a bit of extra processing. First, we'll find the slice that should have been the final one. Next, we'll remove the leftover slice using pop, and assign that removed slice to a variable. The #pop method, if you're not familiar with it, removes the last item in an array and returns the element it popped off.

Then we'll concatenate the leftovers onto the slice before it.

The result is a set of 3 bins, as originally intended, where the fourth bin has one extra element. We can sample these bins as before, and get the expected number of final picks.

if slices.size > slice_count
  end_slice = slices[slice_count - 1] # => [35, 42, 45]
  leftover  = slices.pop              # => [46]
  end_slice.concat(leftover)          # => [35, 42, 45, 46]
end

slices
# => [[7, 11, 17], [20, 24, 27], [35, 42, 45, 46]]

picks = slices.map { |s|
  s.sample(1)                   # => [17], [20], [45]
}.flatten
# => [17, 20, 45]

This works well, but we hate to introduce a conditional into this algorithm. Conditionals make code harder to reason about, because now in addition to thinking about what each line of code is doing, we also need to think about whether a line of code is being executed at all.

Let's see if we can get rid of this conditional while still handling all cases. We'll do this progressively, moving more and more code out of the conditional until hopefully we can do away with it entirely.

First off, we don't technically need the line which finds the ending slice to be inside the if block. We move that line out.

end_slice = slices[slice_count - 1] # => [35, 42, 45]
if slices.size > slice_count
  leftover  = slices.pop              # => [46]
  end_slice.concat(leftover)          # => [35, 42, 45, 46]
end

For the next move, we need to understand a bit more about how #pop works. If we just send it with no arguments, it returns the last element. Or nil, if there are no elements.

But if we supply an integer argument, #pop snips off and returns an array of that many elements. 1 gets us an array of one, 3 gets an array of 3, and so on. If we supply an argument of zero, we get an empty array back—/not/ a nil value.

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a.pop                           # => 10
a                               # => [1, 2, 3, 4, 5, 6, 7, 8, 9]

[].pop                          # => nil

a.pop(1)                        # => [9]
a.pop(3)                        # => [6, 7, 8]
a.pop(0)                        # => []

With this knowledge in mind, we move on to the next step. We initialize a leftover_count variable to zero. Inside the if statement we reset it to 1. Then we use the leftover count as an argument to pop, and flatten the resulting array.

leftover_count = 0
if slices.size > slice_count
  leftover_count = 1
  leftover  = slices.pop(leftover_count).flatten # => [46]
  end_slice.concat(leftover)          # => [35, 42, 45, 46]
end

At this point, we no longer need the pop and concat actions to be inside the conditional block. If there are no leftovers, the leftover count will be zero and the pop will return an empty array and leave the source array untouched. And of course concatenating an empty array to another array is a harmless no-op, so it's safe to execute the concat in either situation.

Accordingly, we move those two lines out of the conditional.

leftover_count = 0
if slices.size > slice_count
  leftover_count = 1
end
leftover  = slices.pop(leftover_count).flatten # => [46]
end_slice.concat(leftover)          # => [35, 42, 45, 46]

At this point, all the if statement is doing is deciding whether to set a variable to 1 or 0. We can simplify this using the ternary operator.

leftover_count = slices.size > slice_count ? 1 : 0
leftover  = slices.pop(leftover_count).flatten # => [46]
end_slice.concat(leftover)          # => [35, 42, 45, 46]

We've now reduced the conditional to a single line, but can we get rid of it completely? Well, remember what we need to do here. The result should be one if the slice size is one greater than the intended slice count, and it should be zero if the two are equal. When we phrase the problem this way, it rings a bell in the arithmetic centers of our brains: we can get the result we need by just subtracting the intended slice count from the actual size of the slices array.

eps = [7, 11, 17, 20, 24, 27, 35, 42, 45, 46]

slice_count = 3
slice_size  = eps.size / slice_count # => 3

slices = eps.each_slice(slice_size).to_a
# => [[7, 11, 17], [20, 24, 27], [35, 42, 45], [46]]

end_slice = slices[slice_count - 1] # => [35, 42, 45]
leftover_count = slices.size - slice_count     # => 1
leftover = slices.pop(leftover_count).flatten # => [46]
end_slice.concat(leftover)          # => [35, 42, 45, 46]

slices
# => [[7, 11, 17], [20, 24, 27], [35, 42, 45, 46]]

picks = slices.map { |s|
  s.sample(1)                   # => [17], [24], [46]
}.flatten
# => [17, 24, 46]

When there's a leftover slice, the leftover count will be 1 and the leftovers will be moved into the last full bin. If we switch the desired slice count to one that evenly divides the list, the leftover count is zero and the concatenation has no effect.

At this point, we have achieved our goal: we've found a way to divide a set of videos into equal or nearly equal sized bins, even when the division results in a leftover element. And we've done it without any conditionals at all. Along the way, we've seen that the #pop method has more to it than just removing and returning single values. Happy hacking!

Responses