Unit 1, Lesson 21

Group By

Grouping data by various properties is easy in Ruby, and there are more uses for it than you might think…

Video transcript & code

In episode #475, we were messing around with a virtual deck of cards. Let's take 8 cards off the top.

Notice that this time around, our deck comes pre-shuffled.

Our cards are instances of a class called Card.

They each have properties for rank and suit.

require "./deck"

DECK.take(8)
# => [Ace of Clubs,
#     Ace of Diamonds,
#     King of Clubs,
#     King of Spades,
#     7 of Hearts,
#     3 of Clubs,
#     5 of Clubs,
#     King of Hearts]
DECK.first.class
# => Card
DECK.first.rank                 # => 1
DECK.first.suit                 # => "Clubs"
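
The ./deck file itself isn't shown in this episode. For anyone who wants to play along, here is a minimal sketch of what such a file might look like. Everything in it (the Struct-based Card, the RANK_NAMES table, the SAMPLE_DECK constant, the way cards print themselves) is an assumption chosen to resemble the output above, not the actual file. It also skips the shuffle so the results stay predictable.

```ruby
# Hypothetical stand-in for ./deck; the real file is not shown in the episode.
SUITS      = %w[Clubs Diamonds Hearts Spades]
RANK_NAMES = { 1 => "Ace", 11 => "Jack", 12 => "Queen", 13 => "King" }

Card = Struct.new(:rank, :suit) do
  # Render as e.g. "Ace of Clubs" or "7 of Hearts"
  def to_s
    "#{RANK_NAMES.fetch(rank, rank)} of #{suit}"
  end
  alias_method :inspect, :to_s
end

# 52 cards in a fixed order; the episode's deck comes pre-shuffled instead.
SAMPLE_DECK = SUITS.product((1..13).to_a).map { |suit, rank| Card.new(rank, suit) }

SAMPLE_DECK.size         # => 52
SAMPLE_DECK.first        # => Ace of Clubs
SAMPLE_DECK.first.rank   # => 1
SAMPLE_DECK.last         # => King of Spades
```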

Sometimes when we're holding a set of cards, we want to group them in certain ways. One way to do this is to sort them by a property, a technique we learned about in episode #181.

We end up with a list that is sorted by a particular property of the cards. We can also see some implicit groupings in this result: for instance, there's a group of aces, and a group of kings.

require "./deck"

DECK.take(8).sort_by(&:rank)
# => [Ace of Clubs,
#     Ace of Diamonds,
#     3 of Clubs,
#     5 of Clubs,
#     7 of Hearts,
#     King of Spades,
#     King of Clubs,
#     King of Hearts]

What if we wanted these groupings to be a little more explicit? That's where the Enumerable#group_by method comes in.

group_by takes a block, and the block receives each element of the collection as an argument.

Inside the block, we return the property by which we want the collection to be grouped, which in this case is the card's rank.

When we execute this code, we can see that the result is a hash instead of an array. The keys of the hash are card ranks, and the values are arrays of cards.

require "./deck"

DECK.take(8).group_by{|card| card.rank}
# => {1=>[Ace of Clubs, Ace of Diamonds],
#     13=>[King of Clubs, King of Spades, King of Hearts],
#     7=>[7 of Hearts],
#     3=>[3 of Clubs],
#     5=>[5 of Clubs]}
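
To make the mechanics a bit more concrete, here is a rough sketch of how we might write group_by ourselves. The manual_group_by helper is made up for this illustration; it is not how Ruby implements the method internally, just an equivalent in plain Ruby. Strings stand in for cards so the snippet runs on its own.

```ruby
# A hand-rolled equivalent of group_by, for illustration only.
# (manual_group_by is a hypothetical helper, not part of Ruby.)
def manual_group_by(collection)
  collection.each_with_object({}) do |element, groups|
    key = yield(element)              # ask the block which group this element belongs to
    (groups[key] ||= []) << element   # start a new group if needed, then append
  end
end

words = %w[apple avocado banana blueberry cherry]

manual_group_by(words) { |word| word[0] }
# => {"a"=>["apple", "avocado"], "b"=>["banana", "blueberry"], "c"=>["cherry"]}

words.group_by { |word| word[0] }
# => {"a"=>["apple", "avocado"], "b"=>["banana", "blueberry"], "c"=>["cherry"]}
```

Keys appear in the order they are first encountered, and elements within each group keep their original relative order.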

Let's switch this to group by suit instead of rank.

require "./deck"

DECK.take(8).group_by{|card| card.suit}
# => {"Clubs"=>[Ace of Clubs, King of Clubs, 3 of Clubs, 5 of Clubs],
#     "Diamonds"=>[Ace of Diamonds],
#     "Spades"=>[King of Spades],
#     "Hearts"=>[7 of Hearts, King of Hearts]}

There is a shorthand way to write this kind of grouping. Since all we are doing in the block is sending a single, zero-argument message, we can use Ruby's symbol-to-proc syntax sugar to omit the explicit block.

require "./deck"

DECK.take(8).group_by(&:suit)
# => {"Clubs"=>[Ace of Clubs, King of Clubs, 3 of Clubs, 5 of Clubs],
#     "Diamonds"=>[Ace of Diamonds],
#     "Spades"=>[King of Spades],
#     "Hearts"=>[7 of Hearts, King of Hearts]}
DECK.take(8).group_by(&:rank)
# => {1=>[Ace of Clubs, Ace of Diamonds],
#     13=>[King of Clubs, King of Spades, King of Hearts],
#     7=>[7 of Hearts],
#     3=>[3 of Clubs],
#     5=>[5 of Clubs]}
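
If the ampersand-and-symbol form looks like magic, this small sketch may help. It groups a few words by length three ways: with an explicit block, with the shorthand, and with an explicit call to Symbol#to_proc, which is roughly what the ampersand triggers for us. The word list is just sample data.

```ruby
words = %w[fig kiwi pear plum apple]

by_block  = words.group_by { |word| word.size }
by_symbol = words.group_by(&:size)
by_proc   = words.group_by(&:size.to_proc)   # spelling out the conversion by hand

by_symbol               # => {3=>["fig"], 4=>["kiwi", "pear", "plum"], 5=>["apple"]}
by_block == by_symbol   # => true
by_proc  == by_symbol   # => true
```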

Sometimes we want to group collections by more complex criteria than the value of a property. For instance, let's say we want to divide our hand of cards into face cards and non-face cards, which I recently learned are known as "pip cards".

Since the group_by method takes a block, we can do any kind of calculation we want to determine the group. In this case, we just check whether the given card's rank is a face rank, and return a group name based on the result.

require "./deck"

DECK.take(8).group_by{|card|
  (11..13).include?(card.rank) ? "face" : "pip"
}
# => {"pip"=>
#      [Ace of Clubs, Ace of Diamonds, 7 of Hearts, 3 of Clubs, 5 of Clubs],
#     "face"=>[King of Clubs, King of Spades, King of Hearts]}
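
The block isn't limited to a two-way split, either. As a small sketch, here is a three-way grouping that pulls aces out into their own category. Plain integers stand in for card ranks, and rank_category is a hypothetical helper named only for this example.

```ruby
# Plain integers stand in for card ranks (1 = Ace, 11..13 = Jack, Queen, King).
ranks = [1, 1, 13, 13, 7, 3, 5, 13]

# rank_category is a made-up helper for this example.
def rank_category(rank)
  case rank
  when 1      then "ace"
  when 2..10  then "pip"
  when 11..13 then "face"
  else raise ArgumentError, "unknown rank: #{rank}"
  end
end

ranks.group_by { |rank| rank_category(rank) }
# => {"ace"=>[1, 1], "face"=>[13, 13, 13], "pip"=>[7, 3, 5]}
```

Moving the calculation into a named method keeps the block short once the criteria grow beyond a single ternary.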

Up to this point, we've been basing our groupings on attributes of the card objects. But with a little imagination, we can use group_by for other types of categorization.

If you watched episode #475, you might recall that in it we explored a couple of different ways of dealing cards to players.

With group_by, we can implement yet another form of card dealing.

First, we'll define a deal_progression.

For its value, we send the cycle message to the player list. But unlike in episode #475, we don't pass any block to it.

As a result, the value we get back is an Enumerator.

Just as a reminder, an Enumerator gives us an external, lazy iterator over a sequence. It gives us values out of the sequence as we ask for them, but not before. In this case, the sequence is an endless cycle through the list of players.

require "./deck"

PLAYER_LIST = %w[Groucho Chico Harpo]

deal_progression = PLAYER_LIST.cycle
# => #<Enumerator: ...>
deal_progression.next           # => "Groucho"
deal_progression.next           # => "Chico"
deal_progression.next           # => "Harpo"
deal_progression.next           # => "Groucho"
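
Here is a short sketch of a few more things we can do with an enumerator like this one. It uses a fresh enumerator (called turns here, a name picked just for this example) to show peek, rewind, and the difference between stepping with next and pulling a batch of values with take.

```ruby
players = %w[Groucho Chico Harpo]
turns   = players.cycle   # no block, so we get an Enumerator

turns.next      # => "Groucho"
turns.peek      # => "Chico"   (look ahead without advancing)
turns.next      # => "Chico"

# take iterates internally from the start of the sequence and
# does not move the external position used by next.
turns.take(5)   # => ["Groucho", "Chico", "Harpo", "Groucho", "Chico"]
turns.next      # => "Harpo"

turns.rewind    # reset the external position
turns.next      # => "Groucho"
```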

Next let's deal enough cards off the deck to give 3 cards to each player.

Finally, we will group this deal by the deal progression we defined a moment ago. So the value of the group_by block will first be Groucho, then Chico, then Harpo, then Groucho again, and so on.

Notice that in this block, unlike in earlier blocks, we are forgoing the block argument. For the purposes of this grouping scheme, the actual card doesn't matter. All that matters is which player we are currently dealing to.

Let's run this.

The outcome, as before, is a hash of string keys to arrays of cards. This time, the keys are players, and the arrays are the hands dealt to the players.

require "./deck"

PLAYER_LIST = %w[Groucho Chico Harpo]

deal_progression = PLAYER_LIST.cycle
deal             = DECK.take(3 * PLAYER_LIST.size)
deal.group_by{ deal_progression.next }
# => {"Groucho"=>[Ace of Clubs, King of Spades, 5 of Clubs],
#     "Chico"=>[Ace of Diamonds, 7 of Hearts, King of Hearts],
#     "Harpo"=>[King of Clubs, 3 of Clubs, 10 of Clubs]}
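
One thing worth keeping in mind is that the enumerator remembers its position between calls. The sketch below uses integers in place of cards and deliberately deals a number of cards that doesn't divide evenly among the players, so a second deal with the same enumerator starts with a different player. The last expression is an alternative (not from the episode) that derives the player from each card's position instead of relying on the enumerator's state.

```ruby
players = %w[Groucho Chico Harpo]
cards   = (1..7).to_a   # integers stand in for cards

order = players.cycle

first_deal = cards.group_by { order.next }
# => {"Groucho"=>[1, 4, 7], "Chico"=>[2, 5], "Harpo"=>[3, 6]}

# The same enumerator picks up where it left off, so this deal starts with Chico.
second_deal = cards.group_by { order.next }
# => {"Chico"=>[1, 4, 7], "Harpo"=>[2, 5], "Groucho"=>[3, 6]}

# A variant with no shared state: pick the player from each card's index.
cards.each_with_index
     .group_by { |_card, index| players[index % players.size] }
     .transform_values { |pairs| pairs.map(&:first) }
# => {"Groucho"=>[1, 4, 7], "Chico"=>[2, 5], "Harpo"=>[3, 6]}
```

Calling order.rewind, or building a fresh enumerator with players.cycle, before each deal is enough to make every deal start with the first player again.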

What I like about this implementation of card dealing is the separation of concerns. The deck, the list of players, the dealing order, and the cards to be dealt are all represented as discrete objects. In the end we combine them, and get the result we want in a very functional, side-effect-free style.

One last thing about grouping. Up until now we've been grouping arrays. But what about Ruby sequences other than arrays? Can we still group them?

Let's find out. Let's define a range of numbers from 0 to 12.

This time, our sequence is of the Range class that we learned about in episode #232.

Now let's group the numbers in the range by whether they are even or odd.

As a result, once again we get a hash of keys to arrays.

range = (0..12)                 # => 0..12
range.class                     # => Range

range.group_by{|n| n.even? ? "even" : "odd"}
# => {"even"=>[0, 2, 4, 6, 8, 10, 12], "odd"=>[1, 3, 5, 7, 9, 11]}

So we can see that the group_by message can be sent to any Enumerable object in Ruby, not just arrays.
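
To round that out, here is a quick sketch of group_by on a few other Enumerable objects: a Hash, a Set, and a small class of our own. The Countdown class and the stock data are made up for this example; the only requirement is that the class defines each and includes Enumerable.

```ruby
require "set"

# Hash: each element is a [key, value] pair
stock = { "apples" => 0, "pears" => 4, "plums" => 0 }
stock.group_by { |_name, count| count.zero? ? "sold out" : "in stock" }
# => {"sold out"=>[["apples", 0], ["plums", 0]], "in stock"=>[["pears", 4]]}

# Set
Set[1, 2, 3, 4].group_by(&:odd?)
# => {true=>[1, 3], false=>[2, 4]}

# Any class that defines each and includes Enumerable (Countdown is a made-up example)
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Yield @from, @from - 1, ... down to 1
  def each
    @from.downto(1) { |n| yield n }
  end
end

Countdown.new(5).group_by { |n| n > 2 ? "high" : "low" }
# => {"high"=>[5, 4, 3], "low"=>[2, 1]}
```

Whatever the source collection is, the values in the resulting hash are always plain arrays.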

And that's it for today. Happy hacking!
