
Guilds with Steve Klabnik

Get a sneak peek at the possible future of Ruby concurrency, with guest chef Steve Klabnik!

Video transcript & code

As multi-core processors become more and more ubiquitous, the ability to write concurrent code that can exploit those cores becomes ever more pressing. But if there's one truism about programming, it's that multi-threaded code is really, really hard to get right. Part of the difficulty is that in most programming languages, the APIs available for multi-threaded code are still very low-level, requiring programmers to manually manage the sharing of data among threads.

To make multi-threaded code less error-prone, the Ruby core team has been working on building higher-level concurrency abstractions into the language. Today, guest chef Steve Klabnik joins us to give you a sneak peek into these upcoming features.

You might know of Steve from his years of open-source contributions to Rails and various other Ruby projects. These days, he spends his time working on the Rust programming language. Which is fortunate for us, because Rust contains concurrency features which are very similar to the ones proposed for Ruby. Today, he'll show you what threads and channels in Rust can teach you about the future of multi-threaded programming in Ruby. Enjoy!


I love Ruby. I also love other languages. I firmly believe that being a polyglot helps broaden perspectives, and improves your code in all languages. Today, we're going to talk about a possible future feature in Ruby, and how it relates to a completely different language, Rust.

At RubyKaigi 2016, Koichi Sasada suggested a very interesting new feature for Ruby: guilds. Honestly, I love the name, even though it's not very descriptive of what it does. Guilds are a feature that will hopefully improve Ruby's concurrency story.

Rust is a programming language that's in many ways the complete opposite of Ruby: it's low-level, it's compiled, it has static types. That's also why I think it's a rich source of inspiration for Rubyists, as it approaches problems in a very different way than Ruby does. Thinking outside the box is good!

Let's talk about guilds! In order to understand guilds, we have to understand Ruby's concurrency and parallelism model as it exists today. Before we begin, I'd like to mention that when I say "Ruby", I really mean "MRI" here. Other implementations of Ruby can implement these things differently. I'm also going to say "concurrency" when I mean "concurrency and/or parallelism", for similar simplicity, even though they're different, but related, problems.

The Ruby virtual machine has a thing called the "global VM lock", or the GVL. This lock protects the VM's internal data structures from concurrency bugs. But to do so, it limits the ability of Ruby code to run in parallel. This is pretty common in languages similar to Ruby, such as Python.

So what's the problem that the GVL is solving here? Well, there's a saying programmers have about concurrency: "Shared mutable state is the root of all evil." Concurrency problems arise when two threads try to mutate a bit of memory at the same time. The classic situation is two threads incrementing a shared counter. Each one has to read the current value, add one, and then write the new value back to memory. The bug appears when thread one reads the current value, and then thread two reads it before thread one has written anything back. Now they both add one, and write the answer back. They'll have the same answer, and one of our +1s is lost.
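To make that interleaving concrete, here's a small Rust sketch that replays the unlucky schedule by hand on a single thread, so the result is the same every time. The function name `replay_lost_update` is made up for this illustration. Real racing threads would only hit this outcome some of the time, which is exactly what makes the bug hard to track down.

```rust
/// Replays the "both threads read before either writes" schedule by hand.
/// Hypothetical helper for illustration; no real threads are involved.
fn replay_lost_update(counter: &mut i32) {
    // Step 1: thread one reads the current value.
    let seen_by_thread_one = *counter;
    // Step 2: thread two reads the same, still unchanged, value.
    let seen_by_thread_two = *counter;
    // Step 3: thread one adds one and writes its answer back.
    *counter = seen_by_thread_one + 1;
    // Step 4: thread two does the same, overwriting thread one's write.
    *counter = seen_by_thread_two + 1;
}

fn main() {
    let mut counter = 0;
    replay_lost_update(&mut counter);
    // Two increments ran, but the counter only moved by one.
    println!("counter after two increments: {}", counter);
}
```

Starting from 0, two increments should leave the counter at 2, but this schedule leaves it at 1.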

If we truly want to unlock the potential of Ruby to be concurrent, we need to solve this problem. "Just remove the GVL" is a nice soundbite, but there are problems in practice that we won't go into here.

How does Rust handle this problem? Well, most languages choose to tackle this by focusing on the "mutability" aspect. If you've talked to someone who's a fan of Clojure, for example, they'll tell you that the solution to concurrency issues is something called "immutable data structures." This approach is very good, and has a long tradition in computer science. But Rust takes a different one. Rust focuses on the sharing part, rather than the mutability part. In short, Rust says "the solution isn't to make everything immutable, but be clear about how you're sharing things."

Let's look at this in code. Consider this example:


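use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // Keep the handle so we can wait for the thread before main exits.
    let handle = thread::spawn(move || {
        let value = rx.recv().unwrap();
        println!("got: {:?}", value);
    });

    let numbers = vec![1, 2, 3];

    tx.send(numbers).unwrap();

    // Without this, main could return before the thread gets to print.
    handle.join().unwrap();
}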

Here, we use a "channel" to communicate between the main thread and a new thread that we create. A channel has two ends: a sending end and a receiving end. The syntax is different from Ruby's, but the idea is familiar: thread::spawn is very close to Thread.new in Ruby. It takes a closure and runs it in a new thread. In Rust's case, we write the closure as || {} rather than as a do ... end block.

We send a vector, which is like a Ruby array, down the channel. The thread we've created waits until a value is sent down the channel, and then prints it out.

So this is the happy case. But what if something goes wrong? Let's change the program slightly:


let numbers = vec![1, 2, 3];

tx.send(numbers).unwrap();

println!("numbers: {:?}", numbers);

Here, we're trying to print out the vector of numbers after we've sent it down the channel. If this were Ruby, this would work just fine: the GC would make sure that the array would stay alive for the duration of both threads, and it'd just work. In Rust, however, we get an error at compile time:


error[E0382]: use of moved value: `numbers`
  --> src/main.rs:14:27
   |
12 | tx.send(numbers).unwrap();
   |         ------- value moved here
13 |
14 | println!("numbers: {:?}", numbers);
   |                           ^^^^^^^ value used here after move

Rust says "hey, you said you were sending that to another thread! You can't use it after you've sent it!" This is control over sharing: you can't give something away to two people at the same time.
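If the main thread genuinely needs the numbers after sending them, one option is to say so explicitly and send a copy instead. Here's a minimal sketch of that using `clone`. The helper name `send_copy_and_keep` is invented for this illustration, and it joins the thread so the result isn't lost when `main` returns.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: sends a copy of `numbers` to a new thread, then
/// hands the original back to the caller along with the thread's sum.
fn send_copy_and_keep(numbers: Vec<i32>) -> (Vec<i32>, i32) {
    let (tx, rx) = mpsc::channel();

    let handle = thread::spawn(move || {
        // The other thread owns its own copy of the vector.
        let received: Vec<i32> = rx.recv().unwrap();
        received.iter().sum::<i32>()
    });

    // The clone is moved into the channel; `numbers` stays with us.
    tx.send(numbers.clone()).unwrap();

    // Wait for the thread and collect the value its closure returned.
    let sum = handle.join().unwrap();
    (numbers, sum)
}

fn main() {
    let (numbers, sum) = send_copy_and_keep(vec![1, 2, 3]);
    println!("numbers: {:?}", numbers);
    println!("sum computed by the other thread: {}", sum);
}
```

The copy costs something, but nothing is shared by accident: each thread owns its own vector, and the compiler is happy.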

So what's this have to do with guilds? Guilds are a way to let Ruby know your intentions with sharing your objects, so that it can increase concurrency.

If you've ever studied Ruby's fibers, you might know that each Ruby thread can have many fibers. Well, each guild can contain one or many threads. Each object belongs to a single guild. Instead of the VM having one lock, each guild has its own lock. Now, threads from two different guilds can truly execute in parallel.

Let's take a look at the Rust example, but in Ruby with the proposed guild interface:


guild = Guild.new(script: %q{
  n = Guild.default_channel.receive
  puts "got: #{n}"
})
numbers = [1, 2, 3]
guild.transfer_membership(numbers)

Unlike Rust's threads, guilds have a built-in channel. We create a new guild with Guild.new, and pass it the script to run. I'm not sure why Koichi chose to use this style rather than a block, but remember, this is all a proposal and therefore subject to change!

If you try to use numbers after transfer_membership is called, Ruby will raise an exception. This mirrors the Rust example, with a trade-off: Ruby catches the error at runtime but is more succinct, while Rust catches it at compile time but is more verbose.

I'm very excited about the idea of guilds. While Koichi hasn't released his initial implementation yet, word on the street says it's very small, on the order of a couple hundred lines of code. I can't wait until it's ready for us all to give it a spin!