Magic Subshell

Long before Ruby was a preeminent web application programming language, Ruby was a scripting language for "gluing" other programs together in the tradition of Perl. As a "glue language", Ruby goes out of its way to make it easy to execute subprograms and little shell snippets. In fact, sometimes it tries so hard to make this kind of programming painless that things can get a little bit magical, surprising, and even dangerous, as we'll see today.

Back in episode #385, we discovered an annoyance about executing other programs from within Ruby. If we use the system method to execute the program, but we misspell the name, we don't get any kind of error message. Instead, we get a nil result.

If we know where to look, we can find out that the exit status was 127.

system "pigmentize -o out.html in.html" # => nil
$?.exitstatus                   # => 127

As we learned in that episode, 127 means "command not found".
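
A script can check for this situation itself. The following is a minimal sketch, assuming a POSIX system; the command name is a made-up placeholder chosen so that it should not exist on any machine.

```ruby
# A deliberately nonexistent command name (a placeholder for this sketch).
command = "no-such-command-for-this-demo"

result = system(command)  # Ruby prints nothing; the failure is silent
status = $?.exitstatus    # exit status recorded for the failed launch

if result.nil? && status == 127
  puts "#{command}: could not be run (exit status #{status})"
end
```

Checking for nil is what separates "the command never ran" from "the command ran and failed", which system reports as false.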

If we executed the same misspelled command at the terminal, we would get an error message.

$ pigmentize
No command 'pigmentize' found, did you mean:
 Command 'pygmentize' from package 'python-pygments' (main)
pigmentize: command not found

Looking at this, we might think: "Wait a sec. Maybe we didn't see an error message just because the message went to standard error instead of standard output?" To test this theory, we could add the magic shell gobbledygook for redirecting the standard error stream to standard output.

system "pigmentize -o out.html in.html 2>&1" # => false
# >> sh: 1: pigmentize: not found

Lo and behold, this works.

It seems like it has validated our theory.

In fact, though, this is a false positive. And if we continued forward with this incorrect understanding, it could send us down some blind alleys in the future. And yes, I speak from experience.

We can start to see that our understanding of the situation is false if we try a different redirection. According to our theory, if we instead redirect standard output to /dev/null, we should no longer see an error message, since we are no longer routing the standard error stream to standard output.

But in fact, we see the same error message.

system "pigmentize -o out.html in.html 1>/dev/null" # => false
# !> sh: 1: pigmentize: not found

So what is really going on here? To get a better understanding of what's actually happening, we can turn to the strace utility.

We're going to first save a version of the program that has the correct spelling and does no redirection.

system "pygmentize -o out.html in.html"

strace is a Linux utility that can show us the system calls a program makes as it is executing.

There are similar tools for other operating systems, like dtrace on Mac OS X.

We'll tell strace that we want to trace down through any subprocesses, and that we are only interested in seeing information about calls to the execve function. That's the system call that loads a new program into a process, so we'll see one whenever a subprocess starts running a command.

We also specify some other options to limit the amount of extra output we get from strace.

At the end of the command, we put the ruby command we want to trace.

When we execute this, we see a couple of lines of strace logging output.

First, we see a call to execve where the Ruby process is started.

Next, we see an execve that executes the pygmentize command.

$ strace -f -e execve -e signal= -qq ruby goodcmd.rb
execve("/home/avdi/.rubies/ruby-2.3.0/bin/ruby", ["ruby", "goodcmd.rb"], [/* 79 vars */]) = 0
syscall_318(0x7ffd73c0e4c0, 0x10, 0, 0x55ba5d0fdf50, 0x55ba5d08fc20, 0) = 0x10
syscall_318(0x7ffd73c0ee80, 0x10, 0, 0x55ba5d0ce348, 0, 0x55ba5c698760) = 0x10
[pid 16538] execve("/usr/bin/pygmentize", ["pygmentize", "-o", "out.html", "in.html"], [/* 79 vars */]) = 0

So far so good. Now let's change the script to include some redirection.

system "pygmentize -o out.html in.html 2>&1"

And now let's run the command under strace again.

$ strace -f -e execve -e signal= -qq ruby goodcmd-redir.rb
execve("/home/avdi/.rubies/ruby-2.3.0/bin/ruby", ["ruby", "goodcmd-redir.rb"], [/* 79 vars */]) = 0
syscall_318(0x7fff951e85f0, 0x10, 0, 0x560ef8264fa0, 0x560ef81f6c20, 0) = 0x10
syscall_318(0x7fff951e8fb0, 0x10, 0, 0x560ef8232370, 0, 0x560ef6138760) = 0x10
[pid  1735] execve("/bin/sh", ["sh", "-c", "pygmentize -o out.html in.html 2"...], [/* 79 vars */]) = 0
[pid  1736] execve("/usr/bin/pygmentize", ["pygmentize", "-o", "out.html", "in.html"], [/* 79 vars */]) = 0

This looks different! Instead of two execve calls, we have three! The first is the one starting Ruby. But the second starts a shell process, /bin/sh. We can see that the arguments to the shell process include the command we wrote.

The third execve is for the pygmentize command itself. This is a subcommand of the shell process started before it.

In total, we have a tree of three processes: the Ruby process, a subshell, and the pygmentize command. Before we added the redirection, there were only two processes.

So what's going on here?

What has happened is that we've run into one of Ruby's special "glue language" features. When we give Ruby an external command to execute, Ruby first takes a look at the command. If it looks like a simple command with arguments, Ruby executes it directly. But if Ruby sees any special shell syntax in the command, it does something different. It instead starts a subshell, so that the shell program can correctly interpret the shell syntax. That's why, when we added a shell redirect to the command, we suddenly started seeing two subprocesses instead of just one.
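
The difference also shows up in the return value of system, which gives a rough way to observe this behavior without strace. The sketch below assumes a POSIX system with /bin/sh, and the command name is a made-up placeholder that should not exist. When Ruby runs the command directly and cannot find it, the launch itself fails and system returns nil. When redirection syntax is present, Ruby launches the shell successfully; it is then the shell that fails to find the command and exits with a non-zero status, so system returns false. This matches the nil and false results we saw earlier, and it explains why the error messages we saw began with "sh:".

```ruby
# A deliberately nonexistent command name (a placeholder for this sketch).
missing = "no-such-command-for-this-demo"

# No shell syntax: Ruby tries to execute the command itself and fails.
direct = system(missing)

# Redirection present: Ruby starts a shell, and the shell reports failure.
# (stderr is discarded here only to keep the demo quiet.)
via_shell = system("#{missing} 2>/dev/null")

p direct     # nil
p via_shell  # false
```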

This feature can make life easier for converting system automation shell scripts to Ruby scripts, since it means that a lot of complex shell commands will "just work". But if we aren't expecting this behavior it can lead to confusion, bugs, and even security vulnerabilities.

Ruby doesn't look only for shell redirection syntax when deciding how to execute a subcommand. For instance, let's say we have a script that executes the cowsay command.

system "cowsay \"Want to make some money?\""

When we execute this, it does what we expect, assuming we know about the cowsay command.

$ ruby cowsay.rb
< Want to make some money? >
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Now let's make a small change to the script.

system "cowsay \"Want to make some $$$?\""

This time, the command doesn't give us the output we expect.

$ ruby cowsay-dollars.rb
< Want to make some 60810 >
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Why? Because Ruby saw something in the command that it interpreted as shell syntax. Specifically, "dollar dollar" is a shell special variable referring to the current process ID. And "dollar question" is the exit value of the last executed command, if any. By "helpfully" running this command inside a shell process, Ruby wound up breaking our script.
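
We can watch the shell perform these substitutions in a small sketch. It uses echo in place of cowsay (an assumption made only so that the example runs without cowsay installed) and captures the output with Open3.capture2 from Ruby's standard library. The Ruby string is single-quoted to make it obvious that Ruby itself leaves the dollar signs alone.

```ruby
require "open3"

# The double quotes and dollar signs are shell syntax, so Ruby hands the
# whole string to a shell, which expands $$ and $? before echo ever runs.
output, _status = Open3.capture2('echo "pid=$$ status=$?"')

puts output  # something like "pid=6081 status=0"; the PID varies per run
```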

In this case, the outcome is merely wrong and surprising. But we could easily imagine a scenario where this "feature" introduced serious security vulnerabilities. For instance, what if the cow's message was specified by an interpolated variable, and the variable's value came from an untrusted user?

message = "This doesn't seem safe"
system "cowsay \"#{message}\""

The user might specify a value that escapes the string and executes arbitrary commands!

message = "Bye bye!\"; rm cowsay-malicious.rb; echo \""
system "cowsay \"#{message}\""

When we execute this, the program gives us a cheeky message and then deletes itself.

$ ruby cowsay-malicious.rb
< Bye bye! >
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
$ ls cowsay-malicious.rb
ls: cannot access cowsay-malicious.rb: No such file or directory

There is one way to ensure that nothing like this ever happens. Let's look at how we can make this program safe.

Instead of passing a single string to system, we pass multiple arguments. The first is the command itself, and the rest are treated as flags or arguments to the command.

message = "Bye bye!\"; rm cowsay-safe.rb; echo \""
system("cowsay", message)

When we run this script, we can see that the whole message, including the special shell syntax intended to escape the string quotation, has been faithfully preserved in the output.

What we can see here is that when we break apart the command and its arguments into separate parameters to the system method, we override Ruby's usual "magical" detection. Instead, it faithfully executes the command, passing along any arguments as un-interpreted strings directly to the command. They never have an opportunity to be mis-interpreted by a shell, because no shell is ever involved.

We can confirm this with strace. There is no execve for /bin/sh this time around.

$ strace -f -e execve -e signal= -qq ruby cowsay-safe.rb
execve("/home/avdi/.rubies/ruby-2.3.0/bin/ruby", ["ruby", "cowsay-safe.rb"], [/* 79 vars */]) = 0
syscall_318(0x7ffe34f7a860, 0x10, 0, 0x55cfffc17eb0, 0x55cfffba9c20, 0) = 0x10
syscall_318(0x7ffe34f7b220, 0x10, 0, 0x55cfffbe6368, 0, 0x55cffda0a760) = 0x10
[pid 23066] execve("/usr/games/cowsay", ["cowsay", "Bye bye!\"; rm cowsay-safe.rb; ec"...], [/* 79 vars */]) = 0
< Bye bye!"; rm cowsay-safe.rb; echo " >
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
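
The same contrast can be reproduced without cowsay or strace. The sketch below substitutes echo for cowsay and a harmless "echo INJECTED" for the rm command (both substitutions are assumptions, made so the demo is safe and runnable anywhere with a POSIX shell), and captures the output with Open3.capture2.

```ruby
require "open3"

# Stand-in for untrusted input: it tries to break out of the quoted argument.
message = 'Bye bye!"; echo INJECTED; echo "'

# Single string: a shell parses it, so the injected command runs.
unsafe_output, _ = Open3.capture2("echo \"#{message}\"")

# Separate arguments: no shell, so echo receives the message verbatim.
safe_output, _ = Open3.capture2("echo", message)

puts unsafe_output
puts safe_output
```

In the first call, INJECTED appears on a line of its own because the shell ran it as a separate command. In the second, the entire message comes back as a single line of text.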

By the way, this technique—where we pass a list of arguments instead of a single string—works for most Ruby methods that start a subprocess, not just for the system method.
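
As a brief illustration, here is a sketch using three such methods from Ruby's core and standard library: IO.popen (which takes the command and its arguments as an array), Open3.capture2, and Process.spawn. As before, echo stands in for a real command, and the argument text is an arbitrary example.

```ruby
require "open3"

# Text that a shell would treat as a variable plus a second command.
tricky = "$HOME; echo INJECTED"

# IO.popen accepts an array of command plus arguments.
popen_output = IO.popen(["echo", tricky], &:read)

# Open3.capture2 accepts the arguments as separate parameters.
capture_output, _status = Open3.capture2("echo", tricky)

# Process.spawn does too; wait for the child so it doesn't linger.
pid = Process.spawn("echo", tricky)
Process.wait(pid)

puts popen_output
puts capture_output
```

In each case the text reaches echo untouched, dollar sign and semicolon included.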

Ruby tries very hard to "do what we mean". That extends to executing commands in a shell, when it seems like that's what we intended. Unfortunately, this can sometimes lead to surprising results. The more we know about Ruby's special rules, the more we can take advantage of them without being taken unawares. Happy hacking!