Subprocesses Part 10: Open Pipe

Have you ever used the shell to pipe output from one program into another? In this episode, you’ll learn how to take control of this process from inside the source program. You’ll see how a surprising trick of Ruby’s open() call can open a subprocess as if it were a writable file, and how to use that subprocess as a replacement standard-output stream.

Video transcript & code

If you've ever used the git command-line tools, you know they have a useful behavior: if you run a command that would print more than a screen's worth of output, they automatically pipe themselves into a pager so you can read a screen at a time.

In order to do this, git automatically senses whether it is being used at the console. If it's being run as a headless command or as part of a pipeline, it skips the auto-paging.

On RubyTapas, we've explored how to mimic this behavior in Ruby. In Episode #105, we saw that we could use the $stdout.tty? predicate to sense whether our program was being run from an interactive terminal.
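As a quick refresher, here is a minimal sketch of that check. The helper name interactive? is our own invention for illustration; the only real API involved is IO#tty?. A StringIO stands in for output that has been redirected away from the terminal.

```ruby
require "stringio"

# Hypothetical helper: true only when the stream is an interactive terminal.
def interactive?(stream = $stdout)
  stream.tty?
end

# A StringIO is never a terminal, so it stands in for redirected output.
interactive?(StringIO.new) # => false
```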

In this example code, we're simulating some voluminous self-documentation output using the "Faker" gem, which we then word-wrap using the "Lovely Rufus" gem from Episode #355.

In order to show this output a screenful at a time when run in an interactive terminal, we're using the same technique we used in Episode #105: we re-execute the program, but with a pipe into the more program appended to the command. (If you're not clear on the exec method used here, check out Episode #457.)

require "faker"
require "lovely_rufus"

if ARGV.include?("--help")
  if $stdout.tty?
    exec "ruby #{$0} --help | more"
  end
  text = Faker::Lorem.paragraphs(30).join("\n\n")
  puts LovelyRufus::TextWrapper.wrap(text, width: 67)
end

If we run this at the command line, we can see that it all comes together into paged output.

It works fine, but it seems a little clunky to have to re-execute the whole program just to pipe output into a pager. And while in this example we just hardcoded the command-line arguments for the second invocation, what about situations where we want to auto-paginate any possible output the program might have, no matter how the user invokes it?

Ideally, instead of re-executing this program, we'd just open up an output stream into the pager program and use that as our standard out instead of the default standard out. But how can we do this?

To answer that question, let's do some exploring.

Ordinarily, we use the open method to open up files on the disk for reading or writing.

But open has a trick up its sleeve. If we put a pipe (|) in front of the filename, it behaves a little differently.

Let's say we want to invoke the PHP language from inside Ruby. We put the name of the php executable after the pipe. We tell open we want to both write to and read from PHP. And we assign the resulting object to a variable.

The object we get back is an IO object. But this IO object doesn't represent an ordinary file. Instead, it represents a subprocess that Ruby has started for us.

Let's write some valid PHP to our subprocess.

Then let's read back the output.

We're going to run this code at the command line first, and you'll see why in a moment.

php = open("|php", "r+")
# => #<IO:fd 6>
php.write <<EOF
One plus one is: <?php print 1 + 1; ?>
EOF
php.read
# (hangs here; once the write end is closed, this returns "One plus one is: 2")

Because when we execute it, the process hangs without completing.

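This is because, at least on the Linux platform, a read from a pipe blocks until the other end has been closed. PHP is still waiting for more input from us, so it never finishes, and our read never sees the end of its output.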

Let's modify the code to close the write end of the pipe, with a call to close_write, before reading.

When we run this, we can see we've read back the result of PHP's text processing.
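As a sketch of that fix, here is the write, close, read sequence wrapped in a helper. The name filter_through is our own invention. To keep the example runnable without PHP installed, the child process is the current Ruby interpreter (located via RbConfig.ruby) rather than php. It also uses IO.popen, which opens the same kind of pipe as open with a leading |; as far as we know, Ruby 3.3 and later mark the | form of open as deprecated, so IO.popen is likely the safer spelling in new code.

```ruby
require "rbconfig"

# Hypothetical helper: send +input+ through +command+ and return its output.
def filter_through(command, input)
  IO.popen(command, "r+") do |io|
    io.write(input)
    io.close_write # signal end-of-input so the child can finish
    io.read        # now safe: the child will exit and close its output
  end
end

# Stand-in for php: a child Ruby process that upcases whatever it is sent.
upcaser = [RbConfig.ruby, "-e", "print $stdin.read.upcase"]
filter_through(upcaser, "one plus one is: 2")
# => "ONE PLUS ONE IS: 2"
```

Without the close_write call, this helper would hang in the same way as the PHP example above.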

Now, what if instead of reading back the output, we just wanted it to go straight to the standard output stream? In that case, we can open the subprocess in write-mode instead of read-write mode.

php = open("|php", "w")
php.write <<EOF
One plus one is: <?php print 1 + 1; ?>
EOF

Once again we run this at the command line. And we see something weird… we see that the PHP output is inserted after the next prompt! What happened here?

Well, since we once again omitted any call to close the input end of the pipe, the operating system had to close it for us after the process finished. Which, in this case, caused the PHP output to be belatedly flushed after the next command prompt had already been printed.

At least, that's what we see on this particular machine. On other platforms, this problem can cause the output to be lost entirely.

Once again, we need to remember to close the pipe in order for it to be flushed in a timely fashion.

php = open("|php", "w")
php.write <<EOF
One plus one is: <?php print 1 + 1; ?>
EOF
php.close

# >> One plus one is: 2

This time, we see our PHP output show up where it's supposed to. We wrote text into a PHP process, and the result of that processing was written directly to standard out.
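If remembering to call close feels error-prone, one option is the block form of IO.popen, which closes the pipe and waits for the child when the block ends. This is only a sketch: pipe_to is a hypothetical helper name, and the child is again the current Ruby interpreter standing in for php, so that the example runs without PHP installed.

```ruby
require "rbconfig"

# Hypothetical helper: run +command+ with a write-only pipe to its stdin.
# The block form of IO.popen closes the pipe and waits for the child
# when the block returns, even if the block raises.
def pipe_to(command)
  IO.popen(command, "w") { |io| yield io }
  $? # Process::Status of the child that just finished
end

# Stand-in for php: a child Ruby that upcases its input onto our stdout.
upcaser = [RbConfig.ruby, "-e", "print $stdin.read.upcase"]
status = pipe_to(upcaser) { |io| io.puts "one plus one is: 2" }
status.success? # => true
```

Because the child has already exited by the time pipe_to returns, its output cannot show up late, after the next prompt.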

OK, back to our problem of auto-paginating output.

Now we know a trick for opening up a helper program as an output stream. Let's replace the self-exec code with a line that creates a new pager IO object, in write-mode.

Remember, in our previous PHP example we saw that we needed to ensure the subprocess was closed in order to see our output. The same is true here, except in this case we want to make sure that the IO object gets closed no matter what happens later on in the program. In order to ensure this happens, we'll use at_exit to set up a Ruby handler that automatically closes the pager when our program exits.

Now all we have to do is make this pager object the new global standard output object.

From this point onward, all of the program's output will be automatically piped through the pager program, with no extra code needed.

require "faker"
require "lovely_rufus"

if ARGV[0] == "--help"
  if $stdout.tty?
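    # Start the pager as a subprocess whose input we can write to.
    pager = open("|more", "w")
    # Closing the pipe at exit flushes it and lets the pager finish.
    at_exit { pager.close }
    # Route all subsequent output through the pager.
    $stdout = pager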
  end
  text = Faker::Lorem.paragraphs(30).join("\n\n")
  puts LovelyRufus::TextWrapper.wrap(text, width: 67)
end

We can confirm this by running the program at the console.
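If we wanted to reuse this trick outside of the --help branch, one way to package it might look like the sketch below. Everything here beyond IO.popen and tty? is an assumption on our part: the with_pager name, the out: parameter, and the choice to honor a PAGER environment variable before falling back to more.

```ruby
require "stringio"

# Hypothetical helper: run the block with $stdout routed through a pager,
# but only when +out+ is an interactive terminal.
# Assumption: the PAGER environment variable, if set, names the pager.
def with_pager(command = ENV.fetch("PAGER", "more"), out: $stdout)
  return yield unless out.tty?

  IO.popen(command, "w") do |pager|
    original = $stdout
    $stdout = pager
    begin
      yield
    ensure
      $stdout = original # restore before the pipe is closed
    end
  end
end

with_pager { puts "Imagine thirty paragraphs of help text here." }
```

One caveat: if the reader quits the pager early, later writes can raise Errno::EPIPE, so a real program would probably want to rescue that.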

By the way, in case you're wondering if this is one of those fancy IO tricks that only works on certain UNIX-like operating systems: I originally wrote and tested all the code you see here on a Windows system. That's one of the reasons I like this solution: re-executing subshells can be prone to platform portability bugs, whereas this ability to open a pipe to a subprocess is built into Ruby and works across platforms.

And that's all for today. Happy hacking!
