Subprocesses Part 2: Command Input Operator

In the first episode of this series, we determined that for running simple one-shot subprocesses, the Kernel#system method was the best tool for the job. If we don't care about a command's output, system is ideal.

But what about when we need to capture a command's output?

For instance, the mediainfo command is a utility for extracting detailed metadata from audio and video files.

$ mediainfo "412 autovivification.mp4"
General
Complete name                            : 412 autovivification.mp4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 2
Codec ID                                 : mp42 (mp42/mp41)
File size                                : 22.8 MiB
Duration                                 : 5mn 3s
Overall bit rate mode                    : Variable
Overall bit rate                         : 631 Kbps
Encoded date                             : UTC 2016-04-26 13:18:17
Tagged date                              : UTC 2016-04-26 13:18:20
©TIM                                     : 00;00;00;00
©TSC                                     : 30000
©TSZ                                     : 1001

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : Main@L3.2
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 3 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 5mn 3s
Bit rate                                 : 437 Kbps
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Frame rate mode                          : Constant
Frame rate                               : 29.970 (30000/1001) fps
Standard                                 : NTSC
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.016
Stream size                              : 15.8 MiB (69%)
Language                                 : English
Encoded date                             : UTC 2016-04-26 13:18:17
Tagged date                              : UTC 2016-04-26 13:18:17
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709

Audio
ID                                       : 2
Format                                   : AAC
Format/Info                              : Advanced Audio Codec
Format profile                           : LC
Codec ID                                 : 40
Duration                                 : 5mn 3s
Source duration                          : 5mn 3s
Bit rate mode                            : Variable
Bit rate                                 : 192 Kbps
Maximum bit rate                         : 257 Kbps
Channel(s)                               : 2 channels
Channel positions                        : Front: L R
Sampling rate                            : 48.0 KHz
Frame rate                               : 46.875 fps (1024 spf)
Compression mode                         : Lossy
Stream size                              : 6.84 MiB (30%)
Source stream size                       : 6.84 MiB (30%)
Language                                 : English
Encoded date                             : UTC 2016-04-26 13:18:17
Tagged date                              : UTC 2016-04-26 13:18:17

Let's say we'd like to run this command from within a Ruby script, and capture the output into a variable.

The easiest way to do this is to use the command input operator, AKA the backtick operator.

When we dump the content of the variable, we can see that we successfully captured the whole output of the command into a string.

video_info = `mediainfo "412 autovivification.mp4"`
puts video_info.lines

# >> General
# >> Complete name                            : 412 autovivification.mp4
# >> Format                                   : MPEG-4
# >> Format profile                           : Base Media / Version 2
# >> Codec ID                                 : mp42 (mp42/mp41)
# >> File size                                : 22.8 MiB
# >> Duration                                 : 5mn 3s
# >> Overall bit rate mode                    : Variable
# >> Overall bit rate                         : 631 Kbps
# >> Encoded date                             : UTC 2016-04-26 13:18:17
# >> Tagged date                              : UTC 2016-04-26 13:18:20
# >> ©TIM                                     : 00;00;00;00
# >> ©TSC                                     : 30000
# >> ©TSZ                                     : 1001
# >>
# >> Video
# >> ID                                       : 1
# >> Format                                   : AVC
# >> Format/Info                              : Advanced Video Codec
# >> Format profile                           : Main@L3.2
# >> Format settings, CABAC                   : Yes
# >> Format settings, ReFrames                : 3 frames
# >> Codec ID                                 : avc1
# >> Codec ID/Info                            : Advanced Video Coding
# >> Duration                                 : 5mn 3s
# >> Bit rate                                 : 437 Kbps
# >> Width                                    : 1 280 pixels
# >> Height                                   : 720 pixels
# >> Display aspect ratio                     : 16:9
# >> Frame rate mode                          : Constant
# >> Frame rate                               : 29.970 (30000/1001) fps
# >> Standard                                 : NTSC
# >> Color space                              : YUV
# >> Chroma subsampling                       : 4:2:0
# >> Bit depth                                : 8 bits
# >> Scan type                                : Progressive
# >> Bits/(Pixel*Frame)                       : 0.016
# >> Stream size                              : 15.8 MiB (69%)
# >> Language                                 : English
# >> Encoded date                             : UTC 2016-04-26 13:18:17
# >> Tagged date                              : UTC 2016-04-26 13:18:17
# >> Color range                              : Limited
# >> Color primaries                          : BT.709
# >> Transfer characteristics                 : BT.709
# >> Matrix coefficients                      : BT.709
# >>
# >> Audio
# >> ID                                       : 2
# >> Format                                   : AAC
# >> Format/Info                              : Advanced Audio Codec
# >> Format profile                           : LC
# >> Codec ID                                 : 40
# >> Duration                                 : 5mn 3s
# >> Source duration                          : 5mn 3s
# >> Bit rate mode                            : Variable
# >> Bit rate                                 : 192 Kbps
# >> Maximum bit rate                         : 257 Kbps
# >> Channel(s)                               : 2 channels
# >> Channel positions                        : Front: L R
# >> Sampling rate                            : 48.0 KHz
# >> Frame rate                               : 46.875 fps (1024 spf)
# >> Compression mode                         : Lossy
# >> Stream size                              : 6.84 MiB (30%)
# >> Source stream size                       : 6.84 MiB (30%)
# >> Language                                 : English
# >> Encoded date                             : UTC 2016-04-26 13:18:17
# >> Tagged date                              : UTC 2016-04-26 13:18:17
# >>
# >>

One thing you might find confusing is that I keep calling this an operator when, clearly, it is a form of quoting.

The truth is, just as Shimmer is both a floor polish and a dessert topping, the backticks are a weird mash-up of both an operator and a quoting syntax. Technically, there really is a backtick operator, and it is defined on Kernel. To prove this, we can run a command using full method-call syntax.

now = Kernel.`("date")
now                            # => "Mon May  2 13:09:32 EDT 2016\n"

Whenever we surround a string in backticks, Ruby interprets that as an invocation of the one-argument backtick operator.

now = `date`
now                             # => "Mon May  2 13:11:07 EDT 2016\n"

If you're a longtime viewer of the show, you might recall that in episode #30 we had some fun with redefining the backtick operator.
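As a small sketch of what redefining the operator can look like (this is not the code from that episode, and the DryRun class is a hypothetical name made up for illustration), we can define a backtick method on a class of our own. Backtick literals evaluated inside that class then dispatch to our method instead of the one on Kernel.

```ruby
# Hypothetical class for illustration only. It overrides the backtick
# method so that "running" a command returns a description of it
# instead of starting a subprocess.
class DryRun
  def `(command)
    "would run: #{command}"
  end

  def today
    # This backtick literal calls DryRun#` rather than Kernel#`.
    `date`
  end
end

puts DryRun.new.today
# >> would run: date
```

Because the literal is just a method call, no date process is ever started here.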

Let's fill in our understanding of this peculiar operator.

First off, backticks behave like double quotes in the sense that we can interpolate other Ruby values into them. For instance, instead of hardcoding the filename, we could interpolate in the video we want to know about.

filename = "412 autovivification.mp4"
video_info = `mediainfo "#{filename}"`
puts video_info.lines[0..2]


# >> General
# >> Complete name                            : 412 autovivification.mp4
# >> Format                                   : MPEG-4

Second, like all other forms of quoting in Ruby, if we don't want to use literal backticks, we don't have to.

For instance, let's say we wanted to find out what dynamic libraries the mediainfo tool uses. To do this, we might use shell backquotes to interpolate in the output of the which command, and then feed that into the ldd command.

But this is neither readable nor valid Ruby syntax.

bin_info = `ldd `which mediainfo``

As a substitute, we can use the prefix %x to invoke the command input operator with our own choice of delimiters. Here, we've chosen parentheses.

bin_info = %x(ldd `which mediainfo`)
puts bin_info.lines[0..2]
# >>    linux-vdso.so.1 =>  (0x00007ffea6960000)
# >>    libmediainfo.so.0 => /usr/lib/x86_64-linux-gnu/libmediainfo.so.0 (0x00007ff399336000)
# >>    libzen.so.0 => /usr/lib/x86_64-linux-gnu/libzen.so.0 (0x00007ff3990f3000)

This example also reveals another important tidbit about the command input operator: just like the system method, if it sees any special shell syntax it runs the command in the context of a shell. We learned more about this automatic shell detection in episode #389.
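Here is a minimal sketch of both points, assuming a Unix-like system where the echo and tr commands are available. The variable names are ours, chosen for illustration. A pipe is shell syntax, so the first command only works because a shell gets involved. The last two lines show that other delimiter pairs and interpolation work with %x as well.

```ruby
# The pipe is shell syntax, so Ruby hands the whole string to a shell.
shouted = %x(echo hello | tr a-z A-Z)
shouted                         # => "HELLO\n"

# Other paired delimiters work too, and so does interpolation.
word = "hello"
braces   = %x{echo #{word}}
brackets = %x[echo #{word}]
braces                          # => "hello\n"
brackets                        # => "hello\n"
```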

The command input operator is wonderfully convenient. That said, it has some drawbacks.

First, it's easy to miss. Backticks just don't stand out that obviously, especially when they are mixed in with a lot of other single and double quotes.

Second, unlike with other Ruby shell subprocess commands, there is no way to suppress Ruby's magic shell detection. If the command looks like it might have special shell syntax in it, Ruby will run it through a shell intermediary. And as we learned in episode #389, this has the potential for both confusion and security vulnerabilities.
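Since the shell can't be switched off, one way to reduce the risk is to escape any value before interpolating it. The following is a sketch rather than a complete defense. It uses Shellwords.escape from Ruby's standard library, assumes a Unix-like system with an echo command, and the hostile filename is invented for the example.

```ruby
require "shellwords"

# A hypothetical hostile filename. Interpolated as-is, the semicolon
# would end the echo command and start a second one.
filename = "412 autovivification.mp4; echo oops"

# Shellwords.escape backslash-escapes characters the shell treats specially.
safe_name = Shellwords.escape(filename)

listing = `echo #{safe_name}`
puts listing
# >> 412 autovivification.mp4; echo oops
```

The whole string arrives at echo as a single argument, so the second echo never runs as a command of its own.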

Third, the command input operator only captures standard output, not standard error.

As an example, let's query a video file for metadata using a different command, called avprobe.

filename = "412 autovivification.mp4"
video_info = `avprobe "#{filename}"`

# !> ffprobe version 2.8.6-1ubuntu2 Copyright (c) 2007-2016 the FFmpeg developers
# !>   built with gcc 5.3.1 (Ubuntu 5.3.1-11ubuntu1) 20160311
# !>   configuration: --prefix=/usr --extra-version=1ubuntu2 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-opengl --enable-x11grab --enable-libdc1394 --enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 --enable-libopencv
# !>   libavutil      54. 31.100 / 54. 31.100
# !>   libavcodec     56. 60.100 / 56. 60.100
# !>   libavformat    56. 40.101 / 56. 40.101
# !>   libavdevice    56.  4.100 / 56.  4.100
# !>   libavfilter     5. 40.101 /  5. 40.101
# !>   libavresample   2.  1.  0 /  2.  1.  0
# !>   libswscale      3.  1.101 /  3.  1.101
# !>   libswresample   1.  2.101 /  1.  2.101
# !>   libpostproc    53.  3.100 / 53.  3.100
# !> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '412 autovivification.mp4':
# !>   Metadata:
# !>     major_brand     : mp42
# !>     minor_version   : 0
# !>     compatible_brands: mp42mp41
# !>     creation_time   : 2016-04-26 13:18:17
# !>   Duration: 00:05:03.19, start: 0.000000, bitrate: 631 kb/s
# !>     Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1280x720, 436 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
# !>     Metadata:
# !>       creation_time   : 2016-04-26 13:18:17
# !>       handler_name    : Alias Data Handler
# !>       encoder         : AVC Coding
# !>     Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 189 kb/s (default)
# !>     Metadata:
# !>       creation_time   : 2016-04-26 13:18:17
# !>       handler_name    : Alias Data Handler

The output of this command was supposed to be stuffed into the video_info variable. But instead, it's been dumped to the screen. Why? Because as it turns out, avprobe outputs to the standard error stream instead of the standard output stream.

In order to capture the output of a command like this, we have to use shell redirection to reroute standard error into standard output.

This time we find our output where we expected, inside the video_info variable.

filename = "412 autovivification.mp4"
video_info = `avprobe "#{filename}" 2>&1`
puts video_info.lines[12..14]

# >> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '412 autovivification.mp4':
# >>   Metadata:
# >>     major_brand     : mp42
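If avprobe isn't installed, the same behavior can be reproduced with a stand-in. This sketch assumes a Unix-like system with sh available. The command string is a made-up example that writes only to standard error.

```ruby
# A stand-in command (hypothetical) that writes only to standard error.
command = %q(sh -c 'echo "problem report" >&2')

without_redirect = `#{command}`        # the message bypasses the capture
with_redirect    = `#{command} 2>&1`   # the message is captured

without_redirect   # => ""
with_redirect      # => "problem report\n"
```

Without the redirection the message still shows up on the terminal, but the variable holds only an empty string.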

Fourth, because it behaves like a quoting syntax, the command input operator can only ever receive a single string argument. That means there is no way to customize it, the way we can with nearly every other Ruby subprocess-starting method. We can't pass extra environment variables, or chunk arguments into an array, or pass a hash of special options the way we can with a method like Kernel.system.
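For contrast, here is a rough sketch of the kind of customization system accepts and backticks can't. It assumes a Unix-like system with the printenv utility. The GREETING variable is a hypothetical name used only for this example.

```ruby
succeeded = system(
  { "GREETING" => "hello" },   # extra environment variables for the child
  "printenv", "GREETING",      # command and argument as separate strings
  err: File::NULL              # options hash: discard the child's standard error
)
succeeded   # => true

# >> hello
```

Note that system only reports success or failure. The child's output goes straight to our standard output rather than into a variable.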

A final caveat to keep in mind is that, like Kernel.system, the command input operator is a blocking call that waits for the command to finish before returning. If the command might take a long time, or if it might output very large quantities of text, we may want to consider using one of Ruby's more industrial-strength command output capturing facilities. We'll cover some of those in upcoming episodes.

But if all you need to do is grab a little output from a shell command, the command input operator is a quick and easy way to do it. Happy hacking!
