Unit 1, Lesson 21

Special Variables Part 4: Speak English!

In this series you’ve learned about Ruby’s terse, Perl-style special variables for program state, I/O, and regular expression matching. You’ve seen how these variables can be handy in one-off command line scripts, as well as how they can render code unreadable. Today, you’ll learn about a way to use all of these variables without sacrificing legibility.

Video transcript & code

Over the last few episodes we've gone on a tour of Ruby's special "Perl-style" global and pseudo-global variables.

We've seen variables for the overall process status, like the program name and process ID.

$0                              # => "xmptmp-in11924IXK.rb"
$$                              # => 8680

We've seen variables related to input and output of structured text data, like the field separator.

ingredients = "water, barley, yeast, hops"
$; = ", "
ingredients.split
# => ["water", "barley", "yeast", "hops"]

And we've seen special variables set by regular expression matches.

pattern = /\((\d{3})\) (\d{3})-(\d{4})/
text    = "Call me! (555) 867-5309 (Jenny)"

pattern =~ text

$1                              # => "555"
$2                              # => "867"
$3                              # => "5309"
$&                              # => "(555) 867-5309"
$~
# => #<MatchData "(555) 867-5309" 1:"555" 2:"867" 3:"5309">
$`                              # => "Call me! "
$'                              # => " (Jenny)"

Some of these funny-looking little variables have strongly mnemonic names. For instance, the pre-match and post-match string variables use the back-quote and quote characters, suggesting that they bracket the last regular expression match in the same way quotes bracket a string.
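To make the "bracketing" mnemonic concrete, here is a minimal sketch that reuses the phone number text from above. The local variable names are only for illustration. It shows that the pre-match, the match, and the post-match together rebuild the original string.

```ruby
text = "Call me! (555) 867-5309 (Jenny)"
text =~ /\(\d{3}\) \d{3}-\d{4}/

before = $`   # everything to the left of the match
match  = $&   # the matched text itself
after  = $'   # everything to the right of the match

# The three pieces sit side by side, like quotes around a string.
reassembled = before + match + after
reassembled == text             # => true
```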

But mnemonics aside, there's no denying that filling a program with these variables can quickly lead to terse but unreadable code. Consider the one-liner for extracting names from a mailing list that we came up with in Episode #492:

ruby -p -a -e 'BEGIN { $; = ","; $\ = "\n" }; $_ = $F[0];' < list.csv

This code isn't exactly intention-revealing. If we continued this style into a full-length script, we'd have the "write-only code" that Perl is infamous for. Which is why a lot of Ruby style guides ban the use of these shorthand variables.

But some of these special variables are useful, and a few are downright essential! So, what's the alternative?

Well, a few of these variables have built-in aliases which are longer and more readable.

For instance, the program name variable has an alias.

$0                              # => "xmptmp-in11924kCB.rb"
$PROGRAM_NAME                   # => "xmptmp-in11924kCB.rb"

And so does the load path.

$:.last(3)
# => ["C:/tools/ruby23/lib/ruby/vendor_ruby",
#     "C:/tools/ruby23/lib/ruby/2.3.0",
#     "C:/tools/ruby23/lib/ruby/2.3.0/x64-mingw32"]
$LOAD_PATH.last(3)
# => ["C:/tools/ruby23/lib/ruby/vendor_ruby",
#     "C:/tools/ruby23/lib/ruby/2.3.0",
#     "C:/tools/ruby23/lib/ruby/2.3.0/x64-mingw32"]
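As a quick sanity check, we can confirm that each long name refers to the very same value as its short form. This is only a sketch. The local variable names are made up for illustration, and $LOADED_FEATURES is a third built-in pair that hasn't come up yet in this episode.

```ruby
# None of these need a library to be required first.
same_program_name = ($0 == $PROGRAM_NAME)
same_load_path    = $:.equal?($LOAD_PATH)        # the same Array object
same_features     = $".equal?($LOADED_FEATURES)  # the list of required files

same_program_name               # => true
same_load_path                  # => true
same_features                   # => true
```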

But only a few of these more readable variable aliases exist. Well, out of the box anyway.

Fortunately, Ruby ships with a library to remedy this problem.

When we require the English library (being careful to spell it properly, with a capital "E") …

…we get more readable aliases for all of Ruby's special variables.

For instance, the process ID:

require "English"
$$                              # => 18576
$PROCESS_ID                     # => 18576

The currently active exception:

require "English"
begin
  fail "Oh no!"
rescue
  $!                            # => #<RuntimeError: Oh no!>
  $ERROR_INFO                   # => #<RuntimeError: Oh no!>
end

The default field separator:

require "English"
$; = ", "
$FIELD_SEPARATOR          # => ", "

And the last regular expression match string:

require "English"

pattern = /\((\d{3})\) (\d{3})-(\d{4})/
text    = "Call me! (555) 867-5309 (Jenny)"

pattern =~ text

$&                            # => "(555) 867-5309"
$MATCH                        # => "(555) 867-5309"
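The other match-related variables from earlier have English names too. Here is a small sketch using the same pattern and text. The local variable names are just for illustration.

```ruby
require "English"

pattern = /\((\d{3})\) (\d{3})-(\d{4})/
text    = "Call me! (555) 867-5309 (Jenny)"

pattern =~ text

prematch   = $PREMATCH          # same as $`
postmatch  = $POSTMATCH         # same as $'
last_group = $LAST_PAREN_MATCH  # same as $+, the last group that matched
match_data = $LAST_MATCH_INFO   # same as $~, the full MatchData object

prematch                        # => "Call me! "
postmatch                       # => " (Jenny)"
last_group                      # => "5309"
```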

But how do we find out which variables are aliased to what? Well, one way would be to go to the online documentation for the English module. But an even better way is to simply read the code!

To do that, we'll use some code swiped from Episode #235 to find out where the library is located on our system.

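def where(feature)
  # Build one glob pattern per load path entry, e.g. ".../English.*"
  globs = $LOAD_PATH.map { |base_dir|
    File.expand_path("#{feature}.*", base_dir)
  }
  # Return the first file on disk that matches any of the patterns
  Dir.glob(globs).first
end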

where("English")
# => "C:/tools/ruby23/lib/ruby/2.3.0/English.rb"

Then we'll open it in our editor.

What we can see here is that the file starts off with a handy table of all of the variable aliases defined in the English library.

Below this, we can see the actual variable alias definitions. These use the same alias keyword you might have seen or even used to define method aliases. This is another example of Ruby's policy of, depending on how you look at it, either having no magic or making all of the magic available to users. There's no special C-level hackery involved in creating these variable aliases.
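Since this is plain Ruby, we can define global variable aliases of our own in the same way. The following is a sketch only. The names $CURRENT_PID, $greeting, and $salutation are invented for this example and are not part of the English library.

```ruby
# Alias a special variable under a hypothetical longer name.
alias $CURRENT_PID $$

$CURRENT_PID == Process.pid     # => true

# An alias is a second name for the same variable, not a copy,
# so assigning through either name affects both.
$greeting = "hello"
alias $salutation $greeting

$salutation = "good day"
$greeting                       # => "good day"
```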

What we can also see here is that this file serves as an excellent source of documentation on the purpose of the various special variables. If you want to see a complete list of all of Ruby's special Perl-style variables, along with their English alias and documentation of what they are for, this library source code is the best place to find it.
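If we'd rather not open an editor, we can pull the alias definitions out of the file programmatically. This is a rough sketch, and it rests on two assumptions: that the loaded file is named English.rb, and that each definition sits on its own line starting with the alias keyword, as it does in the copies I've looked at. The variable names are made up for this example.

```ruby
require "English"

# Find the file that `require "English"` actually loaded.
english_path = $LOADED_FEATURES.find { |path|
  File.basename(path) == "English.rb"
}

# Collect each `alias $LONG_NAME $x` line into a hash of
# English name => Perl-style name.
aliases = File.readlines(english_path).each_with_object({}) do |line, table|
  found = line.match(/\Aalias\s+(\$\S+)\s+(\$\S+)/)
  table[found[1]] = found[2] if found
end

aliases["$ERROR_INFO"]          # the Perl-style name behind $ERROR_INFO
aliases.size                    # how many aliases the library defines
```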

Now that we know about the English library, let's revisit the code we started out with in Episode #491.

require "yaml"                  # provides Hash#to_yaml

module CrashLogger
  def self.log_crash_info(error=$!)
    program_name = $0
    process_id   = $$
    timestamp    = Time.now.utc.strftime("%Y%m%d-%H%M%S")
    filename     = "crash-#{program_name}-#{process_id}-#{timestamp}.yml"

    error_info = {}
    error_info["error"]         = error
    error_info["stacktrace"]    = error.backtrace
    error_info["environment"]   = ENV.to_h

    File.write(filename, error_info.to_yaml)
    filename
  end
  # ...
end

If we require the English library, we can clarify this crash logger module by expanding Perl-style variables to their English equivalents.

$! becomes $ERROR_INFO

$0 becomes $PROGRAM_NAME (which we technically don't need the English library for).

And $$ becomes $PROCESS_ID

require "English"               # provides $ERROR_INFO and $PROCESS_ID
require "yaml"                  # provides Hash#to_yaml

module CrashLogger
  def self.log_crash_info(error=$ERROR_INFO)
    program_name = $PROGRAM_NAME
    process_id   = $PROCESS_ID
    timestamp    = Time.now.utc.strftime("%Y%m%d-%H%M%S")
    filename     = "crash-#{program_name}-#{process_id}-#{timestamp}.yml"

    error_info = {}
    error_info["error"]         = error
    error_info["stacktrace"]    = error.backtrace
    error_info["environment"]   = ENV.to_h

    File.write(filename, error_info.to_yaml)
    filename
  end
  # ...
end

The result is a method that describes its intent a little better.
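One detail worth spelling out is how the default argument behaves. It is evaluated each time the method is called, so it picks up whatever exception is active at that moment. Here is a small sketch of that behavior. The describe_error method is a hypothetical stand-in for illustration, not part of CrashLogger.

```ruby
require "English"

# Hypothetical helper: defaults to the currently active exception.
def describe_error(error = $ERROR_INFO)
  error ? "#{error.class}: #{error.message}" : "no active error"
end

begin
  raise ArgumentError, "bad input"
rescue
  inside = describe_error       # called while the exception is active
end

outside = describe_error        # called after the rescue has finished

inside                          # => "ArgumentError: bad input"
outside                         # => "no active error"
```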

This brings us to the end of our series on Perl-style special variables in Ruby. We've seen what they are, and the various purposes they serve. We've learned about where they come from. We've seen how they can be handy when composing one-liners at the command line, and how they can be confusing in most other circumstances. And now we've seen how to use the same variables, but in the form of long, meaningful aliases.

Until next time, happy hacking!


Editor's Note:

The "unreadable code" in Episode #492 becomes:

ruby -rEnglish -p -a -e 'BEGIN { $FIELD_SEPARATOR = ","; $OUTPUT_RECORD_SEPARATOR = "\n" }; $LAST_READ_LINE = $F[0];' < list.csv

 
