Unit 1, Lesson 21

Wrapped Load


Today we're going to continue talking about loading and evaluating code. So far, we've been talking about loading up external Ruby files in order to use the methods, classes, or constants defined in them. But that's not the only reason we might want to execute the code in another Ruby file.

Consider this file, called "cleanup.rb". It is a script that automates cleanup of certain automatically-generated backup files.

require "fileutils"
PATTERNS = %W[**/*~ **/#*#]
puts "Cleanup up backup files"
Dir[*PATTERNS].each do |file|
  FileUtils.rm(file)
end

One day we decide to re-use this code from within another Ruby program, and rather than factor it out into a method we just want to run the script as-is. We can easily do this with our #load_lib method.

eval File.read("./loader.rb")

load_lib("cleanup.rb")

PATTERNS                        # => ["**/*~", "**/#*#"]
# >> LOAD cleanup.rb
# >> Cleanup up backup files

There's just one little problem. After we've run the script, our program now has a new constant defined: PATTERNS. In effect, global information from the cleanup.rb script has "leaked" into our main program.

This may be no more than an annoyance: for instance, if we were to execute the script more than once, we'd see constant redefinition warnings.

eval File.read("./loader.rb")

load_lib("cleanup.rb")
load_lib("cleanup.rb")

$ ruby double_load.rb
LOAD cleanup.rb
Cleanup up backup files
LOAD cleanup.rb
(eval):11: warning: already initialized constant PATTERNS
(eval):11: warning: previous definition of PATTERNS was here
Cleanup up backup files

But it could also cause more serious problems if the script happens to define a constant, global, class, or module that has the same name as one in our main program.
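
To make that risk concrete, here is a minimal sketch of such a collision. The Config class and the script text are made up for illustration, and the script is evaluated from a string with Object.module_eval rather than read from disk, which is assumed to behave like an unwrapped load for this purpose.

```ruby
# Hypothetical class belonging to the main program.
class Config
  def self.source
    "main program"
  end
end

# Stand-in for the contents of a loaded script that happens to
# use the same class name.
script = <<~RUBY
  class Config
    def self.source
      "loaded script"
    end
  end
RUBY

puts Config.source   # prints "main program"

# Evaluate the script at the top level, as an unwrapped load would.
Object.module_eval(script, "script.rb", 1)

puts Config.source   # prints "loaded script"
```

The script never mentions the main program, yet after it runs the main program's Config behaves differently.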

To address this potential problem, we'll need to modify our loading methods. We start with the #try_load_file method. We give it an extra parameter, called context. This is expected to be a class or module within which the file's contents should be evaluated. We change the eval to context.module_eval. While we're at it, we also add some information that was missing in the original version. We pass in the name and starting line number of the file being loaded, so that Ruby can report more helpful stack traces.

Over in the #load_lib method, we add a boolean parameter named wrap, which defaults to false. Then we define a context variable. If the wrap flag is set to true, the evaluation context will be a brand-new, anonymous module. If not, it will be the top-level Object class, within which all top-level constants are considered to be defined. Then we modify both calls to #try_load_file to also pass in the context.

def try_load_file(file, context)
  if File.file?(file)
    puts "LOAD #{file}"
    context.module_eval File.read(file), file, 1
    return true
  else
    return false
  end
end

def load_lib(name, wrap=false)
  context = wrap ? Module.new : ::Object

  return true if try_load_file(name, context)

  $LOAD_PATH.each do |dir|
    file = File.join(dir, name)
    return true if try_load_file(file, context)
  end

  fail LoadError, "Library not found: #{name}"
end
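
The two effects of module_eval used here, scoping constants to the context and labeling backtraces, can be checked in isolation. This is only a sketch: the helper name eval_in_fresh_module and the file name "example.rb" are hypothetical, and the source comes from a string instead of a file.

```ruby
# Evaluate a string of Ruby source inside a fresh anonymous module and
# return that module. The method name is hypothetical.
def eval_in_fresh_module(source, file)
  context = Module.new
  # The file name and starting line are only labels for error reporting.
  context.module_eval(source, file, 1)
  context
end

# Stand-in for a file's contents; "example.rb" is a made-up name.
source = <<~RUBY
  GREETING = "hello"
  def self.boom
    raise "oops"
  end
RUBY

sandbox = eval_in_fresh_module(source, "example.rb")

p sandbox.const_get(:GREETING)       # => "hello"
p Object.const_defined?(:GREETING)   # => false

begin
  sandbox.boom
rescue RuntimeError => error
  # The first backtrace entry begins with the label we passed in,
  # followed by the line of the raise within the string.
  puts error.backtrace.first
end
```

Without the file and line arguments, the backtrace entry would name "(eval)" instead, as in the warnings shown earlier.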

If we now call #load_lib with one argument, we get the same result as before: the constants defined inside cleanup.rb leak into our environment. But when we set the wrap flag to true, the code inside the cleanup.rb file is executed within an anonymous, throwaway module. Any top-level constants it defines are implicitly defined within that module, which is then discarded as soon as the evaluation is finished. As a result, after the load the top-level PATTERNS constant remains undefined.

eval File.read("./loader2.rb")

load_lib("cleanup.rb")

PATTERNS                        # => ["**/*~", "**/#*#"]
# >> LOAD cleanup.rb
# >> Cleanup up backup files

eval File.read("./loader2.rb")

load_lib("cleanup.rb", true)

defined?(PATTERNS)              # => nil
# >> LOAD cleanup.rb
# >> Cleanup up backup files

While this feature makes it possible to prevent accidental leakage of constants, it's important to realize that it is far from perfect isolation. A leading double-colon forces a constant to be defined in the top-level scope, which breaks the encapsulation. Any global variables set or changed in the loaded file will still be around after it has finished being evaluated. And there are any number of other ways the loaded file can still make global changes to the main program. Bearing this in mind, we should be careful about what kind of files we execute in this fashion. If we need more complete isolation, we need to turn to an alternative technique, such as starting a subprocess or using fork and exec.

require "fileutils"
::PATTERNS = %W[**/*~ **/#*#]
$magic_number = 42
puts "Cleanup up backup files"
Dir[*PATTERNS].each do |file|
  FileUtils.rm(file)
end

eval File.read("./loader2.rb")

load_lib("cleanup2.rb", true)

PATTERNS                        # => ["**/*~", "**/#*#"]
$magic_number                   # => 42
# >> LOAD cleanup2.rb
# >> Cleanup up backup files
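
For comparison, here is a rough sketch of the subprocess approach. It assumes that running the script in a separate Ruby interpreter via Kernel#system is acceptable; the run_isolated method name and the contents of the temporary script are made up for illustration.

```ruby
require "rbconfig"
require "tempfile"

# Run a Ruby script in a separate interpreter process. Returns true if
# the script exited successfully. The method name is hypothetical.
def run_isolated(path)
  system(RbConfig.ruby, path)
end

# Stand-in for a script that tries to leak a constant and a global.
Tempfile.create(["leaky", ".rb"]) do |script|
  script.write("::LEAKED = %w[a b]\n$leaked_number = 42\n")
  script.flush
  run_isolated(script.path)
end

p defined?(LEAKED)   # => nil
p $leaked_number     # => nil
```

Because the script runs in its own process, neither the double-colon constant nor the global variable reaches the calling program. The cost is that nothing else comes back either, apart from the exit status and whatever the script writes to its output.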

We now have a robust and useful method for loading Ruby files from disk. As you might have guessed by now, what we've built over the course of this episode is nothing new. In fact, we have more or less duplicated the capabilities of Ruby's built-in #load method, which is defined in the Kernel module. We can replace any of our calls to #load_lib with #load and they will continue to work. This includes the "wrap" flag, which we implemented ourselves only because I thought it would be easiest to understand if we went through the steps of recreating it.

load("greet.rb")
hello("Bobo")
# >> Hello, Bobo
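
The wrap flag can be tried with the built-in #load in the same way. A minimal sketch, assuming a throwaway script written to a temporary file; the constant and global names are made up for illustration.

```ruby
require "tempfile"

# Stand-in for a script that defines a top-level constant and a global.
Tempfile.create(["wrapped", ".rb"]) do |script|
  script.write("WRAPPED_PATTERNS = %w[a b]\n$wrapped_loaded = true\n")
  script.flush
  # Passing true as the second argument evaluates the file
  # inside an anonymous module.
  load(script.path, true)
end

p defined?(WRAPPED_PATTERNS)   # => nil
p $wrapped_loaded              # => true
```

As with our own version, the constant stays out of the top-level scope while the global variable still gets through.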

Of course, this is just one part of the file-loading story in Ruby. In the next episode, we'll move on from loading files, to requiring features. Until then, happy hacking!
