Unit 1, Lesson 21

Gem Require

Video transcript & code

When we last left off, we were talking about requiring features. We looked at how Ruby searches the $LOAD_PATH for files matching the name of a requested feature. You might have noticed that so far we've barely mentioned RubyGems.

That's because the functionality we've explored so far predates RubyGems. For many years, if you wanted to load a library that wasn't part of Ruby's standard library, you had two options: either modify the load path to include a new directory where the library files might be found, or install the library files into one of Ruby's standard load path directories. This is the purpose of the "vendor_ruby" and "site_ruby" directories listed in the default load path.

$LOAD_PATH
# => ["/home/avdi/.rubies/ruby-2.1.2/lib/ruby/site_ruby/2.1.0",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/site_ruby/2.1.0/x86_64-linux",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/site_ruby",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/vendor_ruby/2.1.0",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/vendor_ruby/2.1.0/x86_64-linux",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/vendor_ruby",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/2.1.0",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/2.1.0/x86_64-linux"]
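
To make the first of those two options concrete, here is a minimal sketch of loading a library by modifying the load path by hand. The library name greeting_lib, the requirable? helper, and the temporary directory are all made up for this illustration; they stand in for a library unpacked somewhere outside Ruby's default directories.

```ruby
require "tmpdir"

# Hypothetical stand-in for a library unpacked outside the default load path.
GREETING_LIB_DIR = Dir.mktmpdir("greeting_lib")
File.write(File.join(GREETING_LIB_DIR, "greeting_lib.rb"), <<~'RUBY')
  module GreetingLib
    def self.greet(name)
      "Hello, #{name}!"
    end
  end
RUBY

# Hypothetical helper: true if the feature can be required, false otherwise.
def requirable?(feature)
  require feature
  true
rescue LoadError
  false
end

puts requirable?("greeting_lib")   # the directory is not on the load path yet

# Put the library's directory at the front of the load path, then try again.
$LOAD_PATH.unshift(GREETING_LIB_DIR)
puts requirable?("greeting_lib")
puts GreetingLib.greet("world")
```

Every program that wanted this library would need the same load path adjustment, which is part of why the approach didn't scale well.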

However, installing library files directly into a global site_ruby or vendor_ruby directory wasn't an ideal solution. If library authors weren't careful, they could easily cause files from other libraries to be overwritten. There wasn't a clean way to uninstall a library. And there was no way at all to have multiple versions of a library installed simultaneously.

These and a host of other problems were addressed by the introduction of RubyGems: a standard and a set of tools for discretely packaging Ruby libraries for distribution and installation. With RubyGems, Ruby source files from different libraries wouldn't all be thrown together willy-nilly. Instead, inside a central gem directory each gem version would have its own top-level directory, inside of which there would be directories for libraries, documentation, etc.

For example, if I list the files in my local gem home directory for the 2.1.2 version of Ruby, I can see all of my installed gems. Each directory has a gem name followed by a version. I can drop down into any of these, for example the progressbar gem, and see a lib directory, a directory for tests, the project's README, and so on; all neatly contained inside a single directory.

/home/avdi/.gem/ruby/2.1.2/gems:
total used in directory 440 available 37044660
drwxrwxr-x 106 avdi avdi 20480 Aug 11 16:04 .
drwxrwxr-x   9 avdi avdi  4096 Jul 28 20:44 ..
drwxrwxr-x   3 avdi avdi  4096 Jul 29 22:39 activesupport-3.2.15
drwxrwxr-x   3 avdi avdi  4096 Jul 29 22:49 activesupport-4.1.2
drwxrwxr-x   6 avdi avdi  4096 Jul 28 20:44 bundler-1.6.5
drwxrwxr-x   3 avdi avdi  4096 Jul 28 21:24 bundler-unload-1.0.2
drwxrwxr-x   6 avdi avdi  4096 Jul 29 22:39 chunky_png-1.2.9
drwxrwxr-x   6 avdi avdi  4096 Jul 29 22:49 chunky_png-1.3.1
drwxrwxr-x   5 avdi avdi  4096 Aug  5 13:34 coderay-1.1.0
drwxrwxr-x   3 avdi avdi  4096 Jul 29 22:39 coffee-script-2.2.0
drwxrwxr-x   3 avdi avdi  4096 Jul 29 22:39 coffee-script-source-1.6.3
drwxrwxr-x   3 avdi avdi  4096 Jul 29 22:49 coffee-script-source-1.7.0
drwxrwxr-x   8 avdi avdi  4096 Jul 29 22:39 compass-0.12.2
drwxrwxr-x   8 avdi avdi  4096 Jul 29 22:49 compass-0.12.6
drwxrwxr-x   4 avdi avdi  4096 Jul 29 22:49 compass-import-once-1.0.4
drwxrwxr-x   7 avdi avdi  4096 Jul 28 20:51 diff-lcs-1.2.5
drwxrwxr-x   5 avdi avdi  4096 Jul 28 20:51 doc_raptor-0.3.2
drwxrwxr-x   4 avdi avdi  4096 Jul 28 20:51 dotenv-0.8.0
...
/home/avdi/.gem/ruby/2.1.2/gems/progressbar-0.21.0:
total used in directory 152 available 37056456
drwxrwxr-x   4 avdi avdi  4096 Aug 11 16:04 .
drwxrwxr-x 107 avdi avdi 20480 Aug 11 20:06 ..
-rw-r--r--   1 avdi avdi  3821 Aug 11 16:04 ChangeLog
-rw-r--r--   1 avdi avdi    98 Aug 11 16:04 Gemfile
-rw-r--r--   1 avdi avdi   394 Aug 11 16:04 Gemfile.lock
-rw-r--r--   1 avdi avdi   171 Aug 11 16:04 .gitignore
drwxrwxr-x   2 avdi avdi  4096 Aug 11 16:04 lib
-rw-r--r--   1 avdi avdi    98 Aug 11 16:04 LICENSE
-rw-r--r--   1 avdi avdi  1229 Aug 11 16:04 progressbar.gemspec
-rw-r--r--   1 avdi avdi   268 Aug 11 16:04 Rakefile
-rw-r--r--   1 avdi avdi  3839 Aug 11 16:04 README.rdoc
-rw-r--r--   1 avdi avdi     6 Aug 11 16:04 .ruby-version
drwxrwxr-x   2 avdi avdi  4096 Aug 11 16:04 test
-rw-r--r--   1 avdi avdi    53 Aug 11 16:04 .travis.yml

You might be wondering how I knew where to find these gems. Similarly to $LOAD_PATH, Ruby also has a path of directories in which to search for gems. We can discover this search path from within a Ruby program using Gem.path.

Gem.path
# => ["/home/avdi/.gem/ruby/2.1.2",
#     "\"/home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0\""]
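
A directory listing like the one shown earlier can also be produced from inside Ruby by walking Gem.path. This is only a sketch: installed_gem_dirs is a helper name invented here, and it assumes each entry on the gem path keeps its unpacked gems under a "gems" subdirectory, as the listing above suggests.

```ruby
require "rubygems"

# For each directory on the gem search path, collect the per-gem
# directories found under its "gems" subdirectory.
def installed_gem_dirs
  Gem.path.flat_map do |gem_home|
    Dir.glob(File.join(gem_home, "gems", "*")).select do |path|
      File.directory?(path)
    end
  end
end

# Print just the "name-version" part of the first few entries.
installed_gem_dirs.first(5).each { |dir| puts File.basename(dir) }
```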

But we can also find out the gem search path from the command line by entering gem env. This reveals quite a lot of useful information, including the current gem search paths.

$ gem env
RubyGems Environment:
  - RUBYGEMS VERSION: 2.2.2
  - RUBY VERSION: 2.1.2 (2014-05-08 patchlevel 95) [x86_64-linux]
  - INSTALLATION DIRECTORY: /home/avdi/.gem/ruby/2.1.2
  - RUBY EXECUTABLE: /home/avdi/.rubies/ruby-2.1.2/bin/ruby
  - EXECUTABLE DIRECTORY: /home/avdi/.gem/ruby/2.1.2/bin
  - SPEC CACHE DIRECTORY: /home/avdi/.gem/specs
  - RUBYGEMS PLATFORMS:
    - ruby
    - x86_64-linux
  - GEM PATHS:
     - /home/avdi/.gem/ruby/2.1.2
     - /home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0
     - "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0"
  - GEM CONFIGURATION:
     - :update_sources => true
     - :verbose => true
     - :backtrace => false
     - :bulk_threshold => 1000
  - REMOTE SOURCES:
     - https://rubygems.org/
  - SHELL PATH:
     - .bundle/bin
     - /home/avdi/.gem/ruby/2.1.2/bin
     - /home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0/bin
     - /home/avdi/.rubies/ruby-2.1.2/bin
     - /usr/local/heroku/bin
     - /home/avdi/.cabal/bin
     - /home/avdi/.gem/ruby/2.1.2/bin
     - /home/avdi/.rubies/ruby-2.1.2/bin
     - .bundle/bin
     - /usr/local/sbin
     - /usr/local/bin
     - /usr/sbin
     - /usr/bin
     - /sbin
     - /bin
     - /usr/games
     - /usr/local/games

For a more compact view of just the gem path, we can use the gem env path command.

$ gem env path
/home/avdi/.gem/ruby/2.1.2:/home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0:"/home/avdi/.rubies/ruby-2.1.2/lib/ruby/gems/2.1.0"

Of course, this search path works a little differently from the $LOAD_PATH. We can find Ruby files directly using the $LOAD_PATH. But with the GEM_PATH, all we can find are top-level gem container directories. In order to figure out how to look for Ruby files inside the gem, we then need to consult the .gemspec file. Here, a require_paths attribute tells us which directories inside the gem should be added to $LOAD_PATH. Only after we have added these directories to the Ruby $LOAD_PATH can we begin to search for actual Ruby source files to load from within this gem.
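
We can peek at this information ourselves through the Gem::Specification API. The sketch below assumes a reasonably recent RubyGems, where specifications respond to full_require_paths; load_dirs_for is a helper name made up for the example. The relative entries from the gemspec are available as spec.require_paths, and for most gems that is simply a single lib directory.

```ruby
require "rubygems"

# Return the absolute directories that activating the named gem would add
# to $LOAD_PATH, or an empty array if no such gem is installed.
def load_dirs_for(gem_name)
  spec = Gem::Specification.find_by_name(gem_name)
  spec.full_require_paths
rescue Gem::LoadError
  []
end

puts load_dirs_for("rake")
```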

The implementors of RubyGems realized that, for it to be seamlessly usable, they would need to make Ruby's require act a lot smarter. And that's exactly what they did: when the RubyGems system is loaded, it overrides and augments Ruby's built-in require to be RubyGems-aware.

But wait a minute… what's with this distinction between "built-in" require and RubyGems? In recent versions of Ruby, RubyGems comes baked in, right?

Well, yes and no. While the RubyGems code is now distributed with Ruby, and loaded by default, it is still structured as an extension to Ruby's built-in code-loading logic. If we take a look in the RubyGems library source code, we can see that it still contains code to remove or alias the existing implementation of require, and substitute its own, much fancier implementation of require.

module Kernel

  RUBYGEMS_ACTIVATION_MONITOR = Monitor.new # :nodoc:

  if defined?(gem_original_require) then
    # Ruby ships with a custom_require, override its require
    remove_method :require
  else
    ##
    # The Kernel#require from before RubyGems was loaded.

    alias gem_original_require require
    private :gem_original_require
  end

  ##
  # When RubyGems is required, Kernel#require is replaced with our own which
  # is capable of loading gems on demand.
  #
  # When you call <tt>require 'x'</tt>, this is what happens:
  # * If the file can be loaded from the existing Ruby loadpath, it
  #   is.
  # * Otherwise, installed gems are searched for a file that matches.
  #   If it's found in gem 'y', that gem is activated (added to the
  #   loadpath).
  #
  # The normal <tt>require</tt> functionality of returning false if
  # that file has already been loaded is preserved.

  def require path
    # ...
  end
  # ...
end
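
The same alias-then-override pattern can be tried on a small scale. This sketch wraps whatever require is currently installed with a version that records each requested feature. REQUIRE_LOG and require_before_logging are names invented for the illustration, and this is not how RubyGems itself is implemented; it only mirrors the shape of the code above.

```ruby
# Hypothetical log of every feature requested after this point.
REQUIRE_LOG = []

module Kernel
  # Keep a handle on the require that is installed right now
  # (normally the RubyGems-aware one).
  alias_method :require_before_logging, :require
  private :require_before_logging

  # Replacement require: note the request, then delegate to the original.
  def require(path)
    REQUIRE_LOG << path.to_s
    require_before_logging(path)
  end
  private :require
end

require "set"
puts REQUIRE_LOG.include?("set")
```

Because the replacement delegates to the aliased original, the usual return values and LoadError behavior are left alone.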

So let's examine what happens when we require a feature contained within a gem. Note that when I say "require a feature within a gem", my phrasing is deliberate. It's common to say we "require a gem". But in fact, when we use require we are always requiring a specific feature, which, as you will recall from the last episode, will be resolved to an individual file.

Just to drive this point home, instead of requiring a name that directly matches that of a gem, we'll require a feature corresponding to a file inside a gem's library hierarchy. Specifically, we'll require rake/file_list.

Before we do the require, we'll examine the state of the system. First off, we'll list out the names and versions of currently activated gems. An activated gem is one that has been made available to the system for code loading. Before we've required any files, this list is empty.

We'll also save the current value of the $LOAD_PATH and $LOADED_FEATURES variables. We don't care about the starting values of these; we just want to see how they change.

After requiring rake/file_list we take another look at the list of activated gems. This time, we can see that the rake gem has been activated.

Next we check to see what has been added to the $LOAD_PATH. We find that this list now contains the full path of the lib directory inside the rake gem directory.

Now let's look at how the list of $LOADED_FEATURES has changed. We see a longer list here. At the bottom we find the concrete file that corresponds to the feature we requested, file_list.rb. All the other files are ones that were in turn required, either directly or indirectly, by file_list.rb.

Gem::Specification.select(&:activated?).map{|gs| gs.full_name }
# => []

old_load_path = $LOAD_PATH.dup
old_loaded_features = $LOADED_FEATURES.dup

require "rake/file_list"

Gem::Specification.select(&:activated?).map{|gs| gs.full_name }
# => ["rake-10.3.2"]

$LOAD_PATH - old_load_path
# => ["/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib"]

$LOADED_FEATURES - old_loaded_features
# => ["/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/cloneable.rb",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/2.1.0/x86_64-linux/etc.so",
#     "/home/avdi/.rubies/ruby-2.1.2/lib/ruby/2.1.0/fileutils.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/file_utils.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/file_utils_ext.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/ext/core.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/ext/string.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/pathmap.rb",
#     "/home/avdi/.gem/ruby/2.1.2/gems/rake-10.3.2/lib/rake/file_list.rb"]

Looking at this information, we can see that the RubyGems-enhanced require performs several functions when it is invoked.

First, it searches through all the source files in all of the gems that it knows about, looking for a file that corresponds to the feature that was requested. If there are multiple versions of a gem, it gives preference to newer versions. When it finds a gem with the needed file, it then proceeds to activate the gem. Among other things, this means that the gem's library directories are added to Ruby's $LOAD_PATH.

After gem activation, require proceeds more or less as it would without RubyGems: it locates the file to be loaded within the now-expanded $LOAD_PATH, reads it in, and evaluates it. This may result in further invocations of require, in which case the cycle repeats recursively.
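
The lookup step on its own can be approximated with Gem::Specification.find_by_path, which returns a specification for an installed gem containing a file that satisfies a feature, or nil if there is none. This is a sketch of the search only: gem_providing is a made-up helper, nothing is activated, and the real require does considerably more.

```ruby
require "rubygems"

# Name and version of an installed gem that contains a file satisfying
# the feature, or nil if no installed gem provides it.
def gem_providing(feature)
  spec = Gem::Specification.find_by_path(feature)
  spec && spec.full_name
end

p gem_providing("rake/file_list")
p gem_providing("no/such/feature_for_this_sketch")
```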

Now, we said earlier that require is all about loading individual features, which effectively means individual files. But ordinarily, when working with gems we simply require the name of the gem. So what's going on here?

If we look inside the rake gem directory, and then dig into the lib directory inside it, we can see that there is a file named rake.rb. When we simply require rake, Ruby looks through the gems it knows about and finds this file named rake.rb, and proceeds to activate the rake gem and load that file.

If we look inside any given gem, we'll most likely find something similar: a top-level Ruby file with the same name as the gem. This is not required; it's just a convention. And by that convention, this file is responsible for then loading all of the other files inside the gem. By making an eponymous top-level file available, gem authors make it possible to require the name of a gem and effectively load the entire gem in one shot. But again, this is all by convention; this gem layout is not mandatory.
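
Here is a small sketch of that convention. It builds a hypothetical gem-style layout for an imaginary library called tiny_gem in a temporary directory, then stands in for gem activation by adding its lib directory to the load path by hand. None of these names come from a real gem.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical layout: lib/tiny_gem.rb plus lib/tiny_gem/*.rb
TINY_GEM_LIB = File.join(Dir.mktmpdir("tiny_gem"), "lib")
FileUtils.mkdir_p(File.join(TINY_GEM_LIB, "tiny_gem"))

File.write(File.join(TINY_GEM_LIB, "tiny_gem", "version.rb"), <<~'RUBY')
  module TinyGem
    VERSION = "0.0.1"
  end
RUBY

File.write(File.join(TINY_GEM_LIB, "tiny_gem", "greeter.rb"), <<~'RUBY')
  module TinyGem
    def self.greet
      "hello from TinyGem #{VERSION}"
    end
  end
RUBY

# The eponymous top-level file: its only job is to load everything else.
File.write(File.join(TINY_GEM_LIB, "tiny_gem.rb"), <<~'RUBY')
  require "tiny_gem/version"
  require "tiny_gem/greeter"
RUBY

# Stand in for gem activation by extending the load path ourselves.
$LOAD_PATH.unshift(TINY_GEM_LIB)

require "tiny_gem"
puts TinyGem.greet
```

A single require of the top-level name pulls in both of the inner files, which is all that "requiring a gem" amounts to under this convention.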

We've talked about how RubyGems is separate from the built-in Ruby version of require, but also loaded by default. You might be wondering how we can experience the original, "vanilla" version of require given that RubyGems is always loaded on Ruby startup. The answer is to use a special command-line option to Ruby.

Here's a quick demonstration. If I run Ruby from the command line, telling it to load and then output the version of Rake, I get version 10.3.2. But if I run the same command with the option --disable=gems added, I see a different version, 10.1.0. The first version is the one I have installed via RubyGems. The second is the older version of Rake that comes bundled with Ruby. When Ruby is told not to load RubyGems, it can only find and load the standard library version of Rake.

$ ruby -r rake -e 'puts Rake::VERSION'
10.3.2
$ ruby --disable=gems -r rake -e 'puts Rake::VERSION'
10.1.0
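
Another way to see the effect of this option is to check whether the Gem constant exists at all in the child process. The sketch below launches the current Ruby executable with and without --disable=gems. The rubygems_loaded? helper is invented for the example, and it unsets RUBYOPT for the child so that options inherited from the environment don't muddy the comparison. On a typical installation we would expect true for the first call and false for the second.

```ruby
require "open3"
require "rbconfig"

# Run a child Ruby with the given extra options and report whether the
# Gem constant is defined inside it.
def rubygems_loaded?(*ruby_options)
  output, _status = Open3.capture2(
    { "RUBYOPT" => nil },
    RbConfig.ruby, *ruby_options, "-e", "print defined?(Gem) ? 'yes' : 'no'"
  )
  output == "yes"
end

puts rubygems_loaded?
puts rubygems_loaded?("--disable=gems")
```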

So that's how we disable RubyGems on Ruby 1.9 and later. What if we want to ensure RubyGems is enabled even on versions of Ruby prior to 1.9? In that case, we need to be sure to explicitly require rubygems itself before requiring files from gems. This is a pattern you'll often see in older applications, and in libraries that are intended to work on a broad variety of Ruby versions.

require "rubygems"
require "rake"

Hopefully you now have a better picture of what is going on when we use require. Loading files is a big topic, and there is more to be said on it. But I think this is enough for today. Happy hacking!
