Unit 1, Lesson 21

Which

Video transcript & code

The other day I was porting a gem so that it would work on Windows. One of the compatibility issues I ran into was this snippet of code, in the Rakefile:

`which bundle`
unless $?.success?
  sh 'gem', 'install', 'bundler'
end

The problem with this code is that it shells out to run the which command, which ordinarily doesn't exist on Windows systems. There is a similar where command on Windows, but it works a little differently. In any case, all this code is trying to do is determine whether the bundle command is available. I feel like a more robust solution would be to simply perform this check in pure, platform-neutral Ruby code.

So that's what we're going to do in this episode. And in the process, we're going to learn about some of Ruby's features for writing cross-platform code.

So, let's think about exactly what the which command does.

When we run which at the command line, and provide it with the name of an executable, it responds with the full path of the given executable.

$ which ruby
/usr/bin/ruby

In this case, it's showing us that if we ran the ruby command, the shell would run the binary found at /usr/bin/ruby.

How, exactly, does it do this? Well, it has to look through the directories listed in the PATH environment variable.

Let's check out the contents of the PATH environment variable on both a Windows system and an Ubuntu system.

ENV["PATH"]
# => "C:\\tools\\ruby23\\bin;C:\\ProgramData\\Oracle\\Java\\javapath;...
ENV["PATH"]
# => "/home/tapas/.gem/ruby/2.2.3/bin:/home/tapas/.rubies/ruby-2.2.3/bin:...

As you can see here, both platforms have values for this variable, but both the contents and the formatting are pretty different.

One of the most obvious differences is the backslashes used to separate subdirectories on the Windows side, whereas Linux uses forward slashes.

(By the way, if you're wondering about the doubled backslashes in the Windows PATH, that's because Ruby is displaying them in a double-quoted string, and backslashes have to be escaped in double-quoted strings.)
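A quick check makes this concrete. The sketch below uses a made-up Windows-style path (not read from a real system) to show that the doubled backslashes exist only in the inspected form of the string, not in the string itself:

```ruby
# Hypothetical path, written as a double-quoted literal.
path = "C:\\tools\\ruby23\\bin"

path.length              # => 19
path.chars.count("\\")   # => 3 (one real backslash per separator)
puts path                # prints C:\tools\ruby23\bin
puts path.inspect        # prints "C:\\tools\\ruby23\\bin"
```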

If we look more closely, we can also see that individual entries in the path are separated by different characters. On Windows it's a semicolon, and on Linux it's a colon.

So, already we can see some platform variations we're going to have to deal with in order to do this in a portable way.

Now, one way we could approach this is that we could check to see which platform we're running on, and choose a different separator based on the result. But I'm not even going to demo this approach, because there are a lot of problems associated with it.

  1. You'd be surprised how tricky it can be to accurately determine the platform Ruby is running on.
  2. If we ever want to add a third platform, we'd have to revisit all of those conditionals.
  3. As you'll see in a moment, we'd just be reiterating work that the Ruby core team has already done for us.

Explicitly switching on the current platform should be a method of last resort.

Instead, let's see what Ruby gives us to solve this problem.

If we examine the File::PATH_SEPARATOR constant, we can see that on Windows, it has a value of ;, while on Ubuntu it has a value of :.

# Ubuntu
File::PATH_SEPARATOR            # => ":"
# Windows
File::PATH_SEPARATOR            # => ";"

This solves our most immediate problem. To break up the path variable into its component directory paths, we can split using the File::PATH_SEPARATOR value as the delimiter.

# Windows
ENV["PATH"].split(File::PATH_SEPARATOR)
# => ["C:\\tools\\ruby23\\bin",
#     "C:\\ProgramData\\Oracle\\Java\\javapath",
#     "C:\\Program Files (x86)\\Intel\\iCLS Client\\",
#     "C:\\Program Files\\Intel\\iCLS Client\\",
#     "C:\\WINDOWS\\system32",
#     "C:\\WINDOWS"
# ...
# Ubuntu
ENV["PATH"].split(File::PATH_SEPARATOR)
# => ["/home/tapas/.gem/ruby/2.2.3/bin",
#     "/usr/local/sbin",
#     "/usr/local/bin",
#     "/usr/sbin",
# ...

Let's focus on the Ubuntu side first. Say we want to find out what concrete executable would be invoked if we were to enter the ping command.

I'm choosing ping because it happens to exist on both Windows and Ubuntu systems.

We split the PATH using the platform's path separator.

Then for every directory in the PATH, we append the target command name.

This gives us a list of candidate paths.

# Ubuntu
command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| d + "/" + command}
# => ["/home/tapas/.gem/ruby/2.2.3/bin/ping",
#     "/home/tapas/.rubies/ruby-2.2.3/bin/ping",
#     "/usr/local/sbin/ping",
#     "/usr/local/bin/ping",
#     "/usr/sbin/ping",
#     "/usr/bin/ping",
#     "/sbin/ping",
#     "/bin/ping",
#     "/usr/games/ping",
#     "/usr/local/games/ping"]

Already, though, we've hardcoded some platform-specific assumptions. Remember, the separator for path segments may differ across platforms.

We could substitute the File::SEPARATOR constant.

command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| d + File::SEPARATOR + command}

But there's a better way to do this. Ruby gives us a File::join method. Not only does it use the platform-correct separator, it also deals cleanly with things like paths that already have a slash on the end.

# Ubuntu
File.join("/usr/bin", "ping") # => "/usr/bin/ping"
File.join("/usr/bin/", "ping")  # => "/usr/bin/ping"

So let's use File.join to do our path concatenation.

command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| File.join(d, command)}
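To experiment with this step without depending on the real environment, the same pipeline can be wrapped in a small method. This is only a sketch: the name candidate_paths and the sample directory list are made up for illustration.

```ruby
# Hypothetical helper: builds the candidate list for a command from a
# PATH-style string. Defaults to the real PATH but accepts any string.
def candidate_paths(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).map { |dir| File.join(dir, command) }
end

# Made-up directory list, joined with whatever separator this platform uses.
sample_path = ["/usr/local/bin", "/usr/bin/"].join(File::PATH_SEPARATOR)

candidate_paths("ping", sample_path)
# => ["/usr/local/bin/ping", "/usr/bin/ping"]
```

Note that the trailing slash on the second directory does not produce a doubled slash in the result.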

Now all we have to do is find the first candidate in our list that corresponds to a real executable file.

# Ubuntu
command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| File.join(d, command)}
  .find{|p| File.executable?(p)}
# => "/bin/ping"
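One caveat worth knowing: File.executable? also answers true for a directory the current user is allowed to enter, so a directory that happens to share the command's name could be reported as a match. A stricter check might look something like the following sketch, where executable_file? is a hypothetical helper name rather than anything built into Ruby:

```ruby
require "tmpdir"

# Hypothetical helper: true only for regular files that are executable.
def executable_file?(path)
  File.file?(path) && File.executable?(path)
end

# The system temp directory stands in for "a directory on the PATH".
tmp = Dir.tmpdir
File.executable?(tmp)    # typically true, even though it is a directory
executable_file?(tmp)    # => false
```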

That's simple enough. Now let's try it on the Windows side.

# Windows
command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| File.join(d, command)}
  .find{|p| File.executable?(p)}
# => nil

Hmmm, this time we got nil. But I promise you, when I actually type ping on the command line and press enter, a program is executed. So what are we missing?

Well, it turns out that on Windows things are a little bit more complicated. Instead of having an executable bit on each file, Windows identifies executables by their extension.

So, just as a quick and dirty example, let's append the extension .exe to each candidate.

# Windows
command = "ping"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| File.join(d, command) + ".exe"}
  .find{|p| File.executable?(p)}
# => "C:\\WINDOWS\\system32/ping.exe"

This time, we get a result. We get the path C:\WINDOWS\system32/ping.exe.

You might be looking at that path and thinking that Ruby got something wrong. After all, didn't we just say that the File.join method is supposed to use a platform-appropriate directory separator? But here we have a path that mixes both backslashes and forward slashes.

As it turns out, backslashes and forward slashes are both treated as valid path segment separators on Windows. Paths are normally displayed with backslashes, but Windows is perfectly happy to accept a forward slash instead.

Ruby actually exposes another constant called ALT_SEPARATOR which, on Windows, is a backslash.

# Windows
File::ALT_SEPARATOR # => "\\"
# Ubuntu
File::ALT_SEPARATOR # => nil
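If a mixed-separator path needs to be shown to a Windows user, ALT_SEPARATOR can be used to tidy it up. Here is a minimal sketch, assuming a hypothetical helper named native_path; the separators are passed as parameters so that the behavior can be tried on any platform:

```ruby
# Hypothetical helper: rewrites a path using the alternate separator when
# the platform has one, and returns the path unchanged otherwise.
def native_path(path, separator = File::SEPARATOR, alt_separator = File::ALT_SEPARATOR)
  return path if alt_separator.nil?
  # Block form of gsub, so the backslash is inserted literally.
  path.gsub(separator) { alt_separator }
end

# Simulating Windows, where ALT_SEPARATOR is a backslash:
native_path("C:\\WINDOWS\\system32/ping.exe", "/", "\\")
# => "C:\\WINDOWS\\system32\\ping.exe"

# Simulating Ubuntu, where ALT_SEPARATOR is nil:
native_path("/bin/ping", "/", nil)
# => "/bin/ping"
```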

So, OK, we have some working code. But it's not very general.

For one thing, what happens if we are looking up the command with the extension, instead of with the extension omitted?

# Windows
command = "ping.exe"
ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .map{|d| File.join(d, command) + ".exe"}
  .find{|p| File.executable?(p)}
# => nil

This time, the search fails.

We need a way to check for both the exact user-specified command, and the command with a .exe appended. So let's create an array of potential suffixes, starting with the empty string.

Then we'll use the Array#product method we learned about in Episode #459 to get all combinations of directory and suffix.

In the map, we join the directory and the command, then append the suffix.

Let's take a look at the output so far.

# Windows
command = "ping.exe"
suffixes = ["", ".exe"]
dirs = ENV["PATH"]
       .split(File::PATH_SEPARATOR)
       .product(suffixes)
       .map{|dir, suffix| File.join(dir, command) + suffix}
# => ["C:\\tools\\ruby23\\bin/ping.exe",
#     "C:\\tools\\ruby23\\bin/ping.exe.exe",
#     "C:\\ProgramData\\Oracle\\Java\\javapath/ping.exe",
#     "C:\\ProgramData\\Oracle\\Java\\javapath/ping.exe.exe",
#     "C:\\WINDOWS\\system32/ping.exe",
#     "C:\\WINDOWS\\system32/ping.exe.exe",
#     "C:\\WINDOWS/ping.exe",
#     "C:\\WINDOWS/ping.exe.exe",
# ...

The upshot is that for every directory in the PATH, we get two candidates: one without .exe appended, and one with.
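To see what Array#product contributes on its own, here is the same step with a short, made-up directory list in place of the real PATH:

```ruby
dirs     = ["/usr/bin", "/bin"]   # made-up directory list
suffixes = ["", ".exe"]

pairs = dirs.product(suffixes)
# => [["/usr/bin", ""], ["/usr/bin", ".exe"], ["/bin", ""], ["/bin", ".exe"]]

candidates = pairs.map { |dir, suffix| File.join(dir, "ping") + suffix }
# => ["/usr/bin/ping", "/usr/bin/ping.exe", "/bin/ping", "/bin/ping.exe"]
```

Because product keeps the directories in their original order, earlier PATH entries still take precedence when we search the candidates from front to back.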

When we add on the find at the end, we once again locate our quarry.

# Windows
command = "ping.exe"
suffixes = ["", ".exe"]
dirs = ENV["PATH"]
       .split(File::PATH_SEPARATOR)
       .product(suffixes)
       .map{|dir, suffix| File.join(dir, command) + suffix}
       .find{|p| File.executable?(p)}
# => "C:\\WINDOWS\\system32/ping.exe"

And if we remove the explicit extension from our search, we still get the same result.

# Windows
command = "ping"
suffixes = ["", ".exe"]
dirs = ENV["PATH"]
       .split(File::PATH_SEPARATOR)
       .product(suffixes)
       .map{|dir, suffix| File.join(dir, command) + suffix}
       .find{|p| File.executable?(p)}
# => "C:\\WINDOWS\\system32/ping.exe"

So, are we done? Not quite. Because there are more kinds of executable file on Windows than just the ones ending in .exe. In fact, Windows keeps a separate list of executable file extensions in a variable called PathExt.

# Windows
ENV["PathExt"]
# => ".COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.RB;.RBW;.RB;.RBW"

As you can see, this list is also delimited by semicolons.

How are we to factor in this expanded set of potential extensions? Well, luckily for us we already have a list of suffixes. All we need to do is add the PathExt suffixes into it.

While we're at it, we can remove our own .exe entry, since that's now redundant.

# Windows
command = "ping"
suffixes = [""]
suffixes.concat(ENV["PathExt"].split(File::PATH_SEPARATOR))
dirs = ENV["PATH"]
       .split(File::PATH_SEPARATOR)
       .product(suffixes)
       .map{|dir, suffix| File.join(dir, command) + suffix}
       .find{|p| File.executable?(p)}
# => "C:\\WINDOWS\\system32/ping.exe"
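The PathExt value shown earlier lists .RB and .RBW twice, which means some candidates get checked twice. That is harmless, but if it bothers you, a uniq call removes the repeats. The sketch below works on a copy of that sample value rather than the live environment, and splits on a literal semicolon because the sample is Windows-formatted:

```ruby
# Sample value copied from the PathExt listing above.
path_ext = ".COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.RB;.RBW;.RB;.RBW"

suffixes = [""] + path_ext.split(";").uniq
suffixes.length   # => 14 (the empty suffix plus 13 distinct extensions)
suffixes.last     # => ".RBW"
```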

Now that we are taking into account a larger list of potential executable extensions, we can find other types of executable. For instance, let's look up the bundle command.

# Windows
command = "bundle"
suffixes = [""]
suffixes.concat(ENV["PathExt"].split(File::PATH_SEPARATOR))
dirs = ENV["PATH"]
       .split(File::PATH_SEPARATOR)
       .product(suffixes)
       .map{|dir, suffix| File.join(dir, command) + suffix}
       .find{|p| File.executable?(p)}
# => "C:\\Ruby23-x64\\bin/bundle.BAT"

On Windows, RubyGems creates batch files for executable scripts. We can see this reflected in the fact that the fully-qualified executable path we found is for bundle.BAT (the upper-case extension comes straight from PathExt), not bundle.exe.

And now for the moment of truth: let's see if this code still works on the Ubuntu side. We'll try looking up the ping command first.

# Ubuntu
command = "ping"
suffixes = [""]
suffixes.concat(ENV["PathExt"].split(File::PATH_SEPARATOR)) # ~> NoMethodError: undefined method `split' for nil:NilClass
dirs = ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .product(suffixes)
  .map{|dir, suffix| File.join(dir, command) + suffix}
  .find{|p| File.executable?(p)}

# ~> NoMethodError
# ~> undefined method `split' for nil:NilClass
# ~>
# ~> xmptmp-in26097icB.rb:3:in `<main>'

Uh oh. Looks like we broke something.

The problem here is that Ubuntu doesn't normally set a PathExt variable, so the result of that ENV lookup is nil.

ENV["PathExt"]                  # => nil

We can fix this by appending a to_s to the variable retrieval.

# Ubuntu
command = "ping"
suffixes = [""]
suffixes.concat(ENV["PathExt"].to_s.split(File::PATH_SEPARATOR))
dirs = ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .product(suffixes)
  .map{|dir, suffix| File.join(dir, command) + suffix}
  .find{|p| File.executable?(p)}
# => "/bin/ping"

That way, if the return value is nil, it will be converted to the empty string. This is an example of introducing a benign value, which we talked about in Episode #337.
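Stepping through the chain with a nil input shows why the benign value keeps everything downstream happy. The variable names here are just for illustration:

```ruby
missing = nil                        # what ENV["PathExt"] returns on Ubuntu

as_string = missing.to_s             # => ""
extra     = as_string.split(":")     # => []
suffixes  = [""].concat(extra)       # => [""]
```

With nothing extra to add, the suffix list stays at just the empty string, so each directory contributes exactly one candidate.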

While this works, I want to also show you an alternate approach. We could instead use fetch to supply the empty string as a default when the environment key is missing.

# Ubuntu
command = "ping"
suffixes = [""]
suffixes.concat(ENV.fetch("PathExt", "").split(File::PATH_SEPARATOR))
dirs = ENV["PATH"]
  .split(File::PATH_SEPARATOR)
  .product(suffixes)
  .map{|dir, suffix| File.join(dir, command) + suffix}
  .find{|p| File.executable?(p)}
# => "/bin/ping"

I like this version a little bit better. I think it's more intention-revealing: the reader can immediately see that we were concerned about the case where the key is missing, and took deliberate precautions for that scenario.
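Pulling the steps of this episode together, the lookup could be wrapped in a method along these lines. Treat it as a sketch rather than a finished library: the method name which and its keyword parameters are hypothetical, and using fetch for PATH as well is my own assumption so that a missing PATH yields nil instead of an error.

```ruby
# Hypothetical wrapper around the lookup built up in this episode.
# The keyword arguments default to the real environment, but can be
# overridden to experiment with other PATH or PathExt values.
def which(command, path: ENV.fetch("PATH", ""), path_ext: ENV.fetch("PathExt", ""))
  suffixes = [""].concat(path_ext.split(File::PATH_SEPARATOR))
  path
    .split(File::PATH_SEPARATOR)
    .product(suffixes)
    .map { |dir, suffix| File.join(dir, command) + suffix }
    .find { |candidate| File.executable?(candidate) }
end

which("ruby")   # full path to a ruby executable, or nil if none is on the PATH

# In the Rakefile from the start of the episode, the shell-out could then
# become something like:
#
#   sh "gem", "install", "bundler" unless which("bundle")
```

Since the method returns either a path or nil, its result can be used directly as a condition.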

And with that, we've reached the end of our journey. We've written a platform-neutral version of the which command for looking up executables in the current PATH. Along the way, we've learned that Ruby provides some basic helpers for writing cross-platform code that works with filenames. As long as we remember to use these helpers, instead of hard-coding our platform assumptions, we stand a much better chance of having our code "just work" no matter where it is run. Happy hacking!
