Unit 1, Lesson 21

Temporary Directory

Video transcript & code

A while back we talked about how to manage temporary files in Ruby. But sometimes we need more than a temporary file. We need a temporary directory in which to do some work.

For instance, let's say we're working on generating e-book files. The kindlegen utility from Amazon is used to generate Amazon Kindle-compatible e-books from various other formats. In order to use it, we give it the name of a source file to convert. It then creates a new file in the same directory, with the same name as the source file, but with a .mobi filename extension.

$ ls
book.html

$ kindlegen book.html
...
$ ls
book.html
book.mobi

Let's say we're trying to slot this process into a larger e-book production toolchain. We want to be able to specify both the path of the source file and the path of the destination file. But kindlegen doesn't give us that option.

def generate_kindle(source_path, dest_path)
  # ...
end

So instead, we decide on a multi-step process. We'll create a temporary directory. We'll copy the source book into it. Then we'll change directory into the temporary directory, and run kindlegen. We'll copy the resulting output file to the destination path. Finally, we'll tear down the temporary directory.

require "fileutils"
def generate_kindle(source_path, dest_path)
  FileUtils.mkpath "tmp"
  FileUtils.cp source_path, "tmp/book.html"
  Dir.chdir "tmp" do
    system("kindlegen book.html")
  end
  FileUtils.cp "tmp/book.mobi", dest_path
  FileUtils.rmtree "tmp"
end

This works well enough, but it's a bit messy and fragile. Right now, it creates its temporary directory, named "tmp", in whatever the current working directory happens to be. This could be a problem for several reasons. For instance:

  • It may clash with an existing, unrelated temporary directory. Then when it removes the directory at the end, it might be destroying some other program's files.
  • If the method fails at some point, and doesn't clean up after itself, it's going to leave a "tmp" directory cluttering up the filesystem.
  • If this does happen, it's not clear from the name what program was responsible for creating this directory.
  • If we start parallelizing our e-book generation and run multiple processes at once, they would all share a single temp directory and could very easily step on each other's work, with unpredictable results.
  • Exploiting race conditions that occur when temporary files are created is a common vector for security compromises.
  • Finally, on some servers, only certain parts of the filesystem are writeable. The rest is read-only. Our code might crash on a system like this, when it tries to create a new temporary directory and encounters an exception.

For all of these reasons and probably others, it's rarely a good idea to manually generate temporary files and directories, especially when there are tools available that can manage the process for us. And in Ruby we do in fact have a tool to take the headaches out of managing temporary directories.

Let's require the tmpdir library. Rather than adding any new classes or modules to the Ruby namespace, this library augments the core Dir class with some new capabilities.

First off, we can now send it the tmpdir message, and it will tell us the path of the system default temporary directory.

require "tmpdir"
Dir.tmpdir                      # => "/tmp"

This method hides the complexity of figuring out where the correct place is to put temporary files. This alone is pretty useful, since conventions for tempfile locations vary across operating systems and individual systems. It's especially important to know on those systems where user and library directories are read-only, and only the systemwide temporary directory is available for writing.
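As a rough illustration of that lookup: Dir.tmpdir consults the TMPDIR environment variable before falling back to the system default, so the location can be redirected for a single process. The sketch below assumes a throwaway stand-in directory (created only for this demonstration) and restores the environment afterwards.

```ruby
require "tmpdir"
require "fileutils"

# A stand-in directory to redirect temporary files to (exists only for this demo).
custom_tmp = Dir.mktmpdir("custom-tmp")
old_tmpdir = ENV["TMPDIR"]

begin
  ENV["TMPDIR"] = custom_tmp
  redirected = Dir.tmpdir              # now reports the directory named by TMPDIR
ensure
  ENV["TMPDIR"] = old_tmpdir           # restore the original environment
  FileUtils.remove_entry(custom_tmp)   # remove the stand-in directory
end
```

If TMPDIR names something that isn't a writable directory, Dir.tmpdir skips it and moves on to its other candidates.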

But the tmpdir library gives us more than this. We can also ask it to create a temporary directory for us, using Dir.mktmpdir. The method returns the name of the created directory. As you can see, the name is built to be unique. It includes the date the directory was created, the process ID of the Ruby program that created it, and a random string. This makes it virtually impossible for two invocations of a program to accidentally share the same temporary directory.

If we check the directory's permissions, we can see that it has been created with mode 0700 (the leading 4 in the octal output marks the entry as a directory). On my Linux system, that means it is only accessible to the user who created it, not to any other groups or users.

require "tmpdir"
require "English"
$PID                              # => 10821
path = Dir.mktmpdir               # => "/tmp/d20150506-10821-1m5iim5"
File::Stat.new(path).mode.to_s(8) # => "40700"
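To make the uniqueness claim a little more concrete, here is a minimal sketch that calls Dir.mktmpdir twice in the same process and compares the results. The variable names are just for illustration, and the permission bits shown will depend on the platform.

```ruby
require "tmpdir"
require "fileutils"

first  = Dir.mktmpdir
second = Dir.mktmpdir

# Each call gets its own directory, even within a single process.
distinct = first != second

# Mask off the file-type bits so only the permission bits remain.
perms = File.stat(first).mode & 0o777

# Without a block, cleanup is our job.
[first, second].each { |dir| FileUtils.remove_entry(dir) }
```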

While this method is clearly doing a good job of generating unique directory names, the directory names aren't very self-describing. When cleaning out a system temporary directory, it's always a little frustrating to find "mystery directories" with no obvious origin.

To make the generated directories a little more identifiable, we can pass a prefix to the method. This will be incorporated into the final directory name, without compromising any of the unique identifiers.

require "tmpdir"
path = Dir.mktmpdir("myprog")               # => "/tmp/myprog20150506-10891-1...
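The same method accepts a couple of other forms that can help with identification. As a hedged sketch: passing a two-element array supplies both a prefix and a suffix, and an optional second argument names the parent directory to use instead of Dir.tmpdir. The "myprog-" and "-work" strings are arbitrary examples.

```ruby
require "tmpdir"
require "fileutils"

# Array form: the first element is the prefix, the second the suffix.
path = Dir.mktmpdir(["myprog-", "-work"])
name = File.basename(path)
name.start_with?("myprog-")   # => true
name.end_with?("-work")       # => true

# Optional second argument: create the directory somewhere other than Dir.tmpdir.
nested = Dir.mktmpdir("myprog-", path)
File.dirname(nested) == path  # => true

FileUtils.remove_entry(path)  # removes path and everything inside it
```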

So now we have a way to create secure, uniquely-named temp directories in the correct system temp location. But we still have to manually clean them up when we are done with them.
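Manual cleanup might look something like the following sketch, which wraps the work in begin/ensure so the directory is removed even if something raises. The "scratch.txt" file name is an arbitrary example.

```ruby
require "tmpdir"
require "fileutils"

path = Dir.mktmpdir("myprog")
begin
  # Do some work inside the directory.
  File.write(File.join(path, "scratch.txt"), "work in progress")
ensure
  # Runs whether or not the work above raised; removes the directory and its contents.
  FileUtils.remove_entry(path)
end

File.exist?(path)   # => false
```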

However, there's another way to use this method. If we pass it a block, then the directory will only exist for the lifetime of the block. Once the block exits, it will be cleaned up. The block receives the temp directory's full path as its argument.

require "tmpdir"

dir = nil
Dir.mktmpdir do |d|
  dir = d                       # => "/tmp/d20150506-11069-9rde89"
  File.exist?(d)                # => true
end
File.exist?(dir)                # => false
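Two further properties of the block form are worth a quick sketch: the directory is removed even when the block raises, and the method returns the block's value, so a result can be read out before cleanup happens. The file names and the simulated failure below are stand-ins for real work.

```ruby
require "tmpdir"

seen = nil
begin
  Dir.mktmpdir("myprog") do |dir|
    seen = dir
    File.write(File.join(dir, "partial.txt"), "half-finished")
    raise "simulated failure"      # stand-in for a step that goes wrong
  end
rescue RuntimeError => e
  e.message                        # => "simulated failure"
end
File.exist?(seen)                  # => false

# The block's value becomes the return value of Dir.mktmpdir.
contents = Dir.mktmpdir("myprog") do |dir|
  file = File.join(dir, "note.txt")
  File.write(file, "hello")
  File.read(file)
end
contents                           # => "hello"
```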

We now have knowledge in hand that will let us substantially improve our original Kindle generation method. Instead of our own half-baked temporary directory, we can use Ruby's tmpdir library to create a temp directory, use it, and then automatically clean it up.

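require "tmpdir"
require "fileutils"

def generate_kindle(source_path, dest_path)
  # The temp directory exists only for the duration of this block.
  Dir.mktmpdir("generate_kindle") do |d|
    FileUtils.cp source_path, "#{d}/book.html"
    # kindlegen writes book.mobi next to its input, so run it inside the temp dir.
    Dir.chdir(d) do
      system("kindlegen book.html")
    end
    # Copy the result out before the directory is cleaned up.
    FileUtils.cp "#{d}/book.mobi", dest_path
  end
end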
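The same pattern generalizes to any tool that insists on writing its output next to its input. Below is a sketch under stated assumptions: convert_in_tmpdir is a hypothetical helper (not part of any library), and the block stands in for the real converter so the example runs without kindlegen installed.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: runs a converter (the block) inside a throwaway directory
# and copies the file it is expected to produce to dest_path.
def convert_in_tmpdir(source_path, dest_path, output_name)
  Dir.mktmpdir("convert") do |dir|
    input_name = File.basename(source_path)
    FileUtils.cp(source_path, File.join(dir, input_name))
    Dir.chdir(dir) { yield input_name }
    output = File.join(dir, output_name)
    raise "converter did not produce #{output_name}" unless File.exist?(output)
    FileUtils.cp(output, dest_path)
  end
end

# Usage with a stand-in converter that just upcases the text;
# a real caller would shell out to kindlegen here instead.
result = Dir.mktmpdir("demo") do |work|
  source = File.join(work, "book.html")
  dest   = File.join(work, "out.mobi")
  File.write(source, "<h1>hello</h1>")

  convert_in_tmpdir(source, dest, "book.mobi") do |input|
    File.write("book.mobi", File.read(input).upcase)
  end

  File.read(dest)
end
result   # => "<H1>HELLO</H1>"
```

Checking that the output file exists before copying it gives a clearer error than a failed copy would if the converter silently produced nothing.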

With the tmpdir library and the mktmpdir method, you'll probably never need to manually create a temporary directory again. Happy hacking!
