Unit 1, Lesson 21

Leaning Toothpicks

Video transcript & code

Today's topic is super short, and a little bit basic. But if it's new to you, I think you'll find it helpful.

Let's say we're using regular expressions to match against pathnames. For instance, maybe we're looking through files to find every file that references a particular installed Ruby version.

/\/home\/tapas\/.rubies\/ruby-2.2.0/

Because the forward slash is both the quoting character for regular expression literals in Ruby and the directory separator in UNIX filenames, we've had to escape every slash with a backslash.

This kind of regular expression is ugly, hard to read, and annoying to type. Long ago the Perl community gave a name to regexen that are full of back- and forward-slashes: "leaning toothpick syndrome".

Happily, Ruby gives us a way to avoid leaning toothpicks. Like other types of quoting in Ruby, we can choose to use a character other than a forward slash to quote regexes.

By using the prefix %r, we get to choose what delimiter or set of delimiters will bracket the regex.

For instance, we can choose to use parentheses, and then remove all of our backslashes.

If we evaluate this, we can see that Ruby turns it into the exact same regex we started with.

%r(/home/tapas/.rubies/ruby-2.2.0)
# => /\/home\/tapas\/.rubies\/ruby-2.2.0/
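If you'd like to convince yourself that the two spellings behave the same, here's a minimal sketch. The sample `line` is made up for illustration; it isn't taken from any real file.

```ruby
# Two spellings of the same pattern: slash-delimited and %r-delimited.
escaped   = /\/home\/tapas\/.rubies\/ruby-2.2.0/
unescaped = %r(/home/tapas/.rubies/ruby-2.2.0)

# Hypothetical line of the kind we might come across while searching files.
line = "export PATH=/home/tapas/.rubies/ruby-2.2.0/bin:$PATH"

# =~ returns the offset of the first match, or nil if there is none.
escaped_index   = line =~ escaped
unescaped_index = line =~ unescaped

puts escaped_index
puts unescaped_index
```

Both regexes find the path at the same position in the line, so which one to use is purely a question of readability.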

OK, but what if we get fancier with this regular expression? What if we add a character range and a capture group?

You might be surprised to find out that Ruby handles this just fine.

%r(/home/tapas/.rubies/ruby-(2.2.[0-9]+))
# => /\/home\/tapas\/.rubies\/ruby-(2.2.[0-9]+)/

Yes, it's true that parentheses are used both to quote the regex, and to surround a capture group inside the regex. But Ruby keeps track of nested pairs: as long as the parentheses inside the regex are balanced, it can parse out which is which.
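Here's a small sketch showing that the inner parentheses really do act as a capture group. The `path` value is a hypothetical example, not something from a real system.

```ruby
# The outer parentheses delimit the regex; the inner pair is a capture group.
version_pattern = %r(/home/tapas/.rubies/ruby-(2.2.[0-9]+))

# Hypothetical path, used only for illustration.
path = "/home/tapas/.rubies/ruby-2.2.10/bin/ruby"

# Regexp#match returns a MatchData object, or nil when nothing matches.
match = version_pattern.match(path)
captured_version = match && match[1]

puts captured_version
```

The first capture group holds just the version number, which is what we'd expect from the same regex written with slashes.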

Still, while Ruby might not be confused, having this mix of meanings might be a strain for human readers. What's a better delimiter to use? Consider that most of the "typical" delimiters already have a special meaning inside a regex: parentheses, square brackets, curly braces, and even the pipe character.

Well, how about… the regular old double-quote?

Evaluating this shows that Ruby is totally OK with this.

The %r prefix tells it to interpret the double quotes as regex delimiters, not as string delimiters.

%r"/home/tapas/.rubies/ruby-2.2.0"
# => /\/home\/tapas\/.rubies\/ruby-2.2.0/
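To check that we really get a regex and not a string, we can ask the object for its class. This is just a quick sketch, and the path passed to `match?` is made up for the example.

```ruby
# Double quotes after %r delimit a regex, not a string.
quoted = %r"/home/tapas/.rubies/ruby-2.2.0"

puts quoted.class

is_regexp = quoted.is_a?(Regexp)

# Hypothetical path, used only for illustration.
matches = quoted.match?("/home/tapas/.rubies/ruby-2.2.0/bin/gem")

puts is_regexp
puts matches
```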

As a matter of fact, the sky is pretty much the limit when choosing alternative quoting characters in Ruby. We could delimit our regex with spaces if we wanted to: one space after the %r to open it, and a trailing space at the end to close it.

%r /home/tapas/.rubies/ruby-2.2.0
# => /\/home\/tapas\/.rubies\/ruby-2.2.0/

Although I hope I don't have to tell you that this is a terrible idea and you should never use it in a serious project.

Anyway, that's all I wanted to show you today. Hopefully this will save you from typing some ugly regexes at some point. Happy hacking!
