Interpolation over Concatenation

Recently Jessica Kerr and I had a conversation about a Ruby idiom that I take for granted these days: the preference for string interpolation over other methods of building up text, such as concatenation. Here, with some visual aids, is a reenactment of our discussion of four different reasons Ruby programmers lean heavily on string interpolation.

Video transcript & code

J: The other day you showed some code like this for building a string from some smaller strings and an integer.

"#{prefix}#{shotnum += 1}#{postfix}"

A: Right, that was the episode about text filtering with the -p command-line flag.

J: No me gusta. I don’t like this style.

A: OK, what don’t you like about it?

J: Look at all those curly braces and hashes. They're pokey on my eyes.

What's wrong with good old string concatenation?

prefix + (shotnum += 1) + postfix

A: Ah, yeah. I’m glad you pointed this out, because it’s one of those little idiomatic speedbumps when coming to Ruby. When I use other languages, like JavaScript, PHP, Python, or Java, I’m much more likely to build up strings with string concatenation operators.

In Ruby it’s idiomatic to use string interpolation to build strings. And there are good reasons for this. In fact, I can think of four reasons.

J: Four, wow. OK, what’s the first reason?

A: The first reason is that this code just straight-up won’t work in Ruby.

J: Wat.

A: Well, check it out. It complains that the shot number expression contains a number, and it doesn’t know how to convert that to a string.

We have to write it like this, with an explicit conversion:

prefix + (shotnum += 1).to_s + postfix
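
As a minimal sketch of that failure and its fix, assuming made-up values for prefix, postfix, and shotnum (the conversation never shows how they are set up):

```ruby
# Hypothetical starting values, chosen only for illustration.
prefix  = "<!-- shot("
postfix = ") -->"
shotnum = 0

# Plain concatenation fails: String#+ will not convert an Integer for us.
error = begin
  prefix + (shotnum += 1) + postfix
rescue TypeError => e
  e
end
puts error.class

# Note that shotnum was already incremented before the error was raised,
# so the next increment takes it to 2.
with_to_s = prefix + (shotnum += 1).to_s + postfix
puts with_to_s
```

The side effect inside the expression still runs even when the concatenation fails, which is one more thing to keep in mind when mixing assignment and string building on one line.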

J: (blinks) you ... have to ... do ... an explicit conversion ...?

A: That seems pretty weird, right? I mean, Ruby is the language that makes a lot of guesses to try to do what the programmer means. But this is one area where Matz deliberately decided to make Ruby less “clever” than other languages. He designed Ruby to almost never perform type coercion or conversion unless you ask for it. And I think he made the right choice!

J: Do you now.

A: OK so do you remember Gary Bernhardt’s “Wat” talk?

/* WAT */

[] + [] // ''
[] + {} // '[object Object]'
{} + [] // 0
{} + {} // NaN
Array(16).join('wat' - 1) + ' Batman' // 'NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN Batman'

J: Do I remember it. I was there, dude! I was in the audience. You can hear me say "wat" back at him. That was the day he showed the world the amazing nonsense that JavaScript (and Ruby) are capable of.

A: Yeah, that one! A lot of the stuff he shows in that video is related to “coercion”: when programming languages have rules for converting one type to another, based on context.

J: Coercion is hard to get right.

A: I’d say it’s not possible to get right. No matter what rules you define, they are going to surprise programmers sometimes.

J: Bad surprises.

A: And this is especially true in dynamically-typed languages like JavaScript and Ruby. They don’t have as much information to help them make educated guesses.

And I think this is something Matz realized when he was designing Ruby: dynamic typing and implicit conversions are two design approaches which are at odds with each other. When you combine them, you get surprises like these JavaScript examples from Gary’s Wat talk.
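
For contrast, here is a rough sketch of what Ruby does with similar mixed-type expressions. The outcome helper is hypothetical and exists only to capture which error class each expression raises:

```ruby
# Hypothetical helper: returns the block's result, or the class of the
# error it raised.
def outcome
  yield
rescue TypeError, NoMethodError => e
  e.class
end

empty_hash = {}

array_plus_array = outcome { [] + [] }          # arrays concatenate as usual
array_plus_hash  = outcome { [] + empty_hash }  # no Hash-to-Array conversion
string_minus_int = outcome { "wat" - 1 }        # String has no binary minus
int_plus_string  = outcome { 1 + "1" }          # no String-to-Integer coercion
string_plus_int  = outcome { "1" + 1 }          # no Integer-to-String conversion

puts array_plus_hash, string_minus_int, int_plus_string, string_plus_int
```

Instead of guessing at a result, each mismatched pair raises an error at the point where the types disagree.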

J: So Ruby chooses not to do implicit conversion between types.

A: Right. Which means that when we’re appending arbitrary expressions to strings, we have to remember to add explicit type conversion.

prefix + (shotnum += 1).to_s + postfix

J: But you don’t have to do that when you use interpolation?

A: Correct! Because when we use string interpolation, we’re already explicitly asking Ruby to perform string conversion. It’s obvious from the context that we want these things to turn into strings. So this is one of the rare contexts where Ruby does conversion for us… but only into strings.

"#{prefix}#{shotnum += 1}#{postfix}"
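
To make the "conversion for free" point concrete, here is a small sketch showing that interpolation calls to_s on whatever the expression produces. The Shot class is hypothetical and not part of the original discussion:

```ruby
# Hypothetical class, used only to show that interpolation calls to_s.
class Shot
  def initialize(number)
    @number = number
  end

  def to_s
    "shot-#{@number}"
  end
end

from_integer = "#{42}"           # "42"
from_float   = "#{1.5}"          # "1.5"
from_symbol  = "#{:intro}"       # "intro"
from_nil     = "#{nil}"          # "" (nil.to_s is the empty string)
from_object  = "#{Shot.new(3)}"  # "shot-3"

puts from_object
```

The conversion only ever goes one way, toward String, which is why it stays unambiguous.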

J: OK so reason 1: we get conversion to string for free, without ambiguity. What’s the next reason?

A: The next reason is historical. Back when Ruby came out, string interpolation wasn’t as common in programming languages. And even in languages where it existed, it was often in a very limited form.

So for instance, we might have been able to interpolate in a variable like this.

"<!-- shot($shotnum) -->"

But we couldn’t interpolate in arbitrary expressions.

"<!-- shot(${new_shotnum = shotnum + 1}) -->" // nope

Instead we’d have to pull those expressions out into a separate line.

new_shotnum = shotnum + 1;
"<!-- shot($new_shotnum) -->"

And so a lot of the time it was just more convenient to stick to string concatenation, because you could easily add an arbitrary expression into a line of string concatenation.

"<!-- shot(" + (shotnum + 1) + ") -->"

J: Ohhh, so variable interpolation was more common than general expression interpolation? But general expression interpolation is more common now. Like in TypeScript (and recent JavaScript), and newer languages.

A: Right. But that’s been a gradual process. Whereas right from the start, Ruby not only made it possible to interpolate arbitrary expressions into strings… it made the novel design decision to have only generalized expression interpolation.

So when we interpolate a variable into a string in Ruby, we’re really interpolating an expression that just happens to consist of only one variable.

"<!-- shot(#{shotnum}) -->" 

Which means that if we decide we want to do more in that expression, we can, without making any change to the way it’s escaped.

"<!-- shot(#{shotnum += 1}) -->" 
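
A short sketch of how the same delimiters accept progressively richer expressions, assuming a hypothetical starting value for shotnum:

```ruby
shotnum = 4  # hypothetical starting value

# A lone variable is just the simplest possible expression.
plain = "<!-- shot(#{shotnum}) -->"

# The same syntax accepts an assignment...
bumped = "<!-- shot(#{shotnum += 1}) -->"

# ...a chain of method calls...
padded = "<!-- shot(#{shotnum.to_s.rjust(3, "0")}) -->"

# ...or a conditional, with no change to the delimiters.
labeled = "<!-- shot(#{shotnum.even? ? "even" : "odd"}) -->"

puts plain, bumped, padded, labeled
```

Only the contents of the braces change from one line to the next; the surrounding string is untouched.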

J: It’s like, when the string gets a little more complicated, there’s no friction to it.

A: Right, exactly. So Ruby programmers started using it everywhere.

J: Ruby likes reducing my friction, that’s for sure. So reason 2: because it’s always been easy. What’s #3?

A: Well, let’s start with this same code.

"<!-- shot(#{shotnum += 1}) -->" 

Would you prefer that we rewrite this as concatenation?

J: No, it makes sense to me as interpolation. We have a small variable part of the string in a larger fixed part.

A: OK. But what if we then decide that the parts that come before and after the shot number need to vary as well?

prefix = "<!-- shot("
postfix = ") -->"
"#{prefix}#{shotnum += 1}#{postfix}" 

J: That’s the code we started with.

A: Right, it’s just that this time we evolved to it. Do you still want to change it to concatenation, now that it just so happens that it no longer contains any fixed string elements?

J: Well… it’s still ugly. But I wouldn’t convert it to using concatenation.

prefix + (shotnum += 1).to_s + postfix 

There’s no meaningful reason to. It’s going to introduce a much bigger diff into the git history.

A: Right. And this gets into the problem of “difference noise”. Any time we introduce a style change like this, either from the previous git revision or from the way other strings are built in the same codebase, it forces readers of the code to stop and think. They have to try to understand the significance of the change: “What’s different and special about this particular string construction? Why did someone write it differently??”

J: Yeah, and there’s nothing different or special. It’s just that we happened to remove all the fixed parts of the string… for now. We might add more letters later on, and then we would change it back? Nah.

Reason 3: reduce diff noise in back and forth. If you’d interpolate sometimes, interpolate all of the times.

A: Yep. And that brings me to the last reason.

With the concatenation operators, when I read the code for the first time, I expend mental energy to understand whether this is numeric arithmetic or string concatenation. I have to look at variable names, method context, and the presence of conversion calls like to_s.

prefix + (shotnum += 1).to_s + postfix

But with interpolation, there’s no question. It’s instantly obvious that a string is being built.

"#{prefix}#{shotnum += 1}#{postfix}"
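
The ambiguity is easy to see in a sketch with two hypothetical helper methods, where nothing in the plus version says which kind of value comes back:

```ruby
# Hypothetical helpers to contrast the two styles.
def join_with_plus(a, b)
  a + b  # arithmetic or concatenation, depending on what a and b are
end

def join_with_interpolation(a, b)
  "#{a}#{b}"  # always builds a String
end

sum_of_numbers    = join_with_plus(1, 2)               # 3
joined_strings    = join_with_plus("1", "2")           # "12"
interpolated_nums = join_with_interpolation(1, 2)      # "12"
interpolated_strs = join_with_interpolation("1", "2")  # "12"

puts sum_of_numbers.class, interpolated_nums.class
```

With plus, the reader has to trace the arguments to know the result type; with interpolation, the method body alone answers the question.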

J: Reason 4: you can tell at a glance that we’re making a string, and not a number? That helps in a language that doesn’t declare return types.

A: Yep. I love how building strings with interpolation communicates to both the compiler and the reader. By using quotes and those funky interpolation hashes and curlies, we tell Ruby that we’re building a string so we would like some string conversions, please. And we also tell any human reading the code that this is not numeric arithmetic. This is a string being built.

J: So interpolation is idiomatic because: it’s easier; it’s always been easier; it’s especially easier in strings with some fixed content, so do it all the time dangit; and it says more to the reader than plus.

A: That’s right!

J: But why does it have to use hashes AND curlies? Why must it be so poky?

A: I’m not sure exactly what logic went into choosing that syntax. But I will say this for it: I haven’t run into a lot of situations where it was ambiguous or clashed somehow with the text I was trying to construct. I think maybe Matz just chose a syntax that was distinctively weird enough to make conflicts rare.

Whatever the reason… yeah, it’s a little funky. But you get used to it after a while!

J: Ugh fine I guess I can get used to it too.

Is this the part where we say “happy hacking?”

A: I think it is.

Both: Happy hacking!
