In Progress
Unit 1, Lesson 21
In Progress

Variable

Early in your programming education, someone probably told you that “variables are like boxes for values”. This is a misleading metaphor, and has led to more than its share of developer confusion. In this episode, you’ll learn a different and more appropriate way to think about variables.

Video transcript & code

At some point while you were learning to program, you probably came across a statement like this:

Variables are like boxes. You can put values into them.

For instance, let's say we have a favorite_fruit variable. And we assign it the value of "Apple".

favorite_fruit = "Apple"

It seems like almost every introductory text for programming language includes this metaphor.

And in languages like C and C++, where variables correspond to chunks of memory with a specific size and location, this mental picture of variables as boxes has a certain amount of utility.

Unfortunately, in dynamic languages such as JavaScript, Python, PHP, and Ruby, the idea of variables as boxes is badly misleading.

As an example of how this model breaks down, let's say we assign the value of favorite_fruit to a second variable, snack.

The model would suggest that after this assignment, the favorite_fruit variable should now be "empty".

In fact, what we find is that favorite_fruit still retains its original value…

…a value which it now shares with and snack.

favorite_fruit = "Apple"
snack    = favorite_fruit

favorite_fruit # => "Apple"
snack    # => "Apple"

Of course, the computer doesn't think in terms of physical boxes. And we know this. But we also know that a programming language attempts to enforce a certain model of processing. And the value of a metaphor, such as these boxes, is in how well it enables us to predict the way they language will behave in a given situation.

Already, this box metaphor is not looking very well-adapted to predicting the behavior of a high-level dynamic programming language.

Let's try out a different way of thinking of variables.

When we assign a value to a variable, we imagine that we are placing a sticky note on the value.

favorite_fruit = "Apple"

That sticky note enables us to keep track of that specific value, even among many other similar values.

When we assign that variable to another, we are just putting a second sticky note on the same object.

favorite_fruit = "Apple"
snack    = favorite_fruit

This matches up to the behavior we see in the code, where both variables now reference the same object.

favorite_fruit = "Apple"
snack    = favorite_fruit

favorite_fruit # => "Apple"
snack    # => "Apple"

Here's another example of where the "box" model leads to incorrect assumptions.

Let's say we assign an array to a variable.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]

Next, we assign the array to a second variable.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs

After which, we add two new values to the array referenced by the first variable, using the concat method we learned about in Episode #79.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs
fresh_eggs.concat(["egg5", "egg6"])

Here, the box metaphor suggests that we should now have one variable containing 6 objects, and a second variable containing 4.

In fact, what we find is that the second variable now also contains 6 values.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs
fresh_eggs.concat(["egg5", "egg6"])
breakfast
# => ["egg1", "egg2", "egg3", "egg4", "egg5", "egg6"]

This illustrates another of the big conceptual problems with thinking of variables as boxes: it makes it far to easy to confuse variables with container objects, such as arrays.

Let's re-do this example, this time using the sticky note metaphor.

Once again, we assign an array to a variable.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]

We assign the array to a second variable.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs

Then we add our two new values.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs
fresh_eggs.concat(["egg5", "egg6"])

This time around, it is quite obvious that both the first and second variables both refer to the same array object - an object which now contains six values.

fresh_eggs = ["egg1", "egg2", "egg3", "egg4"]
breakfast = fresh_eggs
fresh_eggs.concat(["egg5", "egg6"])
breakfast
# => ["egg1", "egg2", "egg3", "egg4", "egg5", "egg6"]

In this version, it's very clear that the array is a container, and the variables function as labels for that container.

On any given day of the week, you can visit programming forums for a dynamic language, and find people posting problems that obviously stem from an inaccurate mental model of variable semantics. In most cases, it's because they're looking at variables as if they are boxes to put things in.

Like I said before, the computer doesn't care about any of the stuff. All it wants to do is crunch numbers. So it's not about which model is "correct". It's about finding models that are effective.

The quality of our mental models determines whether the code we write does what we expect it to do. And that translates directly into avoiding bugs and meeting deadlines. The more effective our models, the more work we're able to get done.

Happy hacking!

Responses