String Format

Video transcript & code

We've had some impressive variety in our weather recently. Here's a list of high temperatures in Fahrenheit for the last seven days. As you can see, it has gone from short-sleeve weather down to freezing twice in the space of a week.

Let's say we want to incorporate the average of these temperature readings in a string. Once we calculate the average we can just interpolate it in.

HIGHS = [
  69.0,
  63.0,
  30.0,
  54.0,
  61.0,
  39.0,
  32.0
]

avg = HIGHS.reduce(0, :+) / HIGHS.size
"Average high: #{avg}"
# => "Average high: 49.714285714285715"

But this isn't very pretty. Most of the time string interpolation is the only tool we need when we are dynamically constructing strings. But when it comes to floating point numbers, we may want a little more control over the formatting of the numbers. That's where string formats enter the picture.

To use a string format, instead of directly interpolating the value we insert a special placeholder code. The placeholder always starts with a % sign. In this case, we'll use the code %.1f. After the end of the string, we put the string format operator which, again, is the % sign. We follow the operator with an array of arguments. In this case, our only argument is the average high temperature.

require "./data"
avg = HIGHS.reduce(0, :+) / HIGHS.size
avg                             # => 49.714285714285715
"Average high: %.1f" % [avg]
# => "Average high: 49.7"

As we can see, the string is now nicely formatted with a single digit after the decimal point.
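For anyone following along without the data file, here is a minimal self-contained sketch of the same idea. It assumes the seven readings listed earlier; the names HIGHS_SAMPLE and average_high_label are made up for illustration. It also shows that when there is only one argument, the % operator accepts it without the surrounding array.

```ruby
# Readings copied from the list of highs shown earlier.
HIGHS_SAMPLE = [69.0, 63.0, 30.0, 54.0, 61.0, 39.0, 32.0]

# Hypothetical helper: formats the average with one digit after the decimal point.
def average_high_label(readings)
  avg = readings.sum / readings.size
  "Average high: %.1f" % avg   # a single argument needs no array
end

average_high_label(HIGHS_SAMPLE)
# => "Average high: 49.7"
```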

We'll get into the details of the format code in a moment. But first, let's clarify how multiple arguments are dealt with by adding a second argument. We'll add an average of low temperatures to the list of arguments.

require "./data"
avg_hi = HIGHS.reduce(0, :+) / HIGHS.size
avg_lo = LOWS.reduce(0, :+) / LOWS.size
"Average high: %.1f, low: %.1f" % [avg_hi, avg_lo]
# => "Average high: 49.7, low: 25.4"

What this demonstrates is that the arguments passed to the string formatting operator are used, in order, to replace the formatting codes found in the string.
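To make the ordering concrete, here is a small sketch. The try_format helper is hypothetical and exists only so we can see what happens when the format string asks for more arguments than we supply: Ruby raises an ArgumentError.

```ruby
# Arguments fill the placeholders from left to right.
"%s is %d years old" % ["Ada", 36]
# => "Ada is 36 years old"

# Hypothetical helper: returns the formatted string, or the name of the
# error class if there are fewer arguments than placeholders.
def try_format(template, args)
  template % args
rescue ArgumentError => e
  e.class.name
end

try_format("%s is %d years old", ["Ada"])
# => "ArgumentError"
```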

There's an alternative to using the % operator: we can use the format method provided by the Kernel module. This method is also aliased as sprintf.

require "./data"
avg_hi = HIGHS.reduce(0, :+) / HIGHS.size
avg_lo = LOWS.reduce(0, :+) / LOWS.size
format "Average high: %.1f, low: %.1f", avg_hi, avg_lo
# => "Average high: 49.7, low: 25.4"
sprintf "Average high: %.1f, low: %.1f", avg_hi, avg_lo
# => "Average high: 49.7, low: 25.4"

I present these forms mainly for completeness; personally, I always use the % operator version.

Now, about those formatting strings. If you come from a C programming background, these codes are probably pretty familiar to you already, since they are a superset of the codes that printf supports. But since not everyone has C experience, we're going to go over the codes in detail. Even if you do already know some of these codes, there are still some Ruby-specific surprises you may not have been aware of.

Let's start with the simplest possible formatting code. If we use the code %s, we can interpolate in a string.

"hello, %s" % ["world"]
# => "hello, world"

Of course we could use ordinary string interpolation for this. More often, we use string formats for special numeric formatting. To add a decimal number to our string, we use the %d format code. To print the number in hexadecimal, we use %x. For octal, the code is %o. And for binary, we use %b.

"Magic number: %d" % [23]
# => "Magic number: 23"
"Magic number: %x" % [23]
# => "Magic number: 17"
"Magic number: %o" % [23]
# => "Magic number: 27"
"Magic number: %b" % [23]
# => "Magic number: 10111"
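One way to sanity-check these codes is to compare them against Integer#to_s, which takes a base as its argument. This is just a sketch; the helper names in_bases and in_bases_via_to_s are made up for illustration.

```ruby
# Hypothetical helper: hex, octal, and binary renderings via format codes.
def in_bases(n)
  ["%x" % [n], "%o" % [n], "%b" % [n]]
end

# The same three renderings via Integer#to_s(base).
def in_bases_via_to_s(n)
  [n.to_s(16), n.to_s(8), n.to_s(2)]
end

in_bases(23)
# => ["17", "27", "10111"]
in_bases(23) == in_bases_via_to_s(23)
# => true
```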

All of these codes can be modified with a # symbol. This symbol is sort of a generic flag for many different formatting codes. Wherever it appears it means to use an "alternative" format, where the meaning of alternative varies depending on the code being modified. In the case of the hexadecimal, octal, and binary codes it causes the appropriate conventional prefix to be added to the formatted number: "0x" for hexadecimal, "0" for octal, and "0b" for binary.

"Magic number: %#x" % [23]
# => "Magic number: 0x17"
"Magic number: %#o" % [23]
# => "Magic number: 027"
"Magic number: %#b" % [23]
# => "Magic number: 0b10111"

We can switch the prefixes to capital letters by using a capital X or B on the hex and binary format codes.

"Magic number: %#X" % [23]
# => "Magic number: 0X17"
"Magic number: %#B" % [23]
# => "Magic number: 0B10111"
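The number 23 happens not to contain any hexadecimal letter digits, so it hides one detail: a capital X also capitalizes the digits themselves, not just the prefix. A quick sketch with 255, a value picked purely for illustration:

```ruby
"%x" % [255]    # => "ff"
"%X" % [255]    # => "FF"
"%#x" % [255]   # => "0xff"
"%#X" % [255]   # => "0XFF"
```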

We've seen the %f notation before. Without any special modifiers, it just interpolates in a numeric value, treating it as a floating point number. Here, we use the value of Pi.

"Pi: %f" % [Math::PI]
# => "Pi: 3.141593"

In some cases where we are dealing with very large or very small numbers, we may want to use exponential notation. For instance, the distance in kilometers to the Andromeda Galaxy has quite a lot of zeroes after it. By using the %e formatting code, we can cause it to be printed in exponential notation.

dist = 2.4e19
"Distance to Andromeda (km): %d" % [dist]
# => "Distance to Andromeda (km): 24000000000000000000"
"Distance to Andromeda (km): %e" % [dist]
# => "Distance to Andromeda (km): 2.400000e+19"
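The same code works at the other end of the scale. Here is a short sketch with a very small number (0.00012 is an arbitrary value chosen for illustration), plus an integer argument, which gets treated as a float.

```ruby
"%e" % [0.00012]   # => "1.200000e-04"
"%e" % [2.4e19]    # => "2.400000e+19"
"%e" % [1]         # => "1.000000e+00"
```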

Sometimes we may want to be flexible and only switch to exponential notation when the number in question warrants it. The %g code is good for this case. Here's an example where we start with a relatively small number, and then switch to a much larger one.

dist = 123.0
"Distance in km: %g" % [dist]
# => "Distance in km: 123"
dist = 2.4e19
"Distance in km: %g" % [dist]
# => "Distance in km: 2.4e+19"
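To get a feel for where %g makes the switch, here is a sketch that probes both sides of the boundary. With the default precision of six significant digits, the switch to exponential notation should happen once the number needs more than six digits before the decimal point, or is smaller than 0.0001. The sample values are arbitrary.

```ruby
"%g" % [123456.0]    # => "123456"
"%g" % [1234567.0]   # => "1.23457e+06"
"%g" % [0.0001]      # => "0.0001"
"%g" % [0.00001]     # => "1e-05"
```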

There are a couple more codes worth noting: first, %p will fill in the result of sending #inspect to an argument.

"object: %p" % [{foo: 42}]
# => "object: {:foo=>42}"   (Ruby 3.4 and later print {foo: 42})
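The difference between %s and %p is easiest to see side by side. In this sketch, %s calls #to_s on the argument while %p calls #inspect, which matters for strings, nil, and symbols.

```ruby
"%s" % ["hi"]   # => "hi"
"%p" % ["hi"]   # => "\"hi\""
"%s" % [nil]    # => ""
"%p" % [nil]    # => "nil"
"%p" % [:foo]   # => ":foo"
```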

And if we just want a raw percent sign, we can use two percent signs in a row.

"In favor: %d%%" % [56]   # => "In favor: 56%"

We've now covered the majority of the formatting codes. But the power of format strings lies not just in the variety of codes, but in the various flags that can be used to modify those codes.

One of the most common flags is field width. For instance, consider the following code. It prints out minimum and maximum temperatures for the preceding week. The output isn't very readable, however.

require "./data"
DAYS.each_with_index do |day, i|
  puts "%s High: %d Low: %d" % [day, HIGHS[i], LOWS[i]]
end
# >> Tuesday High: 69 Low: 34
# >> Wednesday High: 63 Low: 28
# >> Thursday High: 30 Low: 18
# >> Friday High: 54 Low: 19
# >> Saturday High: 61 Low: 30
# >> Sunday High: 39 Low: 28
# >> Today High: 32 Low: 21

Now let's apply some field widths to these formatting codes. We give the "day" field a 9 character width.

require "./data"
DAYS.each_with_index do |day, i|
  puts "%9s High: %d Low: %d" % [day, HIGHS[i], LOWS[i]]
end
# >>   Tuesday High: 69 Low: 34
# >> Wednesday High: 63 Low: 28
# >>  Thursday High: 30 Low: 18
# >>    Friday High: 54 Low: 19
# >>  Saturday High: 61 Low: 30
# >>    Sunday High: 39 Low: 28
# >>     Today High: 32 Low: 21

This is a lot more structured, but the alignment of the day column is a little weird. That's because by default, values will be right-aligned within their field width, and the remainder will be filled in with spaces. We can switch a field to being left-aligned by adding a minus sign before the field width flag.

require "./data"
DAYS.each_with_index do |day, i|
  puts "%-9s High: %d Low: %d" % [day, HIGHS[i], LOWS[i]]
end
# >> Tuesday   High: 69 Low: 34
# >> Wednesday High: 63 Low: 28
# >> Thursday  High: 30 Low: 18
# >> Friday    High: 54 Low: 19
# >> Saturday  High: 61 Low: 30
# >> Sunday    High: 39 Low: 28
# >> Today     High: 32 Low: 21
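If the width flags feel opaque, it may help to compare them with String#rjust and String#ljust, which pad in the same way. This sketch also shows that a field width is only a minimum: a value longer than the width is printed in full, not truncated.

```ruby
"%9s" % ["Today"]        # => "    Today"
"Today".rjust(9)         # => "    Today"
"%-9s" % ["Today"]       # => "Today    "
"Today".ljust(9)         # => "Today    "
"%-3s" % ["Wednesday"]   # => "Wednesday"
```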

By the way, there is a shortcut for formatting a string and immediately printing it. Similar to #puts, there is a Kernel#printf method which takes a format string followed by value arguments. Note that unlike #puts, with #printf we have to explicitly supply a newline at the end of the format string.

require "./data"
DAYS.each_with_index do |day, i|
  printf "%-9s High: %d Low: %d\n", day, HIGHS[i], LOWS[i]
end
# >> Tuesday   High: 69 Low: 34
# >> Wednesday High: 63 Low: 28
# >> Thursday  High: 30 Low: 18
# >> Friday    High: 54 Low: 19
# >> Saturday  High: 61 Low: 30
# >> Sunday    High: 39 Low: 28
# >> Today     High: 32 Low: 21
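The same method is available on IO objects, so we can capture the output and look at the newline behavior directly. This sketch writes to a StringIO instead of the terminal; the capture_report helper is hypothetical.

```ruby
require "stringio"

# Hypothetical helper: writes one line with printf and one with puts,
# then returns everything that was written.
def capture_report
  out = StringIO.new
  out.printf("%-9s High: %d\n", "Today", 32)       # newline supplied by us
  out.puts("%-9s High: %d" % ["Sunday", 39])       # newline supplied by puts
  out.string
end

capture_report
# => "Today     High: 32\nSunday    High: 39\n"
```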

So far these temperatures are all exactly two digits in length. Let's change that by converting them to Celsius, using a helper method I put in the oven during the commercial break.

require "./data"
def C(t)
  ((t - 32.0) * 5.0) / 9.0
end
DAYS.each_with_index do |day, i|
  puts "%-9s High: %d Low: %d" % [day, C(HIGHS[i]), C(LOWS[i])]
end
# >> Tuesday   High: 20 Low: 1
# >> Wednesday High: 17 Low: -2
# >> Thursday  High: -1 Low: -7
# >> Friday    High: 12 Low: -7
# >> Saturday  High: 16 Low: -1
# >> Sunday    High: 3 Low: -2
# >> Today     High: 0 Low: -6

This throws off our nice tabular layout. We can get the structure back by specifying a field width which is large enough for the longest value.

require "./data"
def C(t)
  ((t - 32.0) * 5.0) / 9.0
end
DAYS.each_with_index do |day, i|
  puts "%-9s High: %3d Low: %3d" % [day, C(HIGHS[i]), C(LOWS[i])]
end
# >> Tuesday   High:  20 Low:   1
# >> Wednesday High:  17 Low:  -2
# >> Thursday  High:  -1 Low:  -7
# >> Friday    High:  12 Low:  -7
# >> Saturday  High:  16 Low:  -1
# >> Sunday    High:   3 Low:  -2
# >> Today     High:   0 Low:  -6
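One thing worth noticing in this output: %d does not round to the nearest whole number. For a positive float it simply drops the fractional part, so 69 degrees Fahrenheit (about 20.56 Celsius) is shown as 20. If we want the nearest integer, one option is to round explicitly before formatting. A small sketch, with the helper renamed to to_celsius for illustration:

```ruby
# Same conversion as the helper above, under an illustrative name.
def to_celsius(fahrenheit)
  (fahrenheit - 32.0) * 5.0 / 9.0
end

"%3d" % [to_celsius(69.0)]         # => " 20"
"%3d" % [to_celsius(69.0).round]   # => " 21"
```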

While we're messing with the format, let's switch back to showing temperatures as floating point numbers.

require "./data"
def C(t)
  ((t - 32.0) * 5.0) / 9.0
end
DAYS.each_with_index do |day, i|
  puts "%-9s High: %3f Low: %3f" % [day, C(HIGHS[i]), C(LOWS[i])]
end
# >> Tuesday   High: 20.555556 Low: 1.111111
# >> Wednesday High: 17.222222 Low: -2.222222
# >> Thursday  High: -1.111111 Low: -7.777778
# >> Friday    High: 12.222222 Low: -7.222222
# >> Saturday  High: 16.111111 Low: -1.111111
# >> Sunday    High: 3.888889 Low: -2.222222
# >> Today     High: 0.000000 Low: -6.111111

That has thrown things off again. Let's get the number of decimal places under control. By adding a period and another number after the field width, we can specify the decimal precision to be shown.

require "./data"
def C(t)
  ((t - 32.0) * 5.0) / 9.0
end
DAYS.each_with_index do |day, i|
  puts "%-9s High: %3.1f Low: %3.1f" % [day, C(HIGHS[i]), C(LOWS[i])]
end
# >> Tuesday   High: 20.6 Low: 1.1
# >> Wednesday High: 17.2 Low: -2.2
# >> Thursday  High: -1.1 Low: -7.8
# >> Friday    High: 12.2 Low: -7.2
# >> Saturday  High: 16.1 Low: -1.1
# >> Sunday    High: 3.9 Low: -2.2
# >> Today     High: 0.0 Low: -6.1

Now we need to increase the overall field width to accommodate the longer values.

require "./data"
def C(t)
  ((t - 32.0) * 5.0) / 9.0
end
DAYS.each_with_index do |day, i|
  puts "%-9s High: %4.1f Low: %4.1f" % [day, C(HIGHS[i]), C(LOWS[i])]
end
# >> Tuesday   High: 20.6 Low:  1.1
# >> Wednesday High: 17.2 Low: -2.2
# >> Thursday  High: -1.1 Low: -7.8
# >> Friday    High: 12.2 Low: -7.2
# >> Saturday  High: 16.1 Low: -1.1
# >> Sunday    High:  3.9 Low: -2.2
# >> Today     High:  0.0 Low: -6.1
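Here is a self-contained sketch of this final version that runs without the data file. The readings are copied from the earlier output; the constant and method names (DAY_NAMES, HIGHS_F, LOWS_F, to_celsius, weather_rows) are made up for illustration. Note that the width of 4 counts every character in the field, including the minus sign and the decimal point, and that it is only a minimum.

```ruby
DAY_NAMES = %w[Tuesday Wednesday Thursday Friday Saturday Sunday Today]
HIGHS_F   = [69.0, 63.0, 30.0, 54.0, 61.0, 39.0, 32.0]
LOWS_F    = [34.0, 28.0, 18.0, 19.0, 30.0, 28.0, 21.0]

def to_celsius(fahrenheit)
  (fahrenheit - 32.0) * 5.0 / 9.0
end

# Builds one formatted line per day: left-aligned name, then two
# temperatures, each at least 4 characters wide with 1 decimal place.
def weather_rows
  DAY_NAMES.zip(HIGHS_F, LOWS_F).map do |day, high, low|
    "%-9s High: %4.1f Low: %4.1f" % [day, to_celsius(high), to_celsius(low)]
  end
end

puts weather_rows

# A value too long for the field is still printed in full.
"%4.1f" % [-17.8]   # => "-17.8"
```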

At this point we've covered the basics of string formats in Ruby. We now know how to print out numbers in various formats, while controlling field sizes and precision. There's a lot more to learn about string formats, but we'll leave that for another day. Happy hacking!