Integer To String

Video transcript & code

Here's a question for you: how do you convert an integer number to a decimal string representation?

Silly question, right? That's something you learn on your first day of writing Ruby code.

You just send the #to_s message to the number.

23.to_s                         # => "23"

That's all there is to it, right?

Well sure, from the perspective of a Ruby user. But what does the computer have to do? It can be fun and instructive every now and then to think about how the computer goes about performing the basic operations we take for granted.

We've done one or two episodes on "low-level" stuff like this before. But this time, we won't be looking at the Ruby C code for enlightenment. Instead, we're going to try and think through the problem without any hints.

So, let's start with a number. How about 7.

number = 7

Of course, when the computer is dealing with the number 7, it doesn't think of it in terms of the Arabic numeral 7 that we see here.

Just to drive this point home, let's use the binary literal representation for the number instead.

number = 0b0111                 # => 7

I've stuck an extra zero at the front just to make it a nice even 4 bits. That's half of one byte, otherwise known as one "nibble". No, I didn't just make that term up. You can look it up on Wikipedia if you don't believe me.

Anyway. How do we translate this integer from the computer's internal representation to a text string? Well, first we have to think about what we mean by a "text string". For today's purposes, we'll define that as a string encoded using the ASCII character set.

In the ASCII character set, the Arabic numeral 0 is encoded as the value 48, 1 is encoded as 49, 2 as 50, and so on.

"0".codepoints[0]               # => 48
"1".codepoints[0]               # => 49
"2".codepoints[0]               # => 50

In order to discover the ASCII encoding of the digits, we're asking the strings for their codepoints, which are the numeric encoding values for each character.
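As a quick sanity check, here is a small sketch that lists the codes for all ten digit characters at once. It uses `String#ord`, which is my substitution for `codepoints[0]`; for a one-character string the two return the same value.

```ruby
# Collect the code of each digit character "0" through "9".
digit_codes = ("0".."9").map(&:ord)
digit_codes                     # => [48, 49, 50, 51, 52, 53, 54, 55, 56, 57]

# For a single character, #ord matches the first element of #codepoints.
same = "7".ord == "7".codepoints[0]
same                            # => true
```

The ten digits occupy one unbroken run of codes, from 48 through 57.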

So in order to convert a single-digit number like 7 to an ASCII string representation which can then be displayed, the computer has to map the internal representation of the number to the corresponding ASCII encoding.

Since we only have to do this for 10 individual numerals, we can implement this with a hash mapping numbers to codepoints.

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}

Once again, we are using the binary representation of the original integer values in order to emphasize the difference between these internal values and the ASCII encoding needed to construct a printable string.
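Typing the table out by hand keeps the mapping explicit, but the same hash could also be built programmatically. What follows is only a sketch of one alternative, not the approach taken in this episode; the name `zero_code` is made up for illustration.

```ruby
# Code of the character "0"; the other nine digits follow it in order.
zero_code  = "0".ord            # => 48

# Pair each integer 0..9 with the code of its digit character.
codepoints = (0..9).map { |digit| [digit, zero_code + digit] }.to_h

codepoints[0b0111]              # => 55
codepoints.size                 # => 10
```

This relies on the digits being contiguous in ASCII, which the hand-written table does not need to assume.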

So in order to get the ASCII-encoded representation of the number 7, we look it up in our codepoints map. We get the codepoint 55.

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number = 0b0111                 # => 7
codepoints[number]              # => 55

How do we get Ruby to interpret this in a string context instead of as a number? For that, we can send the handy #chr message to the number. The result is a single-character string.

This method is short for "character", but I'll pronounce it phonetically as "chr".

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number = 0b0111                 # => 7
codepoint = codepoints[number]  # => 55
char = codepoint.chr
# => "7"
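To make the relationship between the two directions concrete, here is a tiny sketch showing that `Integer#chr` and `String#ord` undo each other for these values.

```ruby
codepoint = 55
char      = codepoint.chr       # => "7"
char.class                      # => String
char.ord                        # => 55
```

So `#chr` takes us from a codepoint to a one-character string, and `#ord` takes us back to the codepoint.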

This code works to convert any of the integers from zero through nine.

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number = 0b0010                 # => 2
codepoint = codepoints[number]  # => 50
char = codepoint.chr
# => "2"

But what if we give it a larger number, like, say, 12?

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number = 0b1100                 # => 12
codepoint = codepoints[number]  # => nil
char = codepoint.chr
# =>

# ~> NoMethodError
# ~> undefined method `chr' for nil:NilClass
# ~>
# ~> xmptmp-in794099X.rb:16:in `<main>'

This is where things break down. Our map doesn't have an entry for 12, and so there's nothing to send the #chr message to.
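As an aside, if we wanted the failure to point at the missing table entry rather than at a `nil` further down, one option is `Hash#fetch`, which raises a `KeyError` for an unknown key. This is just a sketch of that alternative; it makes the error clearer but does nothing to fix the underlying limitation.

```ruby
codepoints = (0..9).map { |digit| [digit, 48 + digit] }.to_h
number     = 0b1100             # => 12

# Capture the exception so we can inspect it instead of crashing.
error = begin
  codepoints.fetch(number)
  nil
rescue KeyError => e
  e
end

error.class                     # => KeyError
```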

We can't very well keep extending our codepoints map up to an infinite number of integers. Maybe if we could break this number into its individual digits, we could then make use of the codepoints map for each one in turn.

How can we do that? Well, let's think back to our basic arithmetic classes. The rightmost digit is the "ones place", and the next digit over is the "tens place". How do we break apart the ones and the tens parts of this number?

To get just the ones, we can divide the number by 10 and take the remainder - which is the "modulo" operation, specified with the percent sign in Ruby.

number = 0b1100                 # => 12
# ...
remainder = number % 10         # => 2

We could map this remainder into the appropriate codepoint, and then convert this codepoint into a string representation.

codepoint = codepoints[remainder] # => 50
result    = codepoint.chr         # => "2"

Next we could isolate the tens place by dividing the number by 10, but this time taking the quotient instead of the remainder. Ruby's integer division discards the fractional part, so 12 / 10 is 1. Once again, we convert this into its ASCII codepoint equivalent.

tens      = number / 10           # => 1
codepoint = codepoints[tens]      # => 49

Now, how to add this new codepoint to our result? To do that, we prepend the character representation of the codepoint onto the result string.

result.prepend(codepoint.chr)     # => "12"

And there's our string representation of the number 12.
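The quotient and the remainder can also be had in a single step with `Integer#divmod`. Here is a sketch of the same two-digit conversion written that way; the variable name `ones` is mine, and the table is generated rather than typed out only to keep the example short.

```ruby
codepoints = (0..9).map { |digit| [digit, 48 + digit] }.to_h
number     = 0b1100                       # => 12

# divmod returns [quotient, remainder] in one call.
tens, ones = number.divmod(10)            # => [1, 2]

result = codepoints[ones].chr             # => "2"
result.prepend(codepoints[tens].chr)      # => "12"
```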

Does this code still work for lower numbers, like 7?

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number    = 0b0111                # => 7
remainder = number % 10           # => 7
codepoint = codepoints[remainder] # => 55
result    = codepoint.chr         # => "7"
tens      = number / 10           # => 0
codepoint = codepoints[tens]      # => 48
result.prepend(codepoint.chr)     # => "07"

Well, sort of. Because this code is now written to expect a number larger than 9, we get an extraneous zero in the tens place.

To fix this, we can introduce a conditional. It will only process the tens place if the number given for conversion is greater than 9.

codepoints = {
  0b0000 => 48,
  0b0001 => 49,
  0b0010 => 50,
  0b0011 => 51,
  0b0100 => 52,
  0b0101 => 53,
  0b0110 => 54,
  0b0111 => 55,
  0b1000 => 56,
  0b1001 => 57,
}
# => {0=>48, 1=>49, 2=>50, 3=>51, 4=>52, 5=>53, 6=>54, 7=>55, 8=>56, 9=>57}
number    = 0b0111                # => 7
remainder = number % 10           # => 7
codepoint = codepoints[remainder] # => 55
result    = codepoint.chr         # => "7"
if number > 9                     # => false
  tens      = number / 10
  codepoint = codepoints[tens]
  result.prepend(codepoint.chr)
end
result                          # => "7"

Now we have our original conversion result for seven back.
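To check the code so far against more than a couple of inputs, we could wrap it in a method and compare it with `#to_s`. This is a minimal sketch: the method name `two_digit_to_s` and the constant `CODEPOINTS` are hypothetical, and it deliberately handles only 0 through 99, just like the code above.

```ruby
CODEPOINTS = (0..9).map { |digit| [digit, 48 + digit] }.to_h

# Hypothetical helper: converts an integer from 0 through 99 to a string.
def two_digit_to_s(number)
  # Ones place first.
  result = CODEPOINTS[number % 10].chr
  # Only process the tens place when there is one.
  result.prepend(CODEPOINTS[number / 10].chr) if number > 9
  result
end

two_digit_to_s(7)                                   # => "7"
two_digit_to_s(12)                                  # => "12"
(0..99).all? { |n| two_digit_to_s(n) == n.to_s }    # => true
```

Anything from 100 up still fails, because the "tens" value is then 10 or more and has no entry in the table.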

OK, so, originally I intended to complete this topic in one episode. But it turned out a lot longer than I expected. So I'm going to cut this off here for now. Tune in next time for the thrilling conclusion. We'll make our solution more general and more elegant, and then talk a little about the process of working through algorithmic problems like this one. Finally, we'll compare our solution to the one in the Ruby source code. Until then: happy hacking!
