Euler #17: Number Letter Counts

If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are $3+3+5+4+4=19$ letters used in total.
If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?
NOTE: Do not count spaces or hyphens. For example, 342 (three hundred and forty-two) contains 23 letters and 115 (one hundred and fifteen) contains 20 letters. The use of “and” when writing out numbers is in compliance with British usage.
https://projecteuler.net/problem=17

First, we need a function that converts an integer to its written English representation. Let’s start with 1-19. Their English words can be stored as a dictionary called integer_words. Its keys are the digits and the values are English words.

integer_words = {0: "", 1: "one", 2: "two", 3: "three", 4: "four", 
		 5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine",
		 10: "ten", 11: "eleven", 12: "twelve", 13: "thirteen", 14: "fourteen",
		 15: "fifteen", 16: "sixteen", 17: "seventeen", 18: "eighteen", 19: "nineteen"}

Beyond 19, English representations become more regular. For numbers from 20 to 99, the rule is the tens word plus the ones word, separated by a hyphen. So, let’s add words for 30, 40, …, 90.

integer_words += {20: "twenty", 30: "thirty", 40: "forty", 50: "fifty", 
                  60: "sixty", 70: "seventy", 80: "eighty", 90: "ninety"}

Now we need logic to decide how the number’s English representation should be formed. For 1-19, it’s straightforward – just look it up in the dictionary. For 20-99, the tens digit is multiplied by 10 and used as a key. The ones digit is also a key. The values for these keys are concatenated together with a hyphen.

Once the number exceeds 99, adding the hundreds word is easy. Just take the hundreds digit and use it as a key then concatenate the value with ” hundred”. The tens and ones part are fed back into the function recursively.

def get_english_representation(n):
	"""Returns the English representation of a number, e.g. 35 becomes thirty-five"""

	assert isinstance(n, int)

	if n == 0:
		return ''
	elif n <= 20:
		return integer_words[n]
	elif n <= 99 and n % 10 != 0:
		tens_number = int(str(n)[0]) * 10
		ones_number = int(str(n)[1])
		return integer_words[tens_number] + '-' + integer_words[ones_number]
	elif n <= 99 and n % 10 == 0:
		return integer_words[n]
	elif n <= 999:
		hundreds_digit = int(str(n)[0])
		if n % 100 == 0:
			return integer_words[hundreds_digit] + ' hundred'
		else:
			truncated = int(str(n)[1:])
			return integer_words[hundreds_digit] + ' hundred and ' + get_english_representation(truncated)
	else:
		return 'Not supported'

Now that we have a function that translates a number to its English representation, we can move on to the next sub-problem: counting letters in the representation. It’s not as simple as len(string) because the problem instructs us to exclude spaces and hyphens. That is remedied by subtracting the count of spaces and hyphens.

def count_letters(string):
    """Returns the number of characters in a string, excluding spaces and hyphens"""
    return len(string) - string.count(' ') - string.count('-')

The problem finally can be solved by iterating from 1 to 999, converting to English, counting letters, and incrementing a total.

letters = 0
for i in range(1, 1000):
	letters += count_letters(get_english_representation(i))

Silly me, I forgot that the problem asks for 1-1000 inclusive. Oh well, I just manually counted the letters in “one thousand” and added it to the output. Ssh, don’t tell! The result is $21124$ .

Now, all this is reinventing the wheel. Why not use the work done by others? inflect is an excellent third-party package for dealing with natural language. All of the above code can be reduced to just a few lines.

import inflect
p = inflect.engine()
total = 0
for i in range(1, 1001):
    word = p.number_to_words(i)
    total += len(word) - word.count(' ') - word.count('-')
print(total)