patternpythonModerate

Converting from Roman numerals to Arabic numbers

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

arabicromannumbersnumeralsconvertingfrom

Problem

I got an idea to make a Roman/Arabic number converter, and I would like to hear your thoughts, where I could improve it, if there are Python specific things that I missed, so on and so forth.

# Decodes roman numerals
def get_arabic_numbers(t):
    # Hold the numbers in the list
    l = list()

    # The format
    format = 0

    # Convert from roman numeral to arabic number
    for i in range(0, len(t)):
        if t[i] == 'I':
            l.append(int(1))
        elif t[i] == 'V':
            l.append(int(5))
        elif t[i] == 'X':
            l.append(int(10))
        elif t[i] == 'L':
            l.append(int(50))
        elif t[i] == 'C':
            l.append(int(100))
        elif t[i] == 'D':
            l.append(int(500))
        elif t[i] == 'M':
            l.append(int(1000))

    # Calculate the format as follows:
    i = 0
    while i < len(l) - 1:
        # If the next numeral is greater than the current one, add the next one minus the current one = format = format + next_numeral - current_numeral
        if l[i] < l[i+1]:
            format = format + l[i+1] - l[i];
            i = i + 2
        # Else add it normally
        else:
            format = format + l[i]
            i = i + 1

    # Fix in case the last two numerals are equal
    if i == len(l):
        if l[i-2] == l[i-1]:
            format = format + l[i-1]
    else:
        format = format + l[i]

    # Return the format
    return format

```
# Encode an arabic digit to a roman numeral
def encode(l, c, one, five, nine):
# The format to be returned
format = ""

# Special case 5.1: 5 has its own numeral 'V'
# Special case 5.2: 50 has its own numeral 'A'
# Special case 5.3: 500 has its own numeral 'D'
if l[c] == 5:
format = format + five
elif l[c] 5:
# Special case 9.1: 9 depends on 10, so it's 'IX'
# Special case 9.2: 90 depends on 1000, so it's 'XC'
# Special case 9.3: 900 depends on 1000, so it's 'CM'
if l[c] == 9:

Solution

Impressions

That's a very long solution. The comments were generally helpful, but the code itself could have been easier to understand if it were shorter and more expressive.

The variable names were generally not helpful: l, i, t, c, li were all cryptic.

get_arabic_numbers(t) is not really named appropriately:

"get" implies retrieving something that already exists. This is more of a calculation.

Why is "numbers" plural?

Why is the parameter named t — what does it stand for?

Strictly speaking, the result is an int. It's not Arabic, nor is it even in base ten. It's just an abstract integer, which is conventionally rendered for you as base-ten Arabic digits when str() is called on it (whether explicitly or implicitly). (And by "Arabic", we mean 0123456789, not ٠١٢٣٤٥٦٧٨٩.)

Based on these concerns, I'd call it decode_roman_numeral(roman). It happens to be what you wrote as the comment! Similarly, I'd call the inverse function encode_roman_numeral(num).

Arabic → Roman

# The format
format = 0

That's not what I would call a "format" — I would call it a "result".

# Hold the numbers in the list
l = list()
…

# Convert from roman numeral to arabic number
for i in range(0, len(t)):
    if t[i] == 'I':
        l.append(int(1))
    elif t[i] == 'V':
        l.append(int(5))
    elif t[i] == 'X':
        l.append(int(10))
    elif t[i] == 'L':
        l.append(int(50))
    elif t[i] == 'C':
        l.append(int(100))
    elif t[i] == 'D':
        l.append(int(500))
    elif t[i] == 'M':
        l.append(int(1000))

As I mentioned before, l is a cryptic name. There is no need to convert 1 to int(1). Lookups are typically done in Python using a dictionary. Also, whenever you populate a list by creating an empty list and appending to it repeatedly, that is a good candidate for a list comprehension:

trans = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}
values = [trans[r] for r in roman]

# Calculate the format as follows:
i = 0
while i < len(l) - 1:
    # If the next numeral is greater than the current one, add the next one minus the current one = format = format + next_numeral - current_numeral
    if l[i] < l[i+1]:
        format = format + l[i+1] - l[i];
        i = i + 2
    # Else add it normally
    else:
        format = format + l[i]
        i = i + 1

Do you really need a case that advances i by two? How about this instead:

result = 0
for i in range(0, len(l) - 1):
    if l[i] < l[i+1]:
        result -= l[i]
    else:
        result += l[i]

As I mentioned before, the code could be more expressive. This loop is summing values in a list, so we should use the sum() built-in function to say what we mean.

result = sum(
    val if val >= next_val else -val
    for val, next_val in zip(values[:-1], values[1:])
)

All together, then, I'd condense your function down to this:

def decode_roman_numeral(roman):
    """Calculate the numeric value of a Roman numeral (in capital letters)"""
    trans = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}
    values = [trans[r] for r in roman]
    return sum(
        val if val >= next_val else -val
        for val, next_val in zip(values[:-1], values[1:])
    ) + values[-1]

Arabic → Roman

The encode() function encodes a single Arabic digit, so encode_digit() would be a better name. There is no point in passing l and c separately, since you only ever care about l[c]. Instead of looping to append characters to format, use the * operator to construct a repeated string.

I would write this helper function using a single expression with four cases:

def encode_digit(digit, one, five, nine):
    return (
        nine                     if digit == 9 else
        five + one * (digit - 5) if digit >= 5 else
        one + five               if digit == 4 else
        one * digit              
    )

The get_roman_numerals() function is rather convoluted. I had a hard time following the code, so I went by your comments instead. In the end, I figured out that it was essentially doing this:

def encode_roman_numeral(num):
    num = int(num)
    return (
        'M' * (num // 1000) +
        encode_digit((num // 100) % 10, 'C', 'D', 'CM') +
        encode_digit((num //  10) % 10, 'X', 'L', 'XC') +
        encode_digit( num         % 10, 'I', 'V', 'IX') 
    )

Code Snippets

# The format
format = 0

# Hold the numbers in the list
l = list()
…

# Convert from roman numeral to arabic number
for i in range(0, len(t)):
    if t[i] == 'I':
        l.append(int(1))
    elif t[i] == 'V':
        l.append(int(5))
    elif t[i] == 'X':
        l.append(int(10))
    elif t[i] == 'L':
        l.append(int(50))
    elif t[i] == 'C':
        l.append(int(100))
    elif t[i] == 'D':
        l.append(int(500))
    elif t[i] == 'M':
        l.append(int(1000))

trans = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}
values = [trans[r] for r in roman]

# Calculate the format as follows:
i = 0
while i < len(l) - 1:
    # If the next numeral is greater than the current one, add the next one minus the current one = format = format + next_numeral - current_numeral
    if l[i] < l[i+1]:
        format = format + l[i+1] - l[i];
        i = i + 2
    # Else add it normally
    else:
        format = format + l[i]
        i = i + 1

result = 0
for i in range(0, len(l) - 1):
    if l[i] < l[i+1]:
        result -= l[i]
    else:
        result += l[i]

Context

StackExchange Code Review Q#141402, answer score: 13

Revisions (0)

No revisions yet.