patternpythonMajor
Checking equality of integers: O(1) in C but O(log n) in Python 3?
Viewed 0 times
equalitylogbutcheckingpythonintegers
Problem
Consider these equivalent functions in C and Python 3. Most devs would immediately claim both are $O(1)$.
But consider what is happening under the surface. Integers are just binary strings and, to determine equality, both languages will compare the strings bit-by-bit. In either case this scan is $O(b)$ where $b$ is the number of bits. Since integers have a constant size in bits in C, this is simply $O(1)$.
EDIT: C doesn't compare bit-by-bit see this answer
In Python 3 however, integers do not have fixed size and the scan remains $O(b)$ for the number of bits in the input, or $O(\log a)$ where $a$ is the value of the input in base 10.
So if you're analyzing code in Python, any time you compare two integers, you are embarking on a surprisingly complex journey of $O(\log n)$ with respect to the base 10 value of either number.
For me this raises several questions:
EDIT: It is easily verified (and intuitive) that Python cannot compare arbitrarily large ints in constant time. So a better way to ask question 1 above might be "What (if any) is the justification for calling this operation $O(1)$? Because it's pragmatic? Conventional? Implied by the RAM model?
def is_equal(a: int, b: int) -> bool:
return a == b
int is_equal(int a, int b) {
return a == b;
}
But consider what is happening under the surface. Integers are just binary strings and, to determine equality, both languages will compare the strings bit-by-bit. In either case this scan is $O(b)$ where $b$ is the number of bits. Since integers have a constant size in bits in C, this is simply $O(1)$.
EDIT: C doesn't compare bit-by-bit see this answer
In Python 3 however, integers do not have fixed size and the scan remains $O(b)$ for the number of bits in the input, or $O(\log a)$ where $a$ is the value of the input in base 10.
So if you're analyzing code in Python, any time you compare two integers, you are embarking on a surprisingly complex journey of $O(\log n)$ with respect to the base 10 value of either number.
For me this raises several questions:
- Is this correct? I haven't seen anyone else claim that Python compares ints in log time.
- In the context of conducting an interview, should you notice or care if a candidate calls this $O(1)$?
- Should you notice or care about this distinction in the real world?
EDIT: It is easily verified (and intuitive) that Python cannot compare arbitrarily large ints in constant time. So a better way to ask question 1 above might be "What (if any) is the justification for calling this operation $O(1)$? Because it's pragmatic? Conventional? Implied by the RAM model?
Solution
Integers are just binary strings and, to determine equality, both languages will compare the strings bit-by-bit.
Not quite. C
If at least one of numbers can be bounded by $2^{30d}$ for any fixed $d$, the comparison is $O(1)$ (because the number of digits is compared first), and if they can't, other operations are likely of much more concern than equality comparison. So in practice I'd say it's very unlikely to matter and if it does you'll know (and would be using not
Not quite. C
ints are machine-word-sized and compared with a single machine instruction; Python ints are represented in base $2^{30}$ (see e.g. https://rushter.com/blog/python-integer-implementation/) and compared digit-by-digit in that base. So the relevant base of the logarithm is $2^{30}$.If at least one of numbers can be bounded by $2^{30d}$ for any fixed $d$, the comparison is $O(1)$ (because the number of digits is compared first), and if they can't, other operations are likely of much more concern than equality comparison. So in practice I'd say it's very unlikely to matter and if it does you'll know (and would be using not
ints but something like the GNU Multiple Precision Arithmetic Library in C as well).Context
StackExchange Computer Science Q#127939, answer score: 25
Revisions (0)
No revisions yet.