patternModerate

modular multiplication

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

algorithm-analysis cs modular-arithmetic stackoverflow

multiplicationmodularstackoverflow

Problem

I was reading the Modular Multiplication page on wikipedia...and could not understand the algorithm to compute $a \cdot b \pmod{m}$.

uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t m)
{
  long double x;
  uint64_t c;
  int64_t r;
  if (a >= m) a %= m;
  if (b >= m) b %= m;
  x = a;
  c = x * b / m;
  r = (int64_t)(a * b - c * m) % (int64_t)m;
  return r < 0 ? r + m : r;
}

I get that:

$$ a \cdot b \pmod{m} \equiv ((a \mod{m}) \cdot (b \mod{m})) \mod{m}$$

the step I didn't get is c=xb/mand later doing r = (int64_t)(a b - c * m) % (int64_t)m; what is that for?

The article says:

by employing the trick that, by hardware, floating-point
multiplication results in the most significant bits of the product
kept, while integer multiplication results in the least significant
bits kept

I can't see how this is related !!!

Solution

Here's a more compact and mathematical description of what is going on. Let $a$ and $b$ be the input, already reduced modulo $m$, so $a m$ or $r_x+e N$ which means $m^2 > N > m$. Now the multiplication is going to wrap. We'll write $x = ab = z + kN$ for an integer $k N$ in which case you'd get the wrong answer. For $N = 2^{64}$ this means $m > 2^{63}$. (As an aside, if $2m = N$ it causes no issue since we'll end up with $0$.) If we configure the floating point hardware to round down so that $e \leq 0$, this case will not come up. (Though don't forget my restriction that the mantissa can hold $m$!).

Connecting this to most/least significant bits

To specifically address the part that you quoted, consider an integer, $B$, greater than 2 which we'll think of as a base, as in "base $10$". In the usual $B=10$ case, the number $21$ is represented as $21 = 2B+1$. In a floating point representation we'd write this as $2.1\times B^1$. Now let's say we wanted to multiply two (positive) single base-$B$ digit numbers, the result would require at most two digits, say $cB+d$ where (as required by a [standard] base-$B$ representation) $0 \leq c < B$ and $0\leq d < B$. If we're restricted to only being able to store one base-$B$ digit of the result, there are two obvious choices, either $c$ or $d$.

Choosing $d$ is equivalent to working mod $B$ as $cB + d = d\mod B$, this is what happens with integer arithmetic. (Incidentally, at the assembly level, integer multiplication often does produce both of these digits.)

Floating point arithmetic, on the other hand, effectively corresponds to choosing $c$, but compensating that by incrementing the exponent. In effect, we represent the result as $c.d\times B^1$ but since we can store only one base-$B$ digit, this becomes just $c\times B^1$. (In practice, we'll consider numbers as multi-digit numbers in a small base (i.e. 2), rather than 1- or 2-digit numbers in a large base. This allows us to save some of the higher digits of $d$ if they aren't needed to store $c$, but in the worst-case scenario all of $d$ is lost. None of $c$ is lost until we start running out of room in the exponent. For the code above, this is not an issue.)

As long as $m$ can be represented faithfully in the floating point format, the expression $\left\lfloor\frac{ab}{m}\right\rfloor$ can be viewed as extracting that upper digit in base-$m$. You can view the code and math above as the interplay between base-$N$ and base-$m$ representations of a number.

Practicalities

Based on section 5.2.4.2.2 of this draft, the C11 standard appears to only require long double to have a mantissa roughly 33 bits in length. (In particular, it appears to only specify the minimum number of decimal digits that can be faithfully represented.) In practice, most C compilers when targeting general purpose CPUs and particularly x86-family CPUs, will use IEEE754 types. In this case double will effectively have a 53-bit mantissa. x86-family CPUs support an 80-bit format with a mantissa with effectively 64-bits, and several but not all compilers will have long double indicate that when targeting x86. The range of validity of the code depends on these implementation details.

Context

StackExchange Computer Science Q#77016, answer score: 11

Revisions (0)

No revisions yet.