Floating point format: why must `1−emax ≤ q+p−1 ≤ emax`?

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

floating-point cs number-formats stackoverflow

Problem

From the Wikipedia page on the IEEE Standard for Floating-Point Arithmetic,

The possible finite values that can be represented in a format are determined by the base (b), the number of digits in the significand (precision, p), and the exponent (q) parameter emax:

...

q must be an integer such that 1−emax ≤ q+p−1 ≤ emax (e.g., if p=7 and emax=96 then q is −101 through 90).

I can't figure out the reasoning behind the above inequality. I would've thought (in my simplicity) that it would be -emax ≤ q ≤ emax or something similar. What am I missing?

Solution

The reason we get a larger range is denormalized numbers. Generally speaking, a binary floating-point number has three physical parts: a sign (1 bit), a mantissa $M$ and an exponent $e$. Most of the time we think of the number as $\operatorname{sgn} \times 1.M \times 2^{e-e_0}$, where the bias is $e_0 = 2^{|e|-1}-1$ and $|e|$ is the number of bits allotted to the exponent (e.g. for single precision it's 127, since the exponent is allotted eight bits). Here "$1.M$" means the number you obtain by writing $M$ as a binary string and prefixing $1.$.
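As a rough illustration of the formula for normal numbers, here is a minimal Python sketch. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 stored mantissa bits); the helper names `decode_float32` and `normal_value` are made up for this example.

```python
import struct

def decode_float32(x: float):
    """Split x (rounded to single precision) into sign, stored exponent e, mantissa M."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    e = (bits >> 23) & 0xFF   # 8 exponent bits
    M = bits & 0x7FFFFF       # 23 stored mantissa bits
    return sign, e, M

def normal_value(sign: int, e: int, M: int) -> float:
    """Evaluate sgn * 1.M * 2^(e - e0) for a normal number (1 <= e <= 254)."""
    e0 = 2 ** (8 - 1) - 1     # bias for 8 exponent bits: 127
    return (-1) ** sign * (1 + M / 2 ** 23) * 2.0 ** (e - e0)

sign, e, M = decode_float32(6.5)
print(sign, e, M, normal_value(sign, e, M))  # 0 129 5242880 6.5
```

Since 6.5 is $1.625 \times 2^2$, the stored exponent is $2 + 127 = 129$ and the mantissa field holds the fractional part $0.625$.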

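For reasons having to do with underflow (non-zero numbers turning to zero), it is important to be able to store numbers very close to zero. These numbers, called denormalized or subnormal numbers, have $e = 0$ and represent $\operatorname{sgn} \times 0.M \times 2^{1-e_0}$. (This means that normal numbers cannot have $e = 0$.) This explains the extended range mentioned in the Wikipedia page: there the significand is an integer of at most $p$ digits, so $q$ is the exponent of its last digit and $q+p-1$ that of its first. Normal numbers have exponents from $1-e_{\max}$ to $e_{\max}$, and a subnormal whose only non-zero digit is the last one reaches down to $q = 1-e_{\max}-(p-1)$, which is $-101$ for $p=7$ and $e_{\max}=96$.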
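To make the numbers concrete, the following sketch computes the range of $q$ allowed by the inequality and checks it against the smallest positive single-precision subnormal. The function names `q_range` and `subnormal_value` are hypothetical, and the single-precision parameters ($p = 24$, $e_{\max} = 127$, 23 stored mantissa bits) are assumptions not stated in the question.

```python
import struct

def q_range(p: int, emax: int):
    """Smallest and largest integer q with 1 - emax <= q + p - 1 <= emax."""
    return (1 - emax) - (p - 1), emax - (p - 1)

def subnormal_value(M: int, mantissa_bits: int = 23, e0: int = 127) -> float:
    """Evaluate 0.M * 2^(1 - e0) for a positive subnormal (stored exponent e = 0)."""
    return (M / 2 ** mantissa_bits) * 2.0 ** (1 - e0)

print(q_range(7, 96))    # (-101, 90), the example quoted from Wikipedia
print(q_range(24, 127))  # (-149, 104), assuming single precision

# The bit pattern 0x00000001 is the smallest positive single-precision value.
smallest = struct.unpack(">f", struct.pack(">I", 1))[0]
print(smallest == subnormal_value(1) == 2.0 ** -149)  # True
```

The lower end of the second range, $-149$, matches the exponent of the smallest subnormal, which is where the extra $p-1$ in the inequality comes from.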

Other values with special encodings are $\pm \infty$, which result from overflow or from dividing a non-zero number by zero, and NaN (not a number), which results from an invalid operation such as $0/0$, taking the logarithm of a negative number, or taking the square root of a negative number. For more details, consult the Wikipedia page regarding the original standard.
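The different encodings can be told apart from the exponent field alone. The sketch below assumes the single-precision layout, in which a stored exponent of all ones (255) is reserved for infinities and NaN; this detail comes from the standard rather than from the answer above, and `classify_float32` and `float32_bits` are illustrative names.

```python
import math
import struct

def float32_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def classify_float32(bits: int) -> str:
    """Classify a single-precision bit pattern by its exponent and mantissa fields."""
    e = (bits >> 23) & 0xFF
    M = bits & 0x7FFFFF
    if e == 0xFF:                      # all-ones exponent: infinity or NaN
        return "infinity" if M == 0 else "nan"
    if e == 0:                         # all-zeros exponent: zero or subnormal
        return "zero" if M == 0 else "subnormal"
    return "normal"

for x in (math.inf, math.nan, 1e-40, 0.0, 1.0):
    print(x, classify_float32(float32_bits(x)))
```

Here `1e-40` lies below the smallest normal single-precision value (about $1.18 \times 10^{-38}$), so it is stored as a subnormal.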

Context

StackExchange Computer Science Q#21930, answer score: 3
