
C++ implementation of Java's floatToIntBits() and intBitsToFloat()

Submitted by: @import:stackexchange-codereview · Mar 10, 2026

codereview cpp stackoverflow c++ c++11 floating-point

Problem

I am trying to implement Java's floatToIntBits() and intBitsToFloat() methods in C++. The latter is the inverse of the former, and the purpose of the former is to pack a 32-bit floating-point number into a portable 32-bit integer format. More specifically, float2ul(), my replacement for floatToIntBits(), aims to meet the following requirements:

It should be portable across platforms that have 8-bit char types and support IEEE floating point arithmetic.

It returns a representation of the specified floating-point value according to the IEEE 754 floating-point "single format" bit layout.

Bit 31 (the bit that is selected by the mask 0x80000000UL) represents the sign of the floating-point number (0 means positive and 1 means negative). Bits 30-23 (the bits that are selected by the mask 0x7f800000UL) represent the exponent. Bits 22-0 (the bits that are selected by the mask 0x007fffffUL) represent the significand of the floating-point number.

If the argument is positive infinity, the result is 0x7f800000UL.

If the argument is negative infinity, the result is 0xff800000UL.

If the argument is NaN, the result is 0x7fc00000UL. C++ does not seem to offer a standard way to distinguish a quiet NaN from a signaling NaN, so all NaNs here are encoded with the same value.

In all cases, if the float type has exactly 32 bits, the result is an unsigned long that, when given to ul2float(), my replacement for Java's intBitsToFloat() method, produces a floating-point value equal to the original argument of float2ul() (except that all NaN values are collapsed to the single "canonical" NaN value 0x7fc00000UL).

If the size of the float type on the target platform is greater than 32 bits, the caller is responsible for checking whether the number is within the range of finite 32-bit floating-point numbers. If x falls inside the range of 32-bit floating-point numbers but its precision is too high to fit in 32 bits, the statement `ul2float(float2ul(x)) -
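The requirements listed above can be sketched with plain arithmetic from `<cmath>`, so that nothing depends on how a float is laid out in memory. This is not the asker's code (which is not shown here), only one possible approach under stated assumptions: the function names float2ul() and ul2float() are taken from the question, the bodies are hypothetical, double is assumed to have enough range and precision to hold every 32-bit float value exactly, and out-of-range finite inputs are left to the caller as described above.

```cpp
#include <cmath>
#include <limits>

// Hypothetical sketch: encode x in the IEEE 754 "single format" bit layout
// using arithmetic only. Assumes |x| is representable as a 32-bit float;
// range checking is the caller's job.
unsigned long float2ul(float x)
{
    if (std::isnan(x))
        return 0x7fc00000UL;                        // every NaN maps to the canonical NaN
    const unsigned long sign = std::signbit(x) ? 0x80000000UL : 0UL;
    if (std::isinf(x))
        return sign | 0x7f800000UL;                 // +/- infinity
    if (x == 0.0f)
        return sign;                                // +0 or -0

    const double mag = std::fabs(static_cast<double>(x));
    int e = 0;
    const double m = std::frexp(mag, &e);           // mag == m * 2^e, m in [0.5, 1)
    const int biased = e + 126;                     // mag == (2m) * 2^(biased - 127)
    if (biased >= 1) {
        // Normal number: store the 23 fraction bits of 2m (the leading 1 is implicit).
        const unsigned long frac =
            static_cast<unsigned long>(std::ldexp(2.0 * m - 1.0, 23));
        return sign | (static_cast<unsigned long>(biased) << 23) | frac;
    }
    // Subnormal number: mag == frac * 2^-149 with a zero exponent field.
    return sign | static_cast<unsigned long>(std::ldexp(mag, 149));
}

// Hypothetical sketch of the inverse: decode the three bit fields.
float ul2float(unsigned long bits)
{
    const bool negative = (bits & 0x80000000UL) != 0;
    const int biased = static_cast<int>((bits & 0x7f800000UL) >> 23);
    const unsigned long frac = bits & 0x007fffffUL;

    double mag = 0.0;
    if (biased == 0xff) {
        if (frac != 0)
            return std::numeric_limits<float>::quiet_NaN();
        mag = std::numeric_limits<double>::infinity();
    } else if (biased == 0) {
        mag = std::ldexp(static_cast<double>(frac), -149);   // zero or subnormal
    } else {
        // Restore the implicit leading 1, then scale by 2^(biased - 127 - 23).
        mag = std::ldexp(static_cast<double>(frac | 0x00800000UL), biased - 150);
    }
    return static_cast<float>(negative ? -mag : mag);
}
```

For example, under these assumptions float2ul(1.0f) yields 0x3f800000UL and float2ul(-2.5f) yields 0xc0200000UL, and ul2float() maps both back to the original values. On platforms where float is known to be exactly 32-bit IEEE 754, copying the bytes with std::memcpy is a simpler alternative, at the cost of the portability the question asks for.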

Solution

I can't really speak to the portability of such code, as I honestly have no experience with anything other than IEEE 754. Since you are targeting C++11, though, you can simplify your compile-time calculation of powers of two by using constexpr:

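// Computes d * 2^i by tail recursion; i must be non-negative,
// otherwise the recursion never reaches the i == 0 base case.
constexpr double pow_two(int i, double d=1)
{
    return (i == 0) ? d : pow_two(i-1, d * 2.0);
}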

Which when called like so:

constexpr auto x = pow_two(7);

will be evaluated at compile time.
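One way to check that the evaluation really happens at compile time is to use the result in a static_assert, which only accepts constant expressions. The sketch below repeats pow_two() from above so that it stands alone; pow_two_signed() is a hypothetical helper added here for illustration, since pow_two() as written does not terminate for negative exponents.

```cpp
// Same function as in the answer: computes d * 2^i for i >= 0.
constexpr double pow_two(int i, double d = 1)
{
    return (i == 0) ? d : pow_two(i - 1, d * 2.0);
}

// Hypothetical helper (not part of the answer): also accepts negative exponents.
constexpr double pow_two_signed(int i)
{
    return (i >= 0) ? pow_two(i) : 1.0 / pow_two(-i);
}

// These checks fail to compile if the values are wrong or are not
// constant expressions.
static_assert(pow_two(0) == 1.0, "2^0 should be 1");
static_assert(pow_two(7) == 128.0, "2^7 should be 128");
static_assert(pow_two(23) == 8388608.0, "2^23 should be 8388608");
static_assert(pow_two_signed(-3) == 0.125, "2^-3 should be 0.125");
```

The single-return, recursive form is what C++11 requires of a constexpr function; loops inside constexpr functions are only allowed from C++14 onward.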

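Also, avoid names like _p (starting with _). The rules about which identifiers are reserved for the implementation are arcane (a leading underscore followed by an uppercase letter and any double underscore are always reserved, and any leading underscore is reserved in the global namespace), so it's best to just avoid prefixing anything with _.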

Context

StackExchange Code Review Q#62588, answer score: 3
