patterncppMinor

Floating point comparison program

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

floatingcomparisonprogrampoint

Problem

I made a small program to compare floating-point values by running multiple tests to a single pair of floats. I have added extra checks for under/overflow and allowing tolerance.

How can I improve it further?

```
#include
#include
#include
#include
#include
#include
#include
#include

template
struct Types {
typedef void int_type;
};

template <>
struct Types {
typedef int int_type;
typedef unsigned int uint_type;
};

template <>
struct Types {
typedef __int64 int_type;
typedef unsigned __int64 uint_type;
};

template
class Float
{
public:
typedef typename Types::uint_type bit_type;
typedef typename T value_type;

static const bit_type bit_count = 8 * sizeof(value_type);
static const bit_type fraction_count = std::numeric_limits::digits - 1;
static const bit_type exponent_count = bit_count - 1 - fraction_count;

static const bit_type sign_mask = static_cast(1) (0) >> (exponent_count + 1);
static const bit_type exponent_mask = ~(sign_mask | fraction_mask);
static const bit_type max_ulps = static_cast(4);

explicit Float(const T& x) { value = x; }

const value_type &data_float() const { return value; }
const bit_type &data_bits() const { return bit; }

bit_type exponent_bits() const { return (exponent_mask & bit); }
bit_type sign_bits() const { return sign; }
bit_type fraction_bits() const { return fraction; }

bool is_infinity()
{
return ((bit & ~sign_mask) == exponent_mask);
}

bool is_nan() const {
bool nan = true;
nan &= (exponent_mask & bit) == exponent_mask;
nan &= (fraction_mask & bit) != static_cast(0);
return nan;
}

static bit_type to_biased(bit_type bits) {
return (sign_mask & bits) ? (~bits + 1) : (sign_mask | bits);
}

static bit_type distance(bit_type bits1, bit_type bits2) {
const bit_type biased1 = to_biased(bits1);
const bit_type biased2 = to_biased(bits2);
retu

Solution

A good effort with lots of useful debug to see the inner working of the task.

Biggest design concern: lack of control on how "almost" almost_equals() operates. I'd expect a 3rd, perhaps optional, parameter to gauge "closeness".

From a design stand-point, code is mixing absolute and relative in determining the the return of almost_equals() and is incapable of returning a sensible result based on just one of those. Perhaps 2 functions:

almost_equals_rel(value_type a, value_type b, uint_type ulp_abs_difference);
almost_equals_abs(value_type a, value_type b, value_type max_abs_difference);

This and that:

-
Code uses a fixed tolerance, yet uses templated code. A value of 0.000001 is quite arbitrary and not justified in code. Might make sense for float, yet not double. IMO, the tolerance assessment detracts from this code.

// Why the magic number 0.000001?
const T tolerance = static_cast(0.000001);

-
-0.0 has the same value of 0.0, yet incorrectly fails almost_equals() due to differ signs.

-
Values of 7, 10 and 15 throughout code should 1) be driven by a common named constants and 2) not a magic number. IMO, for typical float and double I would expect 9 and 17 significant digits printed to allow round-tripping the text back to the FP variable. (Of course this is mostly debug code.)

-
bool nan = (f1.is_nan() ^ f2.is_nan()) == 1; goes against the spirit of IEEE math. Even if both f1,f2 are both NaNs with the same bit pattern, they are not equal.

-
Unsigned fixed width types makes sense for double. I'd use int64_t int_type rather than __int64 int_type Potable code should use fixed width for struct Types

struct Types {
  //typedef int int_type;
  //typedef unsigned int uint_type;
  typedef int32_t int_type;
  typedef uint32_t uint_type;
};

-
For code to function correctly, it assumes the endian of integers and FP is the same. Certainly common, but not specified and exceptions exists.

union {
  value_type value;
  bit_type bit;

-
Further, code assumes the bit packing/bit fields are in the desired order.

struct {
    bit_type fraction : fraction_count;
    bit_type exponent : exponent_count;
    bit_type sign : 1;
};
// or should it be  
struct {
    bit_type sign : 1;
    bit_type exponent : exponent_count;
    bit_type fraction : fraction_count;
};

-
A comment that applies to C, unsure about C++: bit fields are well defined for integer types unsigned, signed int and maybe bool. So a bit field of type __int64 may lack portability. Higher portable code would use shifts and masks rather than bit fields to control endian, range and padding issues.

-
int main() only exercises float. Recommend adding a double example. Also compare values a,b that are both about 1e30 or 1e-30. The test set is much to small.

Code Snippets

almost_equals_rel(value_type a, value_type b, uint_type ulp_abs_difference);
almost_equals_abs(value_type a, value_type b, value_type max_abs_difference);

// Why the magic number 0.000001?
const T tolerance = static_cast<T>(0.000001);

struct Types<4> {
  //typedef int int_type;
  //typedef unsigned int uint_type;
  typedef int32_t int_type;
  typedef uint32_t uint_type;
};

union {
  value_type value;
  bit_type bit;

struct {
    bit_type fraction : fraction_count;
    bit_type exponent : exponent_count;
    bit_type sign : 1;
};
// or should it be  
struct {
    bit_type sign : 1;
    bit_type exponent : exponent_count;
    bit_type fraction : fraction_count;
};

Context

StackExchange Code Review Q#153906, answer score: 5

Revisions (0)

No revisions yet.