patterncppMinor

Rounding ints division result to certain "granularity"

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

codereview cpp stackoverflow c++algorithm performance integer

resultdivisioncertaingranularityroundingints

Problem

In my program I call "granularity" the "minimum step" for a number. For example, if granularity == 4 then I only work with numbers "0, 4, 8, 12 etc" (so normal integer has granularity == 1).

I need a function to round an integer number to a certain granularity. So for example, if granularity == 3, I want this map:

// I don't use negative number so don't care about negative results
0 -> 0
1 -> 0
2 -> 3
3 -> 3
4 -> 3
5 -> 6
6 -> 6
7 -> 6
8 -> 9
and so on

This is my implementation:

#include 
#include 

using namespace std;

typedef int64_t myDecimal;

myDecimal Round(myDecimal val, myDecimal granularity)
{
    myDecimal rest = val % granularity;
    myDecimal candidat = val / granularity;
    if (2 * rest < granularity) {
        return candidat * granularity;
    } else {
        return (candidat + 1) * granularity;
    }
}

int main()
{
    for (int i = -10; i < 10; i++){
        std::cout << "i = " << i << " " << Round(i, 3) << std::endl;
    }
    return 0;
}

How can you further improve it for better latency as I'm using this in HFT trading?

Solution

Let me start with the usual:

Naming

myDecimal: Usually, the prefix "my" does not bear any sensible information. You should think about the information you want to pass by the types name. (For example Decimal is often the name for decimal fractions and the way in which numbers are stored in it!)

rest: in the context of a modulus this is called remainder

candidat: this is a candidat but where is the other one and which one is it? You should call this something like: flooredValue

Reusability

Everytime that you use this function you have to use the exact type or must cast, which is slow. You should think about making it a template function.

Performance

Most of this is micro optimizations but you seem to be interested in every last bit of time. I don't know how much of this is automatically done by your optimizing compiler:

go to assembler: This is very easy code that can be written in assembler. Especially the division and modulus are two functions that are slow (compared to others) and (at least) on x86 architectures the assembler instruction for division gives you the modulus at the same time. (likely optimized by the compiler)

multiplication by 2: this can be replaced by a left shift by one (very likely optimized by the compiler)

multiplication in return: instead of using a multiplication you can use an addition to return the results (return val - remainder; and return val + granularity - remainder;) (maybe not optimized by the compiler)

Note that all of those "optimizations" could be done automatically by the compiler and even more important might not be optimizations on modern architectures anymore. When it comes to performance you should compare on the assembler level and do extensive profiling.

Context

StackExchange Code Review Q#49310, answer score: 3

Revisions (0)

No revisions yet.