patterncppMinor
Optimizing digit sum and product with GMP
Viewed 0 times
gmpwithproductoptimizingsumanddigit
Problem
I have a bottleneck in my program at the point where I calculate the digit sum and product of a number. I am using GMP because I have to work with numbers that follow \$1e8 < n < 1e80\$.
To calculate the sum/product of a number, I initially used strings, thinking the division/modulo method would result in an improvement later on.
However, when I later implemented this it turned out that this did not give me a performance improvement.
Instead of:
In the actual program the values of
The test case and the program shows me (with or without optimizations) using the first method is 3x as slow. How can I improve the modulo version to run faster than the string version? Or should I use the string version? If so, can that be optimized?
To calculate the sum/product of a number, I initially used strings, thinking the division/modulo method would result in an improvement later on.
However, when I later implemented this it turned out that this did not give me a performance improvement.
mpz_class num = 1120390317;
mpz_class sum = 0;
mpz_class pro = 1;
while(num != 0) {
mpz_class rem = (num % 10);
sum += rem;
pro *= rem;
num = num / 10;
}Instead of:
mpz_class num = 1120390317;
mpz_class sum = 0;
mpz_class pro = 1;
const std::string strNum = num.get_str();
for (auto c : strNum) {
int res = c - '0';
sum += res;
pro *= res;
}In the actual program the values of
num are actually in the before mentioned range. But they provide results about equal to this test case.The test case and the program shows me (with or without optimizations) using the first method is 3x as slow. How can I improve the modulo version to run faster than the string version? Or should I use the string version? If so, can that be optimized?
Solution
There are a few ways to speed up your division. First, note that because
The key is to do only a single division per loop, but you'll have to resort to using the underlying C function since it's not exposed to the C++ interface. Fortunately, it's simple to do so:
On my machine this is slightly faster than the string method, and probably uses a little less memory.
Update:
Because the speed of that code relative to the string method varies based on what compiler optimizations are used, it's not necessarily always faster. This version uses low-level GMP calls for more speed:
This makes a couple of assumptions that are important. First, it assumes that
Also, note that within your range of integers, the sum of digits will always fit within a
rem is declared within the while loop, it's both constructed and destroyed every loop iteration. That's not really necessary, so you could instead declare it outside the loop and simply assign a new value within the loop each time. Also by changing the line from num = num / 10 to num /= 10, you avoid creating and destroying yet another temporary. Applying both of those saves a little time, but it's still slower than the string method because you're essentially performing the division twice (once for num % 10 and again for num / 10).The key is to do only a single division per loop, but you'll have to resort to using the underlying C function since it's not exposed to the C++ interface. Fortunately, it's simple to do so:
unsigned long digit;
while(num != 0) {
digit = mpz_tdiv_q_ui(num.get_mpz_t(), num.get_mpz_t(), 10);
sum += digit;
pro *= digit;
}On my machine this is slightly faster than the string method, and probably uses a little less memory.
Update:
Because the speed of that code relative to the string method varies based on what compiler optimizations are used, it's not necessarily always faster. This version uses low-level GMP calls for more speed:
__mpz_struct *q = num.get_mpz_t();
while(*(q->_mp_d) != 0) {
auto digit = mpn_divrem_1 (q->_mp_d, (mp_size_t) 0,
q->_mp_d, q->_mp_size, (mp_limb_t) 10);
sum += digit;
pro *= digit;
}This makes a couple of assumptions that are important. First, it assumes that
num > 0 and it dives deep into the internal structure of __mpz_struct so it will not be very durable to changes in GMP, should they decide to change the underlying representation of numbers. However, with this code, I find that it's a bit faster on my machine than the string method, both with and without -O3 optimizations.Also, note that within your range of integers, the sum of digits will always fit within a
long int which will save time, but could equally be applied to the string method.Code Snippets
unsigned long digit;
while(num != 0) {
digit = mpz_tdiv_q_ui(num.get_mpz_t(), num.get_mpz_t(), 10);
sum += digit;
pro *= digit;
}__mpz_struct *q = num.get_mpz_t();
while(*(q->_mp_d) != 0) {
auto digit = mpn_divrem_1 (q->_mp_d, (mp_size_t) 0,
q->_mp_d, q->_mp_size, (mp_limb_t) 10);
sum += digit;
pro *= digit;
}Context
StackExchange Code Review Q#52151, answer score: 4
Revisions (0)
No revisions yet.