patternpythonMinor
Number of divisors of a factorial
Viewed 0 times
factorialnumberdivisors
Problem
I was solving the DIVFACT problem from Sphere Online Judge:
Given a number, find the total number of divisors of the factorial of the number.
Since the answer can be very large, print the answer modulo \$10^9+7\$.
Input
The first line contains \$T\$, the number of test cases
\$T\$ lines follows each containing the number \$N\$.
Constraints:
\$ 1 \le T \le 500\$
\$ 0 \le N \le 50000\$
To solve this I took the following approach: I first calculated the primes up to 50000 using a sieve, then for each prime \$p\$ computed the power of \$p\$ in \$N!\$ given by:
$$ \mathrm{power}(p) = \frac{n}{p} + \frac{n}{p^2} + \frac{n}{p^3} + \ldots $$
… until I got a \$p^\mathrm{power}\$ greater than \$N\$.
Finally I computed the answer for each prime \$p\$:
$$ (\mathrm{power}(p_1)+1) \cdot (\mathrm{power}(p_2)+1) \cdot \ldots \cdot (\mathrm{power}(p_k)+1)$$
assuming there are \$k\$ prime numbers less than 50000.
but this code and algo is not fast enough to pass the test cases in the given time limit.Is there any way to improve this code to make it faster
Given a number, find the total number of divisors of the factorial of the number.
Since the answer can be very large, print the answer modulo \$10^9+7\$.
Input
The first line contains \$T\$, the number of test cases
\$T\$ lines follows each containing the number \$N\$.
Constraints:
\$ 1 \le T \le 500\$
\$ 0 \le N \le 50000\$
To solve this I took the following approach: I first calculated the primes up to 50000 using a sieve, then for each prime \$p\$ computed the power of \$p\$ in \$N!\$ given by:
$$ \mathrm{power}(p) = \frac{n}{p} + \frac{n}{p^2} + \frac{n}{p^3} + \ldots $$
… until I got a \$p^\mathrm{power}\$ greater than \$N\$.
Finally I computed the answer for each prime \$p\$:
$$ (\mathrm{power}(p_1)+1) \cdot (\mathrm{power}(p_2)+1) \cdot \ldots \cdot (\mathrm{power}(p_k)+1)$$
assuming there are \$k\$ prime numbers less than 50000.
def prime_numbers(limit=50001):
'''Prime number generator. Yields the series
2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ...
using Sieve of Eratosthenes.
'''
yield 2
sub_limit = int(limit**0.5)
flags = [True, True] + [False] * (limit - 2)
# Step through all the odd numbers
for i in range(3, limit, 2):
if flags[i]:
continue
yield i
# Exclude further multiples of the current prime number
if i =num**exp:
s+=int(n/(num**exp))
exp+=1
pr*=(s+1)
# d.append(s)
else:
break
print pr%1000000007but this code and algo is not fast enough to pass the test cases in the given time limit.Is there any way to improve this code to make it faster
Solution
Things should be fine if you perform strength reduction on
In the kth iteration of the inner loop you have a quotient
To obtain
Here's the complete loop, coded in a more pedestrian language:
It isn't very fast - it clocked 0.12 s on SPOJ - but it should show the principle well enough and it can serve as the basis of a more performant solution.
Note: detailed investigation shows that for once, the lackluster performance of my C# submission on SPOJ is not due to usual problems inherent in .Net (e.g. poor input/output performance) but rather to some strange problem with Mono. On my laptop a worst-case input file clocks 13.7 ms under .Net and 136.4 ms under Mono.
Here is an attempt at rending the loop in Python (renaming some things for clarity):
One thing worth looking into is the underlying data type of the divisor count (
Reducing the division count modulo 1000000007 after every update by changing
to
improved the time for a worst-case input file from 1.92 s to 1.39 s on my box, so it's something worth keeping in mind. For comparison: the original code clocked 2.64 s for this file.
Belated addition: a few days after the fact an idea hit me out of the blue while I was having a smoke, and I'm reporting it here because it can speed up things by a factor of 20 for the SPOJ limit of 50000 (much more for higher inputs). This idea is based on the observation that half of the primes p satisfy
and hence
which is just a coy way of saying
Knowing that there are
(1 + 1)**k
This readily extends to other small quotients in the remaining range of primes, and the fun only stops when the primes get so small that their squares appear as divisors (i.e.
Here's how you explain this to your computer in the pythonic manner:
Here are the timings for the usual 500 worst-case inputs, first for the earlier version (here called 'v5b') and then for this new version. I went beyond the SPOJ limit of 50000 to give some idea of the growth behaviour.
Beyond 100000 or thereabouts the code gets faster with a hand-coded power
n / num**exp. That not only makes things faster, it also gets rid of the messiness inherent in Python's exponentiation operator (so that you don't have to find out how it actually works under the hood).In the kth iteration of the inner loop you have a quotient
q_k that is equal to n / pk (i.e. n / numexp in your code). To obtain
n / p(k+1) in the next iteration, all you have to do is q_k / p because that makes it (n / pk) / p, which is equal to n / p**(k+1) as desired. In other words, instead of computing the power of the prime anew and then performing a division, simply divide the old quotient by the prime again.Here's the complete loop, coded in a more pedestrian language:
static ushort[] primes_up_to_LIMIT = U16.SmallPrimesUpTo(50000).ToArray();
...
long divisor_count = 1;
foreach (int prime in primes)
{
if (prime > n)
break;
int e = 0;
for (int q = n; q >= prime; )
e += q /= prime;
divisor_count = (divisor_count * (e + 1)) % 1000000007;
}
return (int)divisor_count;
It isn't very fast - it clocked 0.12 s on SPOJ - but it should show the principle well enough and it can serve as the basis of a more performant solution.
Note: detailed investigation shows that for once, the lackluster performance of my C# submission on SPOJ is not due to usual problems inherent in .Net (e.g. poor input/output performance) but rather to some strange problem with Mono. On my laptop a worst-case input file clocks 13.7 ms under .Net and 136.4 ms under Mono.
Here is an attempt at rending the loop in Python (renaming some things for clarity):
divisor_count = 1
for current_prime in primes:
if current_prime > n:
break
q = n
e = 0
while q >= current_prime:
q /= current_prime
e += q
divisor_count *= e + 1
print divisor_count % 1000000007One thing worth looking into is the underlying data type of the divisor count (
pr in the original code). This can grow quickly, far beyond anything representable in a plain integer, and hence it will go big integer internally. This makes the multiplication pr = s + 1 more expensive, and the statement has a cost multiplier of max(T) pi(max(N)) = 500 * 5133. Reducing the division count modulo 1000000007 after every update by changing
divisor_count *= e + 1to
divisor_count = (divisor_count * (e + 1)) % 1000000007improved the time for a worst-case input file from 1.92 s to 1.39 s on my box, so it's something worth keeping in mind. For comparison: the original code clocked 2.64 s for this file.
Belated addition: a few days after the fact an idea hit me out of the blue while I was having a smoke, and I'm reporting it here because it can speed up things by a factor of 20 for the SPOJ limit of 50000 (much more for higher inputs). This idea is based on the observation that half of the primes p satisfy
(n / 2) < p <= nand hence
1 <= (n / p) < 2which is just a coy way of saying
n / p = 1Knowing that there are
k primes that satisfy this condition (bisect_right() to the rescue) I can compute the contribution for all these primes in one fell swoop as(1 + 1)**k
This readily extends to other small quotients in the remaining range of primes, and the fun only stops when the primes get so small that their squares appear as divisors (i.e.
p <= sqrt(n)).Here's how you explain this to your computer in the pythonic manner:
def v6b (n):
"""number of divisors of n! (SPOJ DIVFACT)"""
divisor_count = 1
hi = bisect.bisect_right(primes, n)
for m in range(2, int(math.sqrt(n))):
lo = bisect.bisect_right(primes, n / m, 0, hi)
k = hi - lo
hi = lo
divisor_count = (divisor_count * m**k) % 1000000007
if k = current_prime:
q /= current_prime
e += q
divisor_count *= e + 1
return divisor_count % 1000000007Here are the timings for the usual 500 worst-case inputs, first for the earlier version (here called 'v5b') and then for this new version. I went beyond the SPOJ limit of 50000 to give some idea of the growth behaviour.
# timing v5b ...
500 @ 500: 0.016 // cksum 230711749546
500 @ 1000: 0.016 // cksum 256088767231
500 @ 5000: 0.078 // cksum 248659270107
500 @ 10000: 0.156 // cksum 255839576411
500 @ 50000: 1.281 // cksum 252317442043
500 @ 100000: 3.313 // cksum 246710058695
500 @ 500000: 44.376 // cksum 244700225412
# timing v6b ...
500 @ 500: 0.000 // cksum 230711749546
500 @ 1000: 0.015 // cksum 256088767231
500 @ 5000: 0.016 // cksum 248659270107
500 @ 10000: 0.031 // cksum 255839576411
500 @ 50000: 0.063 // cksum 252317442043
500 @ 100000: 0.078 // cksum 246710058695
500 @ 500000: 0.266 // cksum 244700225412Beyond 100000 or thereabouts the code gets faster with a hand-coded power
Code Snippets
divisor_count = 1
for current_prime in primes:
if current_prime > n:
break
q = n
e = 0
while q >= current_prime:
q /= current_prime
e += q
divisor_count *= e + 1
print divisor_count % 1000000007divisor_count *= e + 1divisor_count = (divisor_count * (e + 1)) % 1000000007(n / 2) < p <= n1 <= (n / p) < 2Context
StackExchange Code Review Q#129974, answer score: 5
Revisions (0)
No revisions yet.