patternpythonMinor
Performance in Hackerrank challenge "Sherlock and Queries"
Viewed 0 times
challengeandsherlockperformancehackerrankqueries
Problem
I translated the solution by the author to his challenge from C++ to Python.
It seems to be correct, but I get timeouts for the test cases 6, 9 and 12.
Here is the full challenge description:
Watson gives to Sherlock an array: \$A_1, A_2, ..., A_N\$. He also gives to
Sherlock two other arrays: \$B_1, B_2, ..., B_M\$ and \$C_1, C_2, ..., C_M\$.
Then Watson asks Sherlock to perform the following program:
Can you help Sherlock and tell him the resulting array
print all the array elements modulo \$(10^9 + 7)\$.
Input Format
The first line contains two integer numbers \$N\$ and \$M\$. The next line
contains \$N\$ integers, the elements of array
contains \$M\$ integers each, the elements of array
Output Format
Print \$N\$ integers, the elements of array
modulo \$(10^9 + 7)\$.
Constraints
\$1 ≤ N, M ≤ 10^5\$
\$1 ≤ B[i] ≤ N\$
\$1 ≤ A[i], C[i] ≤ 10^5\$
Sample Input
Sample Output
And here is my code:
Any idea on how I could increase the performance of my solution?
It seems to be correct, but I get timeouts for the test cases 6, 9 and 12.
Here is the full challenge description:
Watson gives to Sherlock an array: \$A_1, A_2, ..., A_N\$. He also gives to
Sherlock two other arrays: \$B_1, B_2, ..., B_M\$ and \$C_1, C_2, ..., C_M\$.
Then Watson asks Sherlock to perform the following program:
for i = 1 to M do
for j = 1 to N do
if j % B[i] == 0 then
A[j] = A[j] * C[i]
endif
end do
end doCan you help Sherlock and tell him the resulting array
A? You shouldprint all the array elements modulo \$(10^9 + 7)\$.
Input Format
The first line contains two integer numbers \$N\$ and \$M\$. The next line
contains \$N\$ integers, the elements of array
A. The next two linescontains \$M\$ integers each, the elements of array
B and C.Output Format
Print \$N\$ integers, the elements of array
A after performing the programmodulo \$(10^9 + 7)\$.
Constraints
\$1 ≤ N, M ≤ 10^5\$
\$1 ≤ B[i] ≤ N\$
\$1 ≤ A[i], C[i] ≤ 10^5\$
Sample Input
4 3
1 2 3 4
1 2 3
13 29 71Sample Output
13 754 2769 1508And here is my code:
import fileinput
import collections
lines = [[int(x) for x in line.split()] for line in fileinput.input()]
[n, m], a, b, c = lines
def one():
return 1
factors = collections.defaultdict(one)
for i in range(0, m):
factors[b[i]] *= c[i] % 1000000007
for i, factor in factors.iteritems():
for j in range(1, n/i + 1):
idx = j*i-1
a[idx] *= factor
a[idx] %= 1000000007;
print ' '.join(map(str, a))Any idea on how I could increase the performance of my solution?
Solution
This loop can be tuned in some ways:
Modified:
The
As noted in comments, the above changes do not make much difference. However, I now found something that actually matters. There is a subtle bug in the translation from C++ to Python here:
Written like this the modulo operator applies to
for i, factor in factors.iteritems():
for j in range(1, n/i + 1):
idx = j*i-1
a[idx] *= factor
a[idx] %= 1000000007;- Use the third argument of
range(step size).
- Use
xrangein Python 2.
- The in-place operators may actually slow you down vs. a single expression.
Modified:
for i, factor in factors.iteritems():
for idx in xrange(i-1, n, i):
a[idx] = a[idx] * factor % 1000000007The
fileinput module offers convenience for reading multiple files, but adds some overhead. If you are reading from standard input, use for line in sys.stdin directly. As noted in comments, the above changes do not make much difference. However, I now found something that actually matters. There is a subtle bug in the translation from C++ to Python here:
factors[b[i]] *= c[i] % 1000000007Written like this the modulo operator applies to
c[i] while it is supposed to apply to the result of the multiplication. The programs still gives the correct result because you apply the modulo in the later loop, but in the meanwhile Python has to multiply much larger numbers. I get a 100-fold speed increase by changing the line tofactors[b[i]] = factors[b[i]] * c[i] % 1000000007Code Snippets
for i, factor in factors.iteritems():
for j in range(1, n/i + 1):
idx = j*i-1
a[idx] *= factor
a[idx] %= 1000000007;for i, factor in factors.iteritems():
for idx in xrange(i-1, n, i):
a[idx] = a[idx] * factor % 1000000007factors[b[i]] *= c[i] % 1000000007factors[b[i]] = factors[b[i]] * c[i] % 1000000007Context
StackExchange Code Review Q#62956, answer score: 3
Revisions (0)
No revisions yet.