Optimizing Project Euler Solution #12
Problem
Just to reiterate my previous post, as I think it helps structure my weaknesses:

- Coding methodology: Is the algorithm I used feasible and mostly correct? If not, how do I correct it, and why won't it work or scale? I realize in some cases there will always be optimal solutions that are over my head mathematically, but in my experience you can generally reach a solution within a reasonable range of optimal without knowing a ton of number theory. Where possible, I hope to learn algorithms along the way (such as the Sieve of Eratosthenes for prime ranges).
- Coding execution: Given some pseudocode that we determine will work algorithmically, is the code written without redundancies and with minimal runtime?
- Being more pythonic: Is something I'm doing unpythonic, or can it be done more simply with some sort of library I don't know about? I believe this is mostly semantic, though in some cases it can have large effects on performance (for instance, using map() instead of a for loop to apply int()).

I submitted the following code as a solution to Project Euler's #12. It's extremely slow, running in a mean of 7.13 seconds over 100 trials. (I think that's slow; I'm just using other submitted solutions as a benchmark, and it seems solutions under 1 second are generally considered decent.)
def factors(n):
    return set(reduce(list.__add__,
                      ([i, n // i] for i in range(1, int(n**0.5) + 1)
                       if n % i == 0)))

def triangle_number(n):
    for i in range(n):
        n += i
    return n

n = 1
while True:
    if len(factors(triangle_number(n))) >= 500:
        print n, triangle_number(n)
        break
    else:
        n += 1

The code for the factors function was borrowed from here.

Solution
This is an addition to Caridorc's answer.
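As a baseline (my addition, not part of the original answer), here is the question's code ported to Python 3, where reduce must be imported from functools; triangle_number is simplified to the closed form, and the divisor threshold is parameterized (first_triangle_with_divisors is a hypothetical wrapper) so the sketch runs instantly:

```python
from functools import reduce  # reduce is not a builtin in Python 3

def factors(n):
    # Pair each divisor i <= sqrt(n) with its cofactor n // i.
    return set(reduce(list.__add__,
                      ([i, n // i] for i in range(1, int(n**0.5) + 1)
                       if n % i == 0)))

def triangle_number(n):
    # n-th triangle number: 1 + 2 + ... + n = n * (n + 1) / 2.
    return n * (n + 1) // 2

def first_triangle_with_divisors(threshold):
    # Hypothetical wrapper around the question's while-loop.
    n = 1
    while len(factors(triangle_number(n))) < threshold:
        n += 1
    return triangle_number(n)
```

With a threshold of 6 this returns 28, matching the worked example in the Project Euler problem statement (the first triangle number with over five divisors).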
The usual reason for using itertools is that you don't want to construct lists when you don't need to. For example, in Python 2, you'd write for i in xrange(100) instead of for i in range(100) so that you don't have to construct a whole list of 100 numbers when you don't need it. If this doesn't make sense to you, familiarize yourself with the difference between generator expressions (genexprs) and lists.

In your case, however, there's a much more serious problem. You've got a Shlemiel the painter's algorithm!
When you write reduce(list.__add__, list_of_lists), you're essentially writing list_1 + ... + list_n. Python will evaluate this as follows:

- Take list_1 and list_2. Create a new list object of the proper size, and copy the elements from lists 1 and 2 into this new list. Call it intermediate_1.
- Take intermediate_1 and list_3. Create a new list of the proper size, and copy the elements from these two lists into the new list. Call it intermediate_2.
- …
- Take intermediate_(n-1) and list_n. Create a new list of the proper size, and copy the elements from these two lists into the new list. This is the final result.
Do you see the problem? This code is quadratic in both runtime and memory usage.
Here's some hard data. Consider the following function:
def fn(n):
    lists = [[0] * 10000 for i in xrange(n)]
    combined = reduce(list.__add__, lists)

I ran this for values of n of 1, 10, 100, 200, 250, 300, 400, and 500 (using IPython's %timeit fn(100), which does proper timing) and got the following results (timing plot omitted).

This definitely looks quadratic, and the coefficient of determination is R² = 0.99655, which indicates an extremely strong correlation.
When we instead use

def fast(n):
    combined = list(itertools.chain.from_iterable(
        [0] * 10000 for i in xrange(n)))

the time for fast(500) is just 65.8 ms instead of 7.42 s. (This shows that it is indeed the Shlemiel factor, and not just large lists, that's causing the slowdown.)

So, tl;dr: yes, use iterators, but make sure you know why!
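A toy-sized sketch (my addition, written for Python 3) confirming that the itertools version produces exactly the same flattened list as the quadratic reduce version, just without the repeated copying:

```python
from functools import reduce
import itertools

# Five small lists instead of 500 lists of 10000 elements.
lists = [[i] * 3 for i in range(5)]

slow = reduce(list.__add__, lists)                  # builds a fresh list per step
fast = list(itertools.chain.from_iterable(lists))   # one pass, one final list

assert slow == fast == [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```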
So when should you use reduce?

Just a review of what reduce does for binary operators:

reduce(int.__add__, [x1, x2, x3], x0) = (((x0 + x1) + x2) + x3)

This is extended to general functions as follows:

reduce(f, [x1, x2, x3], x0) = f(f(f(x0, x1), x2), x3)

You should use reduce when you want to write code that looks like either of the above examples. Basically, reduce just saves you a loop. Conceptually, you use reduce when you want to collapse a bunch of things into one value.
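These identities are easy to verify directly (a sketch of mine in Python 3, where reduce lives in functools):

```python
from functools import reduce

x0, xs = 100, [1, 2, 3]

# reduce(f, [x1, x2, x3], x0) == f(f(f(x0, x1), x2), x3)
assert reduce(lambda a, b: a + b, xs, x0) == ((x0 + 1) + 2) + 3 == 106

# "reduce just saves you a loop": the equivalent hand-written version.
acc = x0
for x in xs:
    acc = acc + x
assert acc == 106
```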
It may help to think of reduce in comparison with the other core functional operations:

- Use map when you want to transform a bunch of values into other things (like "reverse all these strings"). You use map in list comprehensions when you write [s[::-1] for s in strings].
- Use filter when you want to take a bunch of values and keep just some of them (like "take just the strings that are palindromes"). You use filter in list comprehensions when you write [s for s in strings if s == s[::-1]].
- Use reduce when you want to collapse a bunch of things into one value (like "concatenate all these strings"). You can't use reduce in list comprehensions, because you're not creating a new list (just a value).
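The three bullets above can be checked side by side (my sketch, Python 3, reusing the palindrome example):

```python
from functools import reduce

strings = ["level", "python", "radar", "euler"]

# map: transform each value ("reverse all these strings").
assert list(map(lambda s: s[::-1], strings)) == [s[::-1] for s in strings]

# filter: keep just some values ("take just the palindromes").
assert list(filter(lambda s: s == s[::-1], strings)) \
       == [s for s in strings if s == s[::-1]] \
       == ["level", "radar"]

# reduce: collapse everything into one value ("concatenate all these strings").
assert reduce(lambda a, b: a + b, strings) == "levelpythonradareuler"
```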
Notes:

- The problem in your case is that the version with list_1 + ... + list_n is just as bad in terms of efficiency. reduce wasn't your problem; it was what reduce expanded to.
- With reduce, you can omit the third argument x0; reduce will then use the first element of the sequence as the seed, or raise a TypeError if the sequence is empty.
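The second note in action (my sketch, Python 3):

```python
from functools import reduce

# With an initializer, an empty sequence is fine: you just get x0 back.
assert reduce(lambda a, b: a + b, [], 42) == 42

# Without one, the first element is used as the seed...
assert reduce(lambda a, b: a + b, [1, 2, 3]) == 6

# ...so an empty sequence has nothing to seed with, and raises TypeError.
try:
    reduce(lambda a, b: a + b, [])
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError")
```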
Before noting some examples, it may be helpful to realize that some operations from __builtins__ are just reduce under the hood. For example:

- sum(ints) = reduce(int.__add__, ints, 0);
- all(bools) = reduce(lambda x, y: x and y, bools, True), and similarly
- any(bools) = reduce(lambda x, y: x or y, bools, False);
- min(ints) is like reduce(lambda x, y: x if x < y else y, ints);
- max(ints) is like reduce(lambda x, y: x if x > y else y, ints).
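All five equivalences can be confirmed on sample data (my sketch, Python 3):

```python
from functools import reduce

ints = [3, 1, 4, 1, 5]
bools = [True, True, False]

assert sum(ints) == reduce(int.__add__, ints, 0) == 14
assert all(bools) == reduce(lambda x, y: x and y, bools, True) == False
assert any(bools) == reduce(lambda x, y: x or y, bools, False) == True
assert min(ints) == reduce(lambda x, y: x if x < y else y, ints) == 1
assert max(ints) == reduce(lambda x, y: x if x > y else y, ints) == 5
```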
A few quick examples:

def factorial(n):
    return reduce(int.__mul__, xrange(1, n + 1), 1)

def replay_chess_game(moves):
    return reduce(lambda board, move: board.then(move),
                  moves,
                  INITIAL_BOARD)

def perform_git_rebase(commits):
    return reduce(lambda head, commit: head.cherry_pick(commit),
                  commits,
                  git_globals.refs.HEAD)

I don't use reduce all that much in Python, but you should definitely know when to use it.
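Of these, only factorial is runnable as-is (the chess and git examples assume hypothetical board.then and head.cherry_pick APIs). A Python 3 port of it, as a quick sanity check of my own:

```python
from functools import reduce

def factorial(n):
    # Python 3: range instead of xrange; reduce imported from functools.
    return reduce(int.__mul__, range(1, n + 1), 1)

assert factorial(5) == 120
assert factorial(0) == 1   # empty range falls back to the initializer 1
```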
Context
StackExchange Code Review Q#105202, answer score: 10