patternpythonMinorCanonical
Project Euler #14: Longest Collatz sequence
Viewed 0 times
sequenceprojecteulerlongestcollatz
Problem
Problem : source
Which starting number, under one million, produces the longest chain?
n → n/2 (n is even)
n → 3n + 1 (n is odd)
I want to receive advice on my code.
It is too slow... It takes 514 seconds
Which starting number, under one million, produces the longest chain?
n → n/2 (n is even)
n → 3n + 1 (n is odd)
I want to receive advice on my code.
It is too slow... It takes 514 seconds
import math
from functools import reduce
import time
start_time = time.time()
target = int(math.pow(10,6))
first_list = list(map(lambda x : x , range(1,target+1)))
def chain_operate(given_list):
given_list = [0 if x == 1 else x for x in given_list]
new_list = list(map(lambda x : x/2 if x%2==0 else 3*x+1 , given_list))
new_list = [0 if x == 1 else x for x in new_list]
# replace 1 to 0 because of below condition : while max(new_list) != reduce(lambda x,y : x+y, new_list)
while max(new_list) != reduce(lambda x,y : x+y, new_list): # when the list comes to (0,0,0,4,0,0), break
new_list = chain_operate(new_list)[0] #recursive method
return (new_list,new_list.index(max(new_list)))
print(chain_operate(first_list)[1]+1)
print("--- %s seconds ---" % (time.time() - start_time))Solution
Your algorithm involves the following:
a. Make a new list of size 1,000,000
b. Make another new list of size 1,000,000
c. Make another new list of size 1,000,000
d. Iterate over it twice
e. And then another time
That strikes me as extremely expensive to start out. Furthermore, consider all the extra work we're doing. The sequence in the problem example is:
You are calculating all 10 of those chains independently. That means you're calculating the chain for
Just looking at some runtimes for small values of
A few code comments before I get into a simpler algorithm. Don't use
Can be replaced by:
The latter could be wrapped further to avoid the extra list comprehension just to move your
Could be replaced by:
Better Algorithm
First, let's start with our handy
And then start separating our concerns. We will write a function that for ONE value of
And then we just need to wrap all of that:
It's still linear time, but I end up doing FAR less work than you have to, since most of the time I'm doing almost nothing. And the numbers show it:
- Make a list of size 1,000,000
- Until done:
a. Make a new list of size 1,000,000
b. Make another new list of size 1,000,000
c. Make another new list of size 1,000,000
d. Iterate over it twice
e. And then another time
That strikes me as extremely expensive to start out. Furthermore, consider all the extra work we're doing. The sequence in the problem example is:
13 --> 40 --> 20 --> 10 --> 5 --> 16 --> 8 --> 4 --> 2 --> 1You are calculating all 10 of those chains independently. That means you're calculating the chain for
16 six times. That's an enormous amount of extra work. The actual longest chain is of length 525. That means that for the actual answer, we are doing 524 more iterations than necessary, and we end up having to create over 1500 lists of size one million. Even if most of the time, most of our elements will be 0! We can do way better by just memoizing, which also takes less memory to boot! Just looking at some runtimes for small values of
target (10x each via timeit):target OP
100 0.09s
1000 1.29s
10000 19.07sA few code comments before I get into a simpler algorithm. Don't use
map(). It's far more complicated than a list comprehension. These two:first_list = list(map(lambda x : x , range(1,target+1)))
new_list = list(map(lambda x : x/2 if x%2==0 else 3*x+1 , given_list))Can be replaced by:
first_list = range(1, target+1)
new_list = [x/2 if x%2 == 0 else 3*x+1 for x in given_list]The latter could be wrapped further to avoid the extra list comprehension just to move your
1s to 0s but it's overkill anyway. Similarly, your:reduce(lambda x,y : x+y, new_list)Could be replaced by:
sum(new_list)Better Algorithm
First, let's start with our handy
memoize decorator:def memoize(f):
cache = {}
def wrapper(*args):
if not args in cache:
cache[args] = f(*args)
return cache[args]
return wrapperAnd then start separating our concerns. We will write a function that for ONE value of
n finds the length of a chain:@memoize
def collatz_chain(n):
if n == 1:
return 1
elif n % 2 == 0:
return 1 + collatz_chain(n/2)
else:
return 1 + collatz_chain(3*n+1)And then we just need to wrap all of that:
return max(range(1, target+1), key=collatz_chain)It's still linear time, but I end up doing FAR less work than you have to, since most of the time I'm doing almost nothing. And the numbers show it:
target OP memoize
100 0.09s 0.003s ~30x
1000 1.29s 0.024s ~54x
10000 19.07s 0.239s ~80xCode Snippets
13 --> 40 --> 20 --> 10 --> 5 --> 16 --> 8 --> 4 --> 2 --> 1target OP
100 0.09s
1000 1.29s
10000 19.07sfirst_list = list(map(lambda x : x , range(1,target+1)))
new_list = list(map(lambda x : x/2 if x%2==0 else 3*x+1 , given_list))first_list = range(1, target+1)
new_list = [x/2 if x%2 == 0 else 3*x+1 for x in given_list]reduce(lambda x,y : x+y, new_list)Context
StackExchange Code Review Q#105375, answer score: 6
Revisions (0)
No revisions yet.