HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonCritical

Speed up solution to Project Euler problem 75

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
problemprojecteulersolutionspeed

Problem

I wrote this code for Project Euler problem 75, which asks how many integers ≤ 1500000 exist, where the integer is a perimeter length that can be divided into three integer-length sides of a right triangle in one unique way.

I was curious if anyone knew how to improve its speed. It runs fine, but I'm just looking to improve my own coding know-how.

from functools import reduce
import math

primes=set([2,3,5,7,11,13,17,19,23])
def isPrime(n):
    n=abs(n)
    if n in primes:
       return True
    if n<2:
       return False
    if n==2:
       return True
    if n%2==0:
       return False
    for x in range(3, int(n**0.5)+1, 2):
        if n % x == 0:
            return False
    primes.add(n)
    return True

def aFactors(n):
    if isPrime(n):
        return [1,n]
    return set(reduce(list.__add__,([i,n//i] for i in range(1,int(math.sqrt(n))+1) if n%i==0)))

count=0
number=12
while number<=1500000:
    p=number/2
    f=aFactors(p)
    triangles=[]
    pairs=[(i, int(p/i)) for i in f]
    add=triangles.append
    for i in pairs:
        mList=aFactors(i[0])
        pairsOfmc=[(k,int(i[0]/k)) for k in mList]
        for j in pairsOfmc:
            add((2*i[1]*i[0]-i[1]*i[1]*j[0],2*i[0]*i[1]-2*i[0]*j[1],2*i[0]*j[1]+i[1]*i[1]*j[0]-2*i[0]*i[1]))
    r=0
    while r<len(triangles):
        if any(triangles[r][i]<=0 for i in range(len(triangles[r]))):
            del triangles[r]
        else:
            l=list(triangles[r])
            l.sort()
            triangles[r]=tuple(l)
            r+=1
    trianglesFinal=list(set(triangles))
    for i in trianglesFinal:
        print(number, i)
    if len(trianglesFinal)==1:
        count+=1
    number+=2
print(count)


Please note that I am not looking for a different calculating method (I am sure there is one, but, for me, Project Euler is about finding your own methods. Using yours would, to me, be cheating). However, any faster functions, ways to combine blocks of code, simplified tests, or the like (such as no

Solution

For the record, this is Project Euler problem 75.

  1. Admonition



You write:


Please note that I am not looking for a different calculating method (I am sure there is one, but, for me, Project Euler is about finding your own methods. Using yours would, to me, be cheating.)

Abandon this attitude! A key part of programming — just as important as coding — is developing a wide repertoire of algorithms and techniques that you can apply to problems you face. Although for your own intellectual satisfaction it's great to discover algorithms on your own, you should always go on to compare your solution to the best solutions discovered by others, so that you can improve your repertoire for the next problem.

In particular, when you want to speed up a program, it's counterproductive to say, "I don't want to implement a different algorithm, I just want to speed up the code I wrote". The biggest speedups come from finding better algorithms.

You say that your program "runs fine" but that doesn't seem to be true. Project Euler says:


Each problem has been designed according to a "one-minute rule", which means that although it may take several hours to design a successful algorithm with more difficult problems, an efficient implementation will allow a solution to be obtained on a modestly powered computer in less than one minute.

I tried running your program, but after ten minutes it still had not produced an answer (and my laptop was running hot), so I killed it.

So in section 2 below I'll be using a smaller test size, 100,000 instead of 1,500,000, to keep the runtimes manageable.

  1. Piecewise optimization



-
Your program is hard to test from the interactive interpreter because it has code at the top level. Better to put the main program inside a function so that you can call it from interpreter and then add an if __name__ == '__main__': section so that you can run it as a script if you want to.

I've used the function name problem75, and it takes an argument, which is the largest value of the length of wire in the problem. Now:

>>> from timeit import timeit
>>> timeit(lambda:problem75(10**5),number=1)
12 (3, 4, 5)
24 (6, 8, 10)
30 (5, 12, 13)
[... much output deleted ...]
33.87288845295552


-
The print statement near the end of the main loop is unnecessary. This saves about 3 seconds, bringing the time down to 31.0 seconds.

-
Your function aFactors makes an initial call to isPrime, taking \$ O(\sqrt n) \$, in order to avoid a loop that also takes \$ O(\sqrt n) \$. The test costs as much as it saves, so it's not worth it. Remove the call to isPrime and rewrite the function like like this:

def factors(n):
    """Return the set of factors of n."""
    result = set([1, n])
    add = result.add
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            add(i)
            add(n // i)
    return result


Here I've picked a better name for the function, written a docstring explaining what it does, and cached the value of result.add so that it does not need to be looked up each time arounnd the loop. This saves about 4 seconds; time now 27.7 seconds.

-
Since the only thing you do with the set of factors is to turn it into pairs which you then iterate over, why not generate the pairs, like this:

def product_pairs(n):
    """Generate pairs (i, j) such that i * j = n."""
    yield 1, n
    yield n, 1
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            j = n // i
            yield i, j
            yield j, i


and then process them like this:

triangles = []
append = triangles.append
for i in product_pairs(number // 2):
    for j in product_pairs(i[0]):
        append((2*i[1]*i[0]-i[1]*i[1]*j[0],2*i[0]*i[1]-2*i[0]*j[1],2*i[0]*j[1]+i[1]*i[1]*j[0]-2*i[0]*i[1]))


This avoids having to construct an intermediate list in memory. This saves about 3 seconds; time now 24.3 seconds.

-
Instead of writing for i in product_pairs(...): and then looking up i[0] and i[1], split up the pair into two variables when you assign it in the loop, like this:

for i, j in product_pairs(number // 2):
    for k, l in product_pairs(i):
        append((2*j*i - j*j*k, 2*i*j - 2*i*l, 2*i*l + j*j*k - 2*i*j))


This avoids the sequence lookups. This saves about 3 seconds; time now 21.2 seconds.

-
You construct a list of triangles, some of which have sides with negative or zero length. You then go through the list and del the triangles which are invalid. But del on a list is a potentially expensive operation: it has to copy the remainder of the list to keep it contiguous. See the TimeComplexity page on the Python wiki for the cost of basic operations on Python's built-in data structures, where you can see that "delete item" operation on a list takes \$O(n)\$.

So don't add these invalid triangles to the list in the first place. In fact, don't bother with the list at all, just construct the set of triangles directly:

```
triangles =

Code Snippets

>>> from timeit import timeit
>>> timeit(lambda:problem75(10**5),number=1)
12 (3, 4, 5)
24 (6, 8, 10)
30 (5, 12, 13)
[... much output deleted ...]
33.87288845295552
def factors(n):
    """Return the set of factors of n."""
    result = set([1, n])
    add = result.add
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            add(i)
            add(n // i)
    return result
def product_pairs(n):
    """Generate pairs (i, j) such that i * j = n."""
    yield 1, n
    yield n, 1
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            j = n // i
            yield i, j
            yield j, i
triangles = []
append = triangles.append
for i in product_pairs(number // 2):
    for j in product_pairs(i[0]):
        append((2*i[1]*i[0]-i[1]*i[1]*j[0],2*i[0]*i[1]-2*i[0]*j[1],2*i[0]*j[1]+i[1]*i[1]*j[0]-2*i[0]*i[1]))
for i, j in product_pairs(number // 2):
    for k, l in product_pairs(i):
        append((2*j*i - j*j*k, 2*i*j - 2*i*l, 2*i*l + j*j*k - 2*i*j))

Context

StackExchange Code Review Q#25388, answer score: 52

Revisions (0)

No revisions yet.