HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Summation of primes for large inputs

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
inputsprimeslargeforsummation

Problem

I'm doing a problem from Project Euler to find the sum of all primes under two million.

My code manages to get a pretty good result for small inputs, but when trying inputs like million or two it just won't finish.

import time
import math

def primes_sum(limit):
    num_list = range(2, limit)
    i, counter = 2, 1
    while i <= math.sqrt(limit):
        num_list = num_list[:counter] + filter(lambda x: x%i!=0, num_list[counter:])
        i = num_list[counter]
        counter +=1
    print sum(num_list)

def main():
    start_time = time.time()
    primes_sum(2000000)
    print 'Rum time: '+str(time.time()-start_time)

if __name__ == "__main__":
    main()


The error for 2 million:

Internal error: TypeError: FUNCTION_TABLE[HEAP[(HEAP[(c + 4)] + 28)]] is not a function

Solution

Your primes_sum function seems to be inspired by the sieve of Erathostenes, but there are a number of things in your implementation that don't work out too well:

  • Do not remove items from you num_list. That's an expensive operation which basically requires rewriting the whole list. Plus, it makes the following point impossible.



  • If you have a list of consecutive integers, and you know the position i of a number n, you do not need to do modulo operations to figure out its multiples. Instead, repeatedly add n to i. Addition is much cheaper than modulo.



  • You don't actually need to have a list of numbers: instead keep a list of boolean values, with their position in the list indicating the number they represent. Ideally this would be a bit array, for maximal memory saving. But a list of Python booleans will save you 4 or 8 bytes per entry already.



A very straightforward implementation of a sieve would look like:

def sum_primes(n):
    sieve = [False, False] + [True] * (n - 1)  # 0 and 1 not prime
    for number, is_prime in enumerate(sieve):
        if not is_prime:
            continue
        if number * number > n:
            break
        for not_a_prime in range(number * number, n+1, number): #If you use range(__, n, __), the function leaves the last Boolean in the sieve list as True, whether or not it is a prime. If you change it to range(__, n+1, __), this problem is taken care of. 
            sieve[not_a_prime] = False
    return sum(num for num, is_prime in enumerate(sieve) if is_prime)


On my system this spits the answer for 2 10^6 in under a second, manages to get 2 10^7 in a few, and can even produce 2 * 10^8 in well under a minute.

Code Snippets

def sum_primes(n):
    sieve = [False, False] + [True] * (n - 1)  # 0 and 1 not prime
    for number, is_prime in enumerate(sieve):
        if not is_prime:
            continue
        if number * number > n:
            break
        for not_a_prime in range(number * number, n+1, number): #If you use range(__, n, __), the function leaves the last Boolean in the sieve list as True, whether or not it is a prime. If you change it to range(__, n+1, __), this problem is taken care of. 
            sieve[not_a_prime] = False
    return sum(num for num, is_prime in enumerate(sieve) if is_prime)

Context

StackExchange Code Review Q#102291, answer score: 8

Revisions (0)

No revisions yet.