HiveBrain v1.2.0
Get Started
← Back to all entries
gotchapythonMinor

Why does my implementation of the Sieve of Eratosthenes take so long?

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
whythetakelongsievedoesimplementationeratosthenes

Problem

def sieve_of_erastothenes(number):
    """Constructs all primes below the number"""
    big_list = [2] + range(3, number+1, 2)  # ignore even numbers except 2
    i = 1  # start at n=3, we've already done the even numbers
    while True:
        try:
            # use the next number that has not been removed already
            n = big_list[i]
        except IndexError:  # if we've reached the end of big_list we're done here
            return big_list
        # keep only numbers that are smaller than n or not multiples of n
        big_list = [m for m in big_list if m<=n or m%n != 0]
        i += 1


On my computer, this code takes about half a second to calculate all primes below 10,000, fifteen seconds to do 100,000, and a very long time to calculate primes up to one million. I'm pretty sure it's correct (the first few numbers in the returned sequence are right), so what can I do to optimise it?

I've tried a version that removes the relevant numbers in-place but that takes even longer (since it takes \$O(n)\$ to move all the remaining numbers down).

Solution

The problem is that the way this is set up, the time complexity is (close to) \$O(n**2)\$. For each remaining prime, you run over the entire list.

In the last case, where you want to check for prime numbers up to 1,000,000 you end up iterating over your list 78,498 ** 2 times. 6,161,936,004 iterations is a lot to process.

Below is an approach that crosses off the numbers instead, editing the large list of numbers in-place. This makes for a \$O(n)\$ solution, which times as such:

  • range 10,000 : 549 usec per loop



  • range 100,000 : 14.8 msec per loop



  • range 1,000,000: 269 msec per loop



  • range 10,000,000: 3.26 sec per loop



def sieve_of_erastothenes(limit):
    """Create a list of all numbers up to `number`, mark all non-primes 0,
    leave all the primes in place. Filter out non-primes in the last step.
    """
    sieve = range(limit + 1)

    # There are no prime factors larger than sqrt(number):
    test_max = int(math.ceil(limit ** 0.5))

    # We know 1 is not a prime factor
    sieve[1] = 0

    # The next is a cheat which allows us to not even consider even numbers
    # in the main loop of the program. You can leave this out and replace the
    # current xrange(3, test_max, 2) with xrange(2, test_max)

    # All multiples of 2 are not prime numbers, except for 2 itself:
    sieve[4::2] = (limit / 2 - 1) * [0]

    for prime in xrange(3, test_max, 2):
        if sieve[prime]:
            # If the number hasn't been marked a composite, it is prime.
            # For each prime number, we now mark its multiples as non-prime.
            multiples_count = limit / prime - (prime - 1)
            sieve[prime * prime::prime] = multiples_count * [0]
    # Filter out all the zeroes, we are left with only the primes
    return filter(None, sieve)

Code Snippets

def sieve_of_erastothenes(limit):
    """Create a list of all numbers up to `number`, mark all non-primes 0,
    leave all the primes in place. Filter out non-primes in the last step.
    """
    sieve = range(limit + 1)

    # There are no prime factors larger than sqrt(number):
    test_max = int(math.ceil(limit ** 0.5))

    # We know 1 is not a prime factor
    sieve[1] = 0

    # The next is a cheat which allows us to not even consider even numbers
    # in the main loop of the program. You can leave this out and replace the
    # current xrange(3, test_max, 2) with xrange(2, test_max)

    # All multiples of 2 are not prime numbers, except for 2 itself:
    sieve[4::2] = (limit / 2 - 1) * [0]

    for prime in xrange(3, test_max, 2):
        if sieve[prime]:
            # If the number hasn't been marked a composite, it is prime.
            # For each prime number, we now mark its multiples as non-prime.
            multiples_count = limit / prime - (prime - 1)
            sieve[prime * prime::prime] = multiples_count * [0]
    # Filter out all the zeroes, we are left with only the primes
    return filter(None, sieve)

Context

StackExchange Code Review Q#15777, answer score: 7

Revisions (0)

No revisions yet.