HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Optimise Sieve of Eratosthenes

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
optimisesieveeratosthenes

Problem

I have written the following code for finding out the prime numbers less than or equal to n. When I enter the input n = 10^7, my program crashes. So I was wondering whether my code can be further optimised.

def sieve(n):
    array = []
    for i in range(1, n):
        array.append(True)
    for i in range(0, n-1):
        for j in range(2*i+4, n+1, i+2):
            if array[j-2] != False:
                if j % (i+2) == 0:
                    array[j-2] = False
    final = []
    for i in range(0, n-1):
        if array[i] == True:
            final.append(i+2)
    return final

Solution

First, I would clean up the indices in the inner loop. I think the +2 and -2 are a bit confusing at the first glance.

def sieve(n):
    array = []
    for i in range(0, n + 1):
        array.append(True)
    for i in range(2, n + 1):
        for j in range(2*i, n+1, i):
            if array[j] != False:
                if j % (i) == 0:
                    array[j] = False
    final = []
    for i in range(0, n+1):
        if array[i] == True:
            final.append(i)
    return final


This way, array indices directly correspond to their number, no conversion needed. This wastes array space for 0 and 1, but I think it is a lot easier to understand.

The primary optimization that is possible adjusts the ranges of the loops. Each non-prime
<= n
has a divisor <= sqrt(n) and thus we can limit the outer loop to range(2, math.sqrt(100) + 1)

Additionally we don't actually need to run the inner loop for numbers which are non-prime (since each non-prime has prime divisors) and we can add an if array[i] == True: to reduce the number of inner loops further.

The range of the inner loop can also be reduced. It actually can start at i^2 instead of 2*i since the argument made earlier also applies here. All non-primes smaller than i^2 must have an divisor < i and thus were already set to false in an earlier iteration of the outer loop.

If we apply these changes, we get the following code:

def faster_sieve(n):
    array = []
    for i in range(0, n + 1):
        array.append(True)

    for i in range(2, int(math.sqrt(n)) + 1):
        if array[i] == True:
            for j in range(i*i, n + 1, i):
                if array[j] != False:
                    if j % i == 0:
                        array[j] = False
    final = []
    for i in range(2, n + 1):
        if array[i] == True:
            final.append(i)
    return final


For comparison we can run

faster_sieve(int(math.pow(10, 7)))


and

sieve(int(math.pow(10, 7)))


On my machine:

~ time python faster_sieve.py
python faster_sieve.py  5.44s user 0.04s system 99% cpu 5.477 total

~ time python sieve.py 
python sieve.py  32.97s user 0.03s system 99% cpu 33.003 total


Which is a lot faster!

Now we could to some python microoptimizations, but I don't know anything about python. A first step could be to replace the append-loop with something faster like array = [True] * (n + 1), which saves another second on my machine

~ time python faster_sieve.py
python faster_sieve.py  4.46s user 0.02s system 99% cpu 4.475 total


So, yes indeed, your code could be further optimized.

/e
I might add that there are already good reviews of python code for the sieve on this site. For example this which applies a lot of python optimization.

/e2

Looking at the code again and looking at Wikipedia, the checks inside the inner loop are actually not needed since j is per loop definition a multiple of i and we can simplify it to

for j in range(i*i, n + 1, i):
    array[j] = False


which optimizes the program further:

~ time python faster_sieve.py
python faster_sieve.py  2.61s user 0.01s system 99% cpu 2.617 total


/e3 After reading jerry's response and finding two bugs in his implementation, I think there is another easy optimization that can be added to his approach. Since i is always odd in his inner loop, so is i i. Thus we can increase the increment step of the inner loop from i to i 2 (i * i + i would be even). This result in the following results:

Simon: 3.5496606826782227
Jerry (broken): 1.7031567096710205
Josay: 2.2623558044433594
Krypton: 5.4344189167022705
Krypton2: 2.5819575786590576
Jerry2: 1.396036148071289


Krypton2 and Jerry2 being:

def krypton2(n):
    array = [True] * (n + 1)

    for i in range(2, int(math.sqrt(n)) + 1):
        if array[i] == True:
            for j in range(i*i, n + 1, i):
                array[j] = False
    final = []
    for i in range(2, n + 1):
        if array[i] == True:
            final.append(i)
    return final

def jerry2(n):
    array = [True] * n

    result = []

    result.append(2)
    for i in range(4, n, 2):
        array[i] = False;

    for i in range(3, int(math.sqrt(n) + 1), 2):
        if array[i]:
            for j in range (i*i, n, i * 2):
                array[j] = False;

    for i in range(3, n, 2):
        if array[i]:
            result.append(i)
    return result

Code Snippets

def sieve(n):
    array = []
    for i in range(0, n + 1):
        array.append(True)
    for i in range(2, n + 1):
        for j in range(2*i, n+1, i):
            if array[j] != False:
                if j % (i) == 0:
                    array[j] = False
    final = []
    for i in range(0, n+1):
        if array[i] == True:
            final.append(i)
    return final
def faster_sieve(n):
    array = []
    for i in range(0, n + 1):
        array.append(True)

    for i in range(2, int(math.sqrt(n)) + 1):
        if array[i] == True:
            for j in range(i*i, n + 1, i):
                if array[j] != False:
                    if j % i == 0:
                        array[j] = False
    final = []
    for i in range(2, n + 1):
        if array[i] == True:
            final.append(i)
    return final
faster_sieve(int(math.pow(10, 7)))
sieve(int(math.pow(10, 7)))
~ time python faster_sieve.py
python faster_sieve.py  5.44s user 0.04s system 99% cpu 5.477 total

~ time python sieve.py 
python sieve.py  32.97s user 0.03s system 99% cpu 33.003 total

Context

StackExchange Code Review Q#143315, answer score: 9

Revisions (0)

No revisions yet.