HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Computing the n-th prime

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
primethecomputing

Problem

I've written some Python 3 code that computes the \$n\$-th prime. I implemented first a naive isprime function that looks for divisors of \$m\$ between \$2\$ and \$\lfloor \sqrt m \rfloor+1\$. Then a loop looks for primes and stops when the \$n\$th one is found.

from math import sqrt
def isprime(n):
    for i in range(2,int(sqrt(n))+1):
        if n%i==0:
            return False
    return True

def prime(n):
    m=3
    i=2
    ans=3
    while i<=n:
        if isprime(m):
            i=i+1

            ans=m
            m=m+2
        else:
            m=m+2
    return ans


It occured to me that prime performs a lot of unnecessary computations: for a given \$m\$, it checks if composite numbers (like 14,16) divide \$m\$. That is useless, and it would be more efficient to look only for prime divisors of \$m\$. This led me to some "storage" approach, where I maintain a list of all the primes I've found, and use them to test for divisors of the next numbers.

from math import sqrt
def prime(n):
    list=[2]
    i=1
    m=3
    while i<n:
        flag=0
        for p in list:
            if m%p==0:
                flag=1
                break
            else:
                continue
        if flag==0:

            list.append(m)
            m=m+2
            i=i+1
        else:
            m=m+2
    return list


The \$n\$th prime is given by prime(n)[-1]

I have an issue with the performance of the second code: it's really slow.
On my computer, according to the Unix command time python code.py, computing the \$6000\$-th prime with the first code takes \$0.231\$ seconds, and \$2.799\$ seconds with the other approach!

Why is the clever way slower than the naive one?

Solution

First, you should read the Python style guide it has a lot of info on good practices for code readability.

But as for your code, part of the reason the smarter way is slower is that you're creating the whole list every time. It's not stored at all when you create it and this means that instead you're just taking extra time because it has to go through the effort of building the list every time even though you only need the last value. Also, you're not limiting what primes you test for division, so you've lost the root optimisation you used before.

If you cached this list as part of the function then you actually would have time saved on repeated calls, though it would still be slower initially due to building the initial list. There is a way to set up a cache pretty easily, using default parameters. If you're unfamiliar with them, leave a comment and I can explain more about how they're generally used, though I'm using them in a different way than normal here.

If you change def prime(n) to def prime(n, cache=[2]) then you'll have a list called cache created containing 2, the first prime. Each time you call prime it refers back to the same list, so if you append your primes to it, it will be retained and save you all that work.

First of all, this means you might not need to calculate anything when you start. You can just try to return your prime value immediately:

def prime(n, cache=[2]):
    try:
        return cache[n - 1]
    except IndexError:
        pass


To explain, try except is used to attempt a piece of code that may generate an error. So it tries to return the nth prime (use n-1 because lists are indexed from 0) but if it can't find that value in the list it will throw an IndexError. Our except will catch this and just pass, meaning it goes on to run the rest of the function.

The other benefit is you can start your while loop from the last point in that list. So m (which I'd rename as num) can just be the last prime cached + 1 and i can be set as len(cache), since that's the amount of previous primes we're skipping.

i = len(cache)
num = cache[-1] + 1


Now you can begin your while loop, skipping the previously found values but still having them as your divisor check. Note also, you can collapse this with some other syntax and smarter logic. Python has else syntax for a for loop. If you never call break from a for loop then the else will be run, but otherwise it will be skipped. With this in mind you can remove multiple lines where you had to use flag before. Also if you're calling m = m + 2 in both cases, it should just be at the end for both conditions. And lastly, you can shorten var = var + 1 to var += 1 in Python, so I did that too.

while i < n:
    for prime in cache:
        if num % prime == 0:
            break
    else:
        cache.append(num)
        i += 1
    num += 2


Here's how the fully rewritten code would look:

def prime(n, cache=[2]):

    try:
        return cache[n - 1]
    except IndexError:
        pass

    i = len(cache)
    num = cache[-1] + 1

    while i < n:
        for prime in cache:
            if num % prime == 0:
                break
        else:
            cache.append(num)
            i += 1
        num += 2

    return cache[-1]

Code Snippets

def prime(n, cache=[2]):
    try:
        return cache[n - 1]
    except IndexError:
        pass
i = len(cache)
num = cache[-1] + 1
while i < n:
    for prime in cache:
        if num % prime == 0:
            break
    else:
        cache.append(num)
        i += 1
    num += 2
def prime(n, cache=[2]):

    try:
        return cache[n - 1]
    except IndexError:
        pass

    i = len(cache)
    num = cache[-1] + 1

    while i < n:
        for prime in cache:
            if num % prime == 0:
                break
        else:
            cache.append(num)
            i += 1
        num += 2

    return cache[-1]

Context

StackExchange Code Review Q#104511, answer score: 6

Revisions (0)

No revisions yet.