HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Project Euler #35 - Circular primes solution is slow

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
primesslowprojectcirculareulersolution

Problem

This is my code for project euler #35 (in Python):

The number, 197, is called a circular prime because all rotations of the digits: 197, 971, and 719, are themselves prime.

There are thirteen such primes below 100: 2, 3, 5, 7, 11, 13, 17, 31, 37, 71, 73, 79, and 97.

How many circular primes are there below one million?

The program takes about 2 minutes to run, and I don't really know how to improve the efficiency of it. I'm using a sieve to generate the primes, and I'm only checking rotations until the rotation isn't prime.

def primes_sieve(limit):
    limitn = limit+1
    not_prime = [False] * limitn
    primes = []

    for i in range(2, limitn):
        if not_prime[i]:
            continue
        for f in range(i*2, limitn, i):
            not_prime[f] = True

        primes.append(i)

    return primes

primes = set(primes_sieve(1000000));
exclude = ['2','4','5','6','8','0']
def circularPrime(n):
    ss = str(n);
    for i in range(n):
        if ss[-1] in exclude:
            return 0;
        if int(ss) not in primes:
            return 0;
        ss = ss[-1] + ss[:-1];
    return 1;

gg = 0;
for num in primes:
    gg += circularPrime(num);
print gg;


Any help is greatly appreciated.

Solution

BUG

Your program gives the wrong answer, yielding 53 instead of 55. The reason for this is you over-eagerly exclude rotations in circularPrime if anything ends in an even digit or 5. But 2 and 5 are prime, yet end in a digit that you exclude!

It turns out that the exclusion check (in addition to giving you the wrong answer) doesn't actually help you anyway. Timing your function run ten times:

with check:    6.094s
w/o check:     6.075s


So drop it, and you get the correct answer, and it runs faster (or the same, within noise).

Semicolons

Semicolons are a C/C++/Java/etc. thing. While they're technically correct Python syntax, you don't need them, and they look weird, so you can simply drop them.

Sieve

Typically, we'd start our sieve with something like is_prime = [True] * limitn. It's a little weird to see it backwards, but I guess that's ok.

More efficient sieving

Let's take a look at the sieve part of your code, swapping bools like I suggested:

for i in range(2, limitn):
    if not is_prime[i]:
        continue
    for f in range(i*2, limitn, i):
        is_prime[f] = False

    primes.append(i)


First of all, having the continue makes it harder to read - better to keep the flow mono-directional:

for i in range(2, limitn):
    if is_prime[i]:
        primes.append(i)
        for f in range(i*2, limitn, i):
            is_prime[f] = False


Now, there's two issues here. In Python 2.7, range() gives you a list. The full list. That's hugely wasteful since you never want the full list, only the next element one at a time. For that, there's xrange(). Just swapping out those two functions:

range()    6.075s
xrange()   4.913s


Now, let's take a look at your inner loop:

for f in xrange(i*2, limitn, i):
    is_prime[f] = False


f is a bad variable name for this, prefer something like j, but that's not important. Consider the bounds. At this point, we know that \$i\$ is prime. We also know that this is the first time we've gotten to it, so every multiple of \$i\$ is still marked as being potentially prime, except for those divisible by some other prime \$p < i\$. So \$2i\$ we already know isn't prime, because it's divisible by \$2\$. \$3i\$ likewise. And so on. The first composite number that we have to cross off will be the first one not divisible by any such prime \$p\$... which is \$i^2\$.

for j in xrange(i*i, limitn, i):
    is_prime[j] = False


That improves us to:

OP, w/ bugfix         6.075s
range() -> xrange()   4.913s
inner loop from i^2   4.343s


Bring back the bug

I love deliberately provocative headings. Anyway, you had the right idea of checking for the even digits, but we can be more aggressive with it. Before we check any of the rotations, let's check for the presence of any of those digits first:

def circularPrime(n):
    ss = str(n)
    if any(c in ss for c in '024568'):
        return 0

    # now check everything else


This way, we can avoid a whole bunch of work for all of our 6-digit primes that start with an even number. However, from the very first thing I wrote, we know that this will give us the wrong answer. However, we know how to fix it. We know we are erroneously excluding 2 and 5, so let's just add them back:

gg = 2 # because 2,5
for num in primes:
    gg += circularPrime(num)
return gg


That drops us down to:

OP, w/ bugfix         6.075s
range() -> xrange()   4.913s
inner loop from i^2   4.343s
excluding again       3.721s


I had originally said your check didn't give you any performance benefit, but this one does - since building up all the string rotations and doing all the int casting is expensive and in this way we're able to avoid all of that up front. Since the number of circular primes is sparse (only 55 out of ~75k primes!), anything we can do to aggressively weed our set down is good.

Code Snippets

with check:    6.094s
w/o check:     6.075s
for i in range(2, limitn):
    if not is_prime[i]:
        continue
    for f in range(i*2, limitn, i):
        is_prime[f] = False

    primes.append(i)
for i in range(2, limitn):
    if is_prime[i]:
        primes.append(i)
        for f in range(i*2, limitn, i):
            is_prime[f] = False
range()    6.075s
xrange()   4.913s
for f in xrange(i*2, limitn, i):
    is_prime[f] = False

Context

StackExchange Code Review Q#110910, answer score: 7

Revisions (0)

No revisions yet.