HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonModerate

Breadth First Search to find connected primes

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
searchprimesbreadthfirstconnectedfind

Problem

I've been working through the British Informatic Olympiad's 2016 Paper, and I've been struggling with some of the larger cases for question 3, specifically the larger cases on the mark scheme (the 3 before last specifically).



  • Two prime numbers are connected if the difference between them is 2n for some whole number n ≥ 0.



  • A path is a sequence of (at least two) prime numbers, without repetition, where adjacent numbers in the sequence are connected. If


the first number in the sequence is p and the last number is q
then we say the path is between p and q.

  • The length of a path is the total number of prime numbers used. There may be multiple paths between two prime numbers; the lengths of


these paths may be different.



Write a program to determine the length of the shortest path between
two primes. Your program should accept three integers in order: l (4
≤ l ≤ 224) indicating the highest value you are allowed to use,
followed by the primes p then q (2 ≤ p < q < l). You will only
be given input where there is a path between p and q using values
below l. You should output the length of the shortest path.

Inputs I was struggling with:

614700 3643 90149 | outputs 18
987654 3643 90149 | outputs 16
1000000 2 968137  | outputs 18


How can I improve its performance?

```
import math

primes = None
upper_limit = 0
powers_of_two = []

class Node:
def __init__(self, value, parent):
self.value = value
self.parent = parent
def get_adjacent(self):
for i in powers_of_two:
if (self.value + i = 2
and self.parent != self.value - i
and self.value - i in primes):
yield Node(self.value - i, self)

def sieve(n):
numbers = list(range(0, n))
for prime in numbers:
if prime n ** 0.5:
break
for i in range(prime ** 2, n, prime):
numbers[i] = 0
return [x for x in numbers if x > 1]

def bfs(f, to):
f = Node(f

Solution

There are 3 major performance problems you currently have:

  • the primes is currently a list and every time you do something in primes - this is O(n) average-time complexity (where n is the length of primes) - define primes as a set instead



  • the visited is a list, same O(n) lookups - define it as a set instead



  • the del q[0] is also a O(n) operation - removing an element from the beginning of a list requires rearranging all following elements - use a collections.deque in place of a list for O(1) deletes from both ends (.popleft() for the left-end deletion)



Revised version (works for me in an instant for all the test inputs):

import math
from collections import deque

class Node:
    __slots__ = ['value', 'parent']

    def __init__(self, value, parent):
        self.value = value
        self.parent = parent

    def get_adjacent(self, primes):
        for i in powers_of_two:
            if self.value + i = 2 and self.parent != self.value - i and self.value - i in primes:
                yield Node(self.value - i, self)

def sieve(n):
    numbers = list(range(0, n))
    for prime in numbers:
        if prime  n ** 0.5:
            break

        for i in range(prime ** 2, n, prime):
            numbers[i] = 0

    return {x for x in numbers if x > 1}

def bfs(f, to, primes):
    f = Node(f, None)

    q = deque([f])
    visited = set([f.value])
    while q:
        if q[0].value == to:
            return q[0]
        else:
            for x in q[0].get_adjacent(primes):
                # print(q[0].value, x.value)
                if x.value not in visited:
                    # print("added", x.value)
                    visited.add(x.value)
                    q.append(x)
        q.popleft()

if __name__ == '__main__':
    upper_limit, p, q = [int(x) for x in input().split(' ')]

    m = int(math.ceil(math.log2(upper_limit))) if math.log2(upper_limit) % 1 != 0 else int(math.log2(upper_limit))
    powers_of_two = [2 ** i for i in range(m)]

    primes = sieve(upper_limit)

    x = bfs(p, q, primes)

    counter = 0
    while x is not None:
        print(x.value)
        x = x.parent
        counter += 1

    print("--------------------")
    print(counter)


You can also see some other changes:

  • removed the while True loop (from what I understand you just need a single run)



  • using if __name__ == '__main__':



  • avoiding using globals and passing the primes set to the bfs() and get_adjacent() as an argument



  • using __slots__ for the Node class



  • used list unpacking for reading the input numbers



  • used upper_limit, p and q variables to increase readability (p and q variable names are used in the problem description)

Code Snippets

import math
from collections import deque


class Node:
    __slots__ = ['value', 'parent']

    def __init__(self, value, parent):
        self.value = value
        self.parent = parent

    def get_adjacent(self, primes):
        for i in powers_of_two:
            if self.value + i < upper_limit and self.parent != self.value + i and self.value + i in primes:
                yield Node(i + self.value, self)
            if self.value - i >= 2 and self.parent != self.value - i and self.value - i in primes:
                yield Node(self.value - i, self)


def sieve(n):
    numbers = list(range(0, n))
    for prime in numbers:
        if prime < 2:
            continue
        elif prime > n ** 0.5:
            break

        for i in range(prime ** 2, n, prime):
            numbers[i] = 0

    return {x for x in numbers if x > 1}


def bfs(f, to, primes):
    f = Node(f, None)

    q = deque([f])
    visited = set([f.value])
    while q:
        if q[0].value == to:
            return q[0]
        else:
            for x in q[0].get_adjacent(primes):
                # print(q[0].value, x.value)
                if x.value not in visited:
                    # print("added", x.value)
                    visited.add(x.value)
                    q.append(x)
        q.popleft()


if __name__ == '__main__':
    upper_limit, p, q = [int(x) for x in input().split(' ')]

    m = int(math.ceil(math.log2(upper_limit))) if math.log2(upper_limit) % 1 != 0 else int(math.log2(upper_limit))
    powers_of_two = [2 ** i for i in range(m)]

    primes = sieve(upper_limit)

    x = bfs(p, q, primes)

    counter = 0
    while x is not None:
        print(x.value)
        x = x.parent
        counter += 1

    print("--------------------")
    print(counter)

Context

StackExchange Code Review Q#157626, answer score: 11

Revisions (0)

No revisions yet.