HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

TSP Brute Force Optimization in Python

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
tspforceoptimizationpythonbrute

Problem

I am currently working on a Python 3.4.3 project which includes solving the Traveling Salesman Problem using different algorithms. I have just written a brute force algorithm and I would love some feedback. Now I understand that the number of operations increases by the factorial of the route length so brute forcing is not considered feasible, but for my project I need a standard brute force.

My overarching question is: How can I make this as fast and efficient as possible?

```
import itertools

def distance(p1, p2):
#calculates distance from two points
d = (((p2[0] - p1[0])2) + ((p2[1] - p1[1])2))**.5
return int(d)

def calCosts(routes, nodes):
travelCosts = []

for route in routes:
travelCost = 0

#Sums up the travel cost
for i in range(1,len(route)):
#takes an element of route, uses it to find the corresponding coords and calculates the distance
travelCost += distance(nodes[str(route[i-1])], nodes[str(route[i])])

travelCosts.append(travelCost)

#pulls out the smallest travel cost
smallestCost = min(travelCosts)
shortest = (routes[travelCosts.index(smallestCost)], smallestCost)

#returns tuple of the route and its cost
return shortest

def genRoutes(routeLength):
#lang hold all the 'alphabet' of nodes
lang = [ x for x in range(2,routeLength+1) ]

#uses built-in itertools to generate permutations
routes = list(map(list, itertools.permutations(lang)))
#inserts the home city, must be the first city in every route
for x in routes:
x.insert(0,1)
return routes

def main(nodes=None, instanceSize=5):
#nodes and instanceSize are passed into main() using another program
#I just gave them default values for this example

#The Node lookup table.
Nodes = {
'1': (565.0, 575.0),
'2': (25.0, 185.0),
'3': (345.0, 750.0),
'4': (945.0, 685.0),
'5': (845.0, 655.0),
'6': (8

Solution

Lets have a little style review, before some code refactoring, and finish off with some performance comparison.

Style and code review

I suggest you read up on PEP8, which is the official style guide for Python.

  • Not using snake_case for variable and function names - You are using mostly camelCase, and in some cases one letter variables



  • Good use of singular vs plural – I like combinations like for route in routes. They make sense



  • Good use of if __name__ ... - This is good construct



  • Many temporary variables – You have a little too many temporary variables, which are only used once. These can often be skipped



  • Avoid using indexes in loops – In general, it is better to avoid using indexes in loops, rather than looping on the elements themself. Especially bad is it when you use str(i) to get the index, when the Nodes list should have used pure integers directly as keys.



  • Avoid building large lists in memory – Your code builds the routes list in memory. When calling this for a route length of 9, your original code uses approx 6.7MB to store the routes. My optimised routine below, uses 0.02MB because it uses generators instead...



  • Avoid recalculating distances – Python can memoize the output from a function for each given input by using decorators. This means that you can calculate the distance between two points once, and the next time you need it the memoize function picks your previous calculation of the result.



  • Use docstrings instead of ordinary comment to document functions – If you use docstrings, some of the editors will provide interactive help when you use your functions, and it is according the style guide.



Performance comparison

I made some variants of the code, both of your original, the one by SuperBiasedMan (with some correction as his code currently has a bug), some versions using only generators, itertools, sum and min, and finally a version using for loops instead of list comprehension and a slightly alternate optimisation to find the minimum.

I used the following version of memoize, which most likely can be further optimised:

def memoize(f):
    """ Memoization decorator for functions taking one or more arguments. """
    class memodict(dict):
        def __init__(self, f):
            self.f = f
        def __call__(self, *args):
            return self[args]
        def __missing__(self, key):
            ret = self[key] = self.f(*key)
            return ret
    return memodict(f)

@memoize
def distance(p1, p2):
    """Calculates distance between two points, memoizes result"""
    d = (((p2[0] - p1[0])**2) + ((p2[1] - p1[1])**2)) **.5
    return int(d)


One version taking it to the extreme regarding list comprehension and generators are this one:

def route_generator(route_length):
    """Generate all possible routes of length route_length, starting at 1."""
    for route in permutations(xrange(2, route_length+1)):
        yield (1, ) + route

def main_holroy_v2(instance_size=INSTANCE_SIZE):
    print min((sum(distance(NODES[start], NODES[stop])
                   for start, stop in pairwise(route)), route)
                       for route in route_generator(instance_size))


I made the Nodes into a global dict with integers as key, named NODES. After some testing I found that the extreme variant was slower than expected, so I fumbled around and found this to be the fastest in my environment:

def find_shortest_route3(nodes, route_length):
    """Find shortest route of length route_length from nodes."""   

    minimum_distance = None
    for route in permutations(xrange(2, route_length+1)):
        current_distance = 0

        prev = nodes[1]
        for next in route:
            current_distance += distance(prev, nodes[next])
            prev = nodes[next]
            if minimum_distance and current_distance > minimum_distance:
                break
        else:
            if not minimum_distance or current_distance < minimum_distance:
                minimum_distance = current_distance
                minimum_route = route

    return minimum_distance, minimum_route

def main_minimum3(instance_size=INSTANCE_SIZE):
    distance.clear()

    cost, route = find_shortest_route3(NODES, instance_size)
    print('Shortest route: {}'.format((1, ) + route))
    print('Travel cost   : {}'.format(cost))


Some comments on this code:

  • It turns out that list comprehensions are slightly slower than plain for loops in this case



  • Instead of completing the sum for a route, I break out of the inner loop if the sum has crossed an earlier minimum. The current route is then not interesting any more



  • The actual route returned does not include the starting point. This can easily be added, as done in the print statement. This eliminates the need for either adding in the starting point to all routes, or to remove all routes not starting at the correct point. In short, it is more efficient



  • If the inner for loop completes ordina

Code Snippets

def memoize(f):
    """ Memoization decorator for functions taking one or more arguments. """
    class memodict(dict):
        def __init__(self, f):
            self.f = f
        def __call__(self, *args):
            return self[args]
        def __missing__(self, key):
            ret = self[key] = self.f(*key)
            return ret
    return memodict(f)

@memoize
def distance(p1, p2):
    """Calculates distance between two points, memoizes result"""
    d = (((p2[0] - p1[0])**2) + ((p2[1] - p1[1])**2)) **.5
    return int(d)
def route_generator(route_length):
    """Generate all possible routes of length route_length, starting at 1."""
    for route in permutations(xrange(2, route_length+1)):
        yield (1, ) + route

def main_holroy_v2(instance_size=INSTANCE_SIZE):
    print min((sum(distance(NODES[start], NODES[stop])
                   for start, stop in pairwise(route)), route)
                       for route in route_generator(instance_size))
def find_shortest_route3(nodes, route_length):
    """Find shortest route of length route_length from nodes."""   

    minimum_distance = None
    for route in permutations(xrange(2, route_length+1)):
        current_distance = 0

        prev = nodes[1]
        for next in route:
            current_distance += distance(prev, nodes[next])
            prev = nodes[next]
            if minimum_distance and current_distance > minimum_distance:
                break
        else:
            if not minimum_distance or current_distance < minimum_distance:
                minimum_distance = current_distance
                minimum_route = route

    return minimum_distance, minimum_route

def main_minimum3(instance_size=INSTANCE_SIZE):
    distance.clear()

    cost, route = find_shortest_route3(NODES, instance_size)
    print('Shortest route: {}'.format((1, ) + route))
    print('Travel cost   : {}'.format(cost))

Context

StackExchange Code Review Q#110221, answer score: 4

Revisions (0)

No revisions yet.