patternpythonMinor
TSP Brute Force Optimization in Python
Viewed 0 times
tspforceoptimizationpythonbrute
Problem
I am currently working on a Python 3.4.3 project which includes solving the Traveling Salesman Problem using different algorithms. I have just written a brute force algorithm and I would love some feedback. Now I understand that the number of operations increases by the factorial of the route length so brute forcing is not considered feasible, but for my project I need a standard brute force.
My overarching question is: How can I make this as fast and efficient as possible?
```
import itertools
def distance(p1, p2):
#calculates distance from two points
d = (((p2[0] - p1[0])2) + ((p2[1] - p1[1])2))**.5
return int(d)
def calCosts(routes, nodes):
travelCosts = []
for route in routes:
travelCost = 0
#Sums up the travel cost
for i in range(1,len(route)):
#takes an element of route, uses it to find the corresponding coords and calculates the distance
travelCost += distance(nodes[str(route[i-1])], nodes[str(route[i])])
travelCosts.append(travelCost)
#pulls out the smallest travel cost
smallestCost = min(travelCosts)
shortest = (routes[travelCosts.index(smallestCost)], smallestCost)
#returns tuple of the route and its cost
return shortest
def genRoutes(routeLength):
#lang hold all the 'alphabet' of nodes
lang = [ x for x in range(2,routeLength+1) ]
#uses built-in itertools to generate permutations
routes = list(map(list, itertools.permutations(lang)))
#inserts the home city, must be the first city in every route
for x in routes:
x.insert(0,1)
return routes
def main(nodes=None, instanceSize=5):
#nodes and instanceSize are passed into main() using another program
#I just gave them default values for this example
#The Node lookup table.
Nodes = {
'1': (565.0, 575.0),
'2': (25.0, 185.0),
'3': (345.0, 750.0),
'4': (945.0, 685.0),
'5': (845.0, 655.0),
'6': (8
My overarching question is: How can I make this as fast and efficient as possible?
```
import itertools
def distance(p1, p2):
#calculates distance from two points
d = (((p2[0] - p1[0])2) + ((p2[1] - p1[1])2))**.5
return int(d)
def calCosts(routes, nodes):
travelCosts = []
for route in routes:
travelCost = 0
#Sums up the travel cost
for i in range(1,len(route)):
#takes an element of route, uses it to find the corresponding coords and calculates the distance
travelCost += distance(nodes[str(route[i-1])], nodes[str(route[i])])
travelCosts.append(travelCost)
#pulls out the smallest travel cost
smallestCost = min(travelCosts)
shortest = (routes[travelCosts.index(smallestCost)], smallestCost)
#returns tuple of the route and its cost
return shortest
def genRoutes(routeLength):
#lang hold all the 'alphabet' of nodes
lang = [ x for x in range(2,routeLength+1) ]
#uses built-in itertools to generate permutations
routes = list(map(list, itertools.permutations(lang)))
#inserts the home city, must be the first city in every route
for x in routes:
x.insert(0,1)
return routes
def main(nodes=None, instanceSize=5):
#nodes and instanceSize are passed into main() using another program
#I just gave them default values for this example
#The Node lookup table.
Nodes = {
'1': (565.0, 575.0),
'2': (25.0, 185.0),
'3': (345.0, 750.0),
'4': (945.0, 685.0),
'5': (845.0, 655.0),
'6': (8
Solution
Lets have a little style review, before some code refactoring, and finish off with some performance comparison.
Style and code review
I suggest you read up on PEP8, which is the official style guide for Python.
Performance comparison
I made some variants of the code, both of your original, the one by SuperBiasedMan (with some correction as his code currently has a bug), some versions using only generators, itertools,
I used the following version of memoize, which most likely can be further optimised:
One version taking it to the extreme regarding list comprehension and generators are this one:
I made the
Some comments on this code:
Style and code review
I suggest you read up on PEP8, which is the official style guide for Python.
- Not using
snake_casefor variable and function names - You are using mostlycamelCase, and in some cases one letter variables
- Good use of singular vs plural – I like combinations like
for route in routes. They make sense
- Good use of
if __name__ ...- This is good construct
- Many temporary variables – You have a little too many temporary variables, which are only used once. These can often be skipped
- Avoid using indexes in loops – In general, it is better to avoid using indexes in loops, rather than looping on the elements themself. Especially bad is it when you use
str(i)to get the index, when theNodeslist should have used pure integers directly as keys.
- Avoid building large lists in memory – Your code builds the
routeslist in memory. When calling this for a route length of 9, your original code uses approx 6.7MB to store the routes. My optimised routine below, uses 0.02MB because it uses generators instead...
- Avoid recalculating distances – Python can memoize the output from a function for each given input by using decorators. This means that you can calculate the distance between two points once, and the next time you need it the memoize function picks your previous calculation of the result.
- Use docstrings instead of ordinary comment to document functions – If you use docstrings, some of the editors will provide interactive help when you use your functions, and it is according the style guide.
Performance comparison
I made some variants of the code, both of your original, the one by SuperBiasedMan (with some correction as his code currently has a bug), some versions using only generators, itertools,
sum and min, and finally a version using for loops instead of list comprehension and a slightly alternate optimisation to find the minimum.I used the following version of memoize, which most likely can be further optimised:
def memoize(f):
""" Memoization decorator for functions taking one or more arguments. """
class memodict(dict):
def __init__(self, f):
self.f = f
def __call__(self, *args):
return self[args]
def __missing__(self, key):
ret = self[key] = self.f(*key)
return ret
return memodict(f)
@memoize
def distance(p1, p2):
"""Calculates distance between two points, memoizes result"""
d = (((p2[0] - p1[0])**2) + ((p2[1] - p1[1])**2)) **.5
return int(d)One version taking it to the extreme regarding list comprehension and generators are this one:
def route_generator(route_length):
"""Generate all possible routes of length route_length, starting at 1."""
for route in permutations(xrange(2, route_length+1)):
yield (1, ) + route
def main_holroy_v2(instance_size=INSTANCE_SIZE):
print min((sum(distance(NODES[start], NODES[stop])
for start, stop in pairwise(route)), route)
for route in route_generator(instance_size))I made the
Nodes into a global dict with integers as key, named NODES. After some testing I found that the extreme variant was slower than expected, so I fumbled around and found this to be the fastest in my environment:def find_shortest_route3(nodes, route_length):
"""Find shortest route of length route_length from nodes."""
minimum_distance = None
for route in permutations(xrange(2, route_length+1)):
current_distance = 0
prev = nodes[1]
for next in route:
current_distance += distance(prev, nodes[next])
prev = nodes[next]
if minimum_distance and current_distance > minimum_distance:
break
else:
if not minimum_distance or current_distance < minimum_distance:
minimum_distance = current_distance
minimum_route = route
return minimum_distance, minimum_route
def main_minimum3(instance_size=INSTANCE_SIZE):
distance.clear()
cost, route = find_shortest_route3(NODES, instance_size)
print('Shortest route: {}'.format((1, ) + route))
print('Travel cost : {}'.format(cost))Some comments on this code:
- It turns out that list comprehensions are slightly slower than plain for loops in this case
- Instead of completing the sum for a route, I break out of the inner loop if the sum has crossed an earlier minimum. The current route is then not interesting any more
- The actual route returned does not include the starting point. This can easily be added, as done in the
printstatement. This eliminates the need for either adding in the starting point to all routes, or to remove all routes not starting at the correct point. In short, it is more efficient
- If the inner
forloop completes ordina
Code Snippets
def memoize(f):
""" Memoization decorator for functions taking one or more arguments. """
class memodict(dict):
def __init__(self, f):
self.f = f
def __call__(self, *args):
return self[args]
def __missing__(self, key):
ret = self[key] = self.f(*key)
return ret
return memodict(f)
@memoize
def distance(p1, p2):
"""Calculates distance between two points, memoizes result"""
d = (((p2[0] - p1[0])**2) + ((p2[1] - p1[1])**2)) **.5
return int(d)def route_generator(route_length):
"""Generate all possible routes of length route_length, starting at 1."""
for route in permutations(xrange(2, route_length+1)):
yield (1, ) + route
def main_holroy_v2(instance_size=INSTANCE_SIZE):
print min((sum(distance(NODES[start], NODES[stop])
for start, stop in pairwise(route)), route)
for route in route_generator(instance_size))def find_shortest_route3(nodes, route_length):
"""Find shortest route of length route_length from nodes."""
minimum_distance = None
for route in permutations(xrange(2, route_length+1)):
current_distance = 0
prev = nodes[1]
for next in route:
current_distance += distance(prev, nodes[next])
prev = nodes[next]
if minimum_distance and current_distance > minimum_distance:
break
else:
if not minimum_distance or current_distance < minimum_distance:
minimum_distance = current_distance
minimum_route = route
return minimum_distance, minimum_route
def main_minimum3(instance_size=INSTANCE_SIZE):
distance.clear()
cost, route = find_shortest_route3(NODES, instance_size)
print('Shortest route: {}'.format((1, ) + route))
print('Travel cost : {}'.format(cost))Context
StackExchange Code Review Q#110221, answer score: 4
Revisions (0)
No revisions yet.