Computation of product of permutations
Problem
Given a graph $G = (V,E)$ with a unique labeling of each vertex, let the transposition $(i,j)$ (where $i,j$ are the labels on adjacent vertices) represent selecting an edge and swapping the labels $i$ and $j$.
I would like to take a list of edges in a graph (a cube in this code), and determine the number of permutations on $V$ that are representable as a product of at most some $k$ sets of disjoint transpositions of the above form. For example, the permutation $(62154703)$ can be expressed as $[(67)(03)][(37)(56)](45)(26)$, where the brackets are only to emphasize the specific disjoint sets.
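For concreteness, here is a minimal sympy sketch (illustrative, not the question's code) of how an edge becomes a transposition and why a set of disjoint edges forms a single well-defined step:

```python
# Minimal sketch (illustrative, not the question's code): an edge (i, j)
# becomes the transposition swapping labels i and j, and two disjoint
# edges commute, so together they form one "step".
from sympy.combinatorics import Permutation as Perm

a = Perm([(4, 7)], size=8)   # the edge {4, 7} of the cube Q3
b = Perm([(0, 1)], size=8)   # the edge {0, 1}, disjoint from {4, 7}

step = a * b
assert step == b * a               # disjoint transpositions commute
assert step.order() == 2           # a product of disjoint transpositions is an involution
assert sorted(step.support()) == [0, 1, 4, 7]
```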
How can I improve my run time? Currently my runtime is about 18 hours. I don't know the runtime of the last step, but currently the first step in CreateAllPermutations takes 2 seconds, and the second step takes 1765 seconds.
Additional question: is the Permutation class from sympy.combinatorics slowing me down?
Here is the input file, and some truncated expected sample output (this output will not be printed, but it is what is being generated) from `GetSteps()`, which is a smaller case of `CreateAllPermutations()`, along with what I expect to see from `print(len(allPerms))`.

Input file:

```
01
03
04
12
15
23
26
37
45
47
56
67
```
Sample output from `GetSteps()` and `allPerms`:

```
Completed all permutations of 1 step! There are 107 such permutations.
(4 7)
(0 1)(2 6)(3 7)
(0 3)(4 7)
(7)(0 4)(1 5)
(7)(0 3)(1 5)(2 6)
(7)(2 3)(5 6)
(7)(1 5)(2 3)
(0 4)(1 5)(2 3)(6 7)
(7)(2 3)(4 5)
(0 1)(3 7)(5 6)
(7)(0 3)
(1 5)(4 7)
(6 7)
(0 4)(1 2)(3 7)
(0 3)(1 5)(4 7)
40320  # 8!, as I am expecting to generate every permutation on 8 elements
```
Main code:
```
import time, itertools
from sympy.combinatorics import Permutation as Perm

def GetEdges():
    edgeFile = open("EdgesQ3.txt", "r")
    edges = []
    for line in edgeFile:
        line = line.strip()
        line = list(line)
        for vertex in line:
            line[line.index(vertex)] = int(vertex)
        edges.append(line)
    return edges
```
Solution
Fixing the bottleneck
Profiling with `python -m cProfile perm.py` and an early abort showed that `permutation.py` from sympy was accounting for less than 3% of the runtime. The big bottleneck was in `CreateAllPermutations` itself.

Armed with that and a hunch I made the simple change of
```
routing.add(p)
```

in place of

```
routing = routing | {p}
```

and (once I'd modified the debug output to be less frequent) the entire program ran to execution in 75 seconds. It's still painfully slow compared to my C# implementation using a handrolled permutation class (1.5 seconds), but it's a massive improvement.
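The reason this one-line change matters: `routing | {p}` allocates and copies a whole new set on every insertion, so a loop of n insertions does O(n²) work, while `set.add` mutates in place in amortised O(1). A small self-contained timing sketch (illustrative numbers only):

```python
# `s = s | {x}` builds a brand-new set on every insertion (O(len(s)) per
# step, O(n^2) overall); `s.add(x)` mutates in place (amortised O(1)).
import timeit

def build_with_union(n):
    s = set()
    for x in range(n):
        s = s | {x}      # copies the whole set each iteration
    return s

def build_with_add(n):
    s = set()
    for x in range(n):
        s.add(x)         # in-place insertion
    return s

slow = timeit.timeit(lambda: build_with_union(2000), number=3)
fast = timeit.timeit(lambda: build_with_add(2000), number=3)
# fast is expected to be a small fraction of slow
```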
Run again under the profiler, it takes 105.676 seconds of which the bulk of the time goes to `{method 'add' of 'set' objects}` (57.537s), `basic.py(__eq__)` (37.974s), and `permutations.py(__mul__)` (36.705s). So further speedups might be possible by hand-rolling a permutation class, but the real possibilities lie in changing the set representation. Since you're working over a very finite universe, a bit set might be a possibility.

Of course, the disadvantage would be a loss of readability. On that subject, I have some suggestions.
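As a sketch of what a hand-rolled representation could look like (illustrative, not the original code): a permutation on 8 points can be stored as a plain tuple of images, which hashes and compares far more cheaply than a `Perm` object, exactly where the profile said the time was going:

```python
# Sketch of a hand-rolled representation (illustrative, not the original
# code): a permutation on 8 points stored as a plain tuple of images.
def compose(p, q):
    """Return the permutation 'apply p, then q'."""
    return tuple(q[p[i]] for i in range(len(p)))

identity = tuple(range(8))
swap47 = (0, 1, 2, 3, 7, 5, 6, 4)            # the transposition (4 7)

assert compose(swap47, swap47) == identity   # transpositions are self-inverse
seen = {identity, swap47}                    # tuples drop straight into sets
```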
Readability
```
def AreDisjoint(s1,s2):
    v1 = {point for edge in s1 for point in edge}
    v2 = {point for edge in s2 for point in edge}
    shared = v1 & v2
    return shared == set()
```

I had to think quite hard to work out what this is doing. I'm still not entirely sure why it's working with arrays rather than `Perm` objects, given that you're using those elsewhere. How about something along these lines?

```
def AreDisjoint(p1, p2):
    """Tests whether two permutations avoid both moving the same element"""
    v1 = set(p1.support())
    v2 = set(p2.support())
    return len(v1 & v2) == 0
```

This would mean that
`GetSteps` would need to be reworked to use `Perm` objects, so it might as well be refactored into `CreateOneStepPermutations`. There's an additional opportunity there to use the same approach as in `CreateAllPermutations`: instead of sticking everything in one array and having to do tests like

```
if len(s1) == i:
```

you can build up the sets of different sizes in different layers. I think that it would be almost identical to `CreateAllPermutations` except for the addition of the `AreDisjoint` test. It might even be worthwhile refactoring them into a single method with an argument which is a "can these two permutations be combined?" predicate (passing `AreDisjoint` while building `oneStep`, and a method which always returns `True` while building `S_8`).

I don't see the point of
`CreateStepDictionary`. In my opinion it would be clearer to make `CreateAllPermutations` take `oneStep` and immediately construct `stepDict = {1: oneStep}`.

```
for j in range(1,i+1):
    routing = routing - set(stepDict[j])
```

is presumably intended to be a speed optimisation, but I don't think it makes much difference. At each layer you will recover the previous layer anyway because the
`oneStep` permutations have order 2 by construction, and so are self-inverse. (On timing it to check: removing that step takes us from 75s to 85s, but allows adding in an early return when `len(routing) == 40320`, which reduces the runtime to 14s.)

```
def CreateAllPermutations(oneStep):
    stepDict = { 1 : oneStep }
    for i in range(1,4):
        routing = set()
        for p1 in stepDict[1]:
            for p2 in stepDict[i]:
                routing.add(p1 * p2)
        if len(routing) == 40320: return routing
        stepDict[i+1] = routing
    return routing
```

Obviously that's a bit hacky, and the magic number should at the very least be an argument.
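The single-predicate refactor suggested above might look roughly like this (the names and the tuple encoding are illustrative, not from the original code):

```python
# Illustrative sketch of the "one routine, one predicate" refactor:
# the same layer-builder serves both CreateOneStepPermutations (pass a
# disjointness test) and CreateAllPermutations (pass `always`).
def compose(p, q):
    """Apply p, then q, on tuple-encoded permutations."""
    return tuple(q[p[i]] for i in range(len(p)))

def grow_layers(one_step, can_combine, max_layer=3):
    """Build products of generators, layer by layer."""
    step_dict = {1: set(one_step)}
    for i in range(1, max_layer + 1):
        layer = set()
        for p1 in step_dict[1]:
            for p2 in step_dict[i]:
                if can_combine(p1, p2):
                    layer.add(compose(p1, p2))
        step_dict[i + 1] = layer
    return step_dict

always = lambda p1, p2: True

# Tiny demo on S_3 generated by {identity, (0 1), (1 2)}:
gens = {(0, 1, 2), (1, 0, 2), (0, 2, 1)}
layers = grow_layers(gens, always, max_layer=2)
assert len(layers[3]) == 6    # all of S_3 reached within 3 generators
```

Because the identity is included among the generators, each layer contains the previous one, which is what makes the early-return size check safe.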
One possible further optimisation would be a meet-in-the-middle approach. You can find more detailed descriptions of the principle in the context of e.g. Rubik's cube solving, but here's the basic idea. You want to find the least `n` such that starting with the generators in `oneStep` you generate the (finite) group `G` (here `S_8`). For the sake of simplicity, let's assume that the identity element `e` is in `oneStep`. Then we're looking for `n` such that for each element `u` in `G` there is a sequence `g_1 g_2 ... g_n = u`.

The approach you're currently using builds up these sequences from one end. But you could equally well build them up from the other end. `g_1 g_2 ... g_n = u` can equally well be written `g_1 ... g_i = u g_n^{-1} ... g_{i+1}^{-1}` (and in this case, since each generator is self-inverse, `g_1 ... g_i = u g_n ... g_{i+1}`).

That leads to the more complicated (and not tested - beware!) but perhaps more efficient
```
def CreateAllPermutations(oneStep, fullGroup):
    targetSize = len(fullGroup)
    stepDict = { 1 : oneStep }
    for i in range(1,4):
        routing = set(stepDict[i])
        if len(stepDict[i]) * 2 < targetSize:
            for p1 in stepDict[1]:
                for p2 in stepDict[i]:
                    routing.add(p1 * p2)
            if
```
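The function above is truncated, so here is a separate, deliberately tiny illustration of the meet-in-the-middle membership test (my own sketch on S_3 with tuple-encoded permutations, not the answer's code): `u` is a product of at most `i + j` generators exactly when `inverse(a) * u` lands in layer `j` for some `a` in layer `i`, so no layer beyond `max(i, j)` ever needs to be built.

```python
# Meet-in-the-middle sketch (illustrative, not the answer's truncated
# code), on S_3 generated by {identity, (0 1), (1 2)}.
def compose(p, q):
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

identity = (0, 1, 2)
gens = {identity, (1, 0, 2), (0, 2, 1)}   # include e so layers nest

layer = {1: gens}
layer[2] = {compose(a, b) for a in gens for b in gens}

def within_four(u):
    # Search layer 2 from both ends; layers 3 and 4 are never built.
    return any(compose(inverse(a), u) in layer[2] for a in layer[2])

assert within_four((2, 1, 0))   # the longest element of S_3 (word length 3)
```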
Code Snippets
routing.add(p)routing = routing | {p}def AreDisjoint(s1,s2):
v1 = {point for edge in s1 for point in edge}
v2 = {point for edge in s2 for point in edge}
shared = v1 & v2
return shared == set()def AreDisjoint(p1,p2):
"""Tests whether two permutations avoid both moving the same element"""
v1 = set(p1.support())
v2 = set(p2.support())
return len(v1 & v2) == 0if len(s1) == i:Context
StackExchange Code Review Q#131282, answer score: 6