snippetpythonMinor
Topological Sort and Graphing in Python 3
Viewed 0 times
graphingtopologicalpythonandsort
Problem
I wrote a program for DFS traversal, topological sort, find all vertices, and has_cycle methods.
Can you please suggest more elegant and eloquent ways for this program?
(Perhaps better ways to represent a graph with vertices and edges?)
NOTE: dfs and has_cycle implementation is inspired by the book "Algorithm Design Manual".
Topological Sort
Find all vertices
```
def find_all(source, edges):
reached = set()
cross = lambda e
Can you please suggest more elegant and eloquent ways for this program?
(Perhaps better ways to represent a graph with vertices and edges?)
NOTE: dfs and has_cycle implementation is inspired by the book "Algorithm Design Manual".
from itertools import dropwhile
class Graph(object):
@staticmethod
def dfs(v, edges, may_enter=None, cross=None, leave=None):
if may_enter is None:
may_enter = lambda v, entered=set(): v not in entered and not entered.add(v)
if may_enter(v):
for e in (edges[v] or []):
cross is not None and cross(e, v)
Graph.dfs(e.y, edges, may_enter, cross, leave)
leave is not None and leave(v)
class Edge(object): # weight is None in unweighted graph
def __init__(self, y=None, weight=None):
self.y, self.weight = y, weight
def __repr__(self):
attrs = (self.weight, self.y)
return "Edge({0})".format(', '.join(reversed( \
[repr(e) for e in dropwhile(lambda e: e is None, attrs)])))Topological Sort
def topological_sort(edges):
sort, entered = [], set()
may_enter = lambda v: v not in entered and not entered.add(v)
leave = lambda v: sort.append(v)
for v in range(len(edges)):
if v not in entered:
Graph.dfs(v, edges, may_enter, leave=leave)
return sort
# graph: D3 ⇾ H7
# ↑
# ┌──────── B1 ⇾ F5
# ↓ ↑ ↑
# J9 ⇽ E4 ⇽ A0 ⇾ C2 ⇾ I8
# ↓
# G6
edges = [[]] * 10
edges[0] = [Edge(1), Edge(2), Edge(4), Edge(6)] # 1, 2, 4, and 6
edges[1] = [Edge(3), Edge(5), Edge(9)] # 3, 5, and 9
edges[2] = [Edge(5), Edge(8)] # 5, 8
edges[3] = [Edge(7)] # 7
edges[4] = [Edge(9)] # 9
assert [7, 3, 5, 9, 1, 8, 2, 4, 6, 0] == topological_sort(edges)Find all vertices
```
def find_all(source, edges):
reached = set()
cross = lambda e
Solution
-
There's no documentation! How am I supposed to use any of this code? There ought to be docstrings for all the classes and functions.
-
The class
-
Your code requires a graph to be represented as a list giving the neighbours of each vertex, where the neighbours are represented as a list of objects whose
This data structure is going to be rather awkward to use because (a) this make it difficult to attach extra data to vertices; (b) vertices don't normally come in a numerical order (think of cities in a travelling salesman problem, or people in a social network analysis); (c) the user's data structures probably don't have the necessary
This means that the user needs to figure out some way to assign a range of numbers to the vertices, and also to map every neighbour to an object with a
(I think that if you had tried to write user documentation as recommended in my point #1 above, then this problem would have become clearer to you.)
-
Writing:
is unclear: first, you're using the default value for a keyword argument as a way to allocate a local variable, and second,
-
Similarly, instead of
it would be clearer to write:
-
All your algorithms that use
With some refactoring you could avoid the need for the algorithms to define their own
-
-
The implementation of
would be perfectly adequate.
There's no documentation! How am I supposed to use any of this code? There ought to be docstrings for all the classes and functions.
-
The class
Graph contains a single function decorated with @staticmethod. So why bother with a class at all? Python is not Java: not everything needs to be a class.-
Your code requires a graph to be represented as a list giving the neighbours of each vertex, where the neighbours are represented as a list of objects whose
y attribute gives the index of the neighbouring vertex in the list.This data structure is going to be rather awkward to use because (a) this make it difficult to attach extra data to vertices; (b) vertices don't normally come in a numerical order (think of cities in a travelling salesman problem, or people in a social network analysis); (c) the user's data structures probably don't have the necessary
y attribute.This means that the user needs to figure out some way to assign a range of numbers to the vertices, and also to map every neighbour to an object with a
y attribute. A well-designed interface ought to make things simpler for its users, rather than more complicated.(I think that if you had tried to write user documentation as recommended in my point #1 above, then this problem would have become clearer to you.)
-
Writing:
may_enter = lambda v, entered=set(): v not in entered and not entered.add(v)is unclear: first, you're using the default value for a keyword argument as a way to allocate a local variable, and second,
set.add is not documented as returning anything, so why are you testing it as if it is a Boolean? You're relying on the obscure implementation detail that set.add happens to return None, which converts to Boolean as False. It would be clearer (and only a little longer if you count tokens rather than lines) if you kept it simple:entered = set()
def may_enter(v):
if v in entered:
return False
entered.add(v)
return True-
Similarly, instead of
cross is not None and cross(e, v)it would be clearer to write:
if cross is not None:
cross(e, v)-
All your algorithms that use
Graph.dfs supply the same function for may_enter. This suggests to me that you could refactor this code to reduce duplication. The reason why these algorithms define their own may_enter functions (rather than sharing one implementation) is so that they can access the entered set (topological_sort needs to access it from the main loop; has_cycle needs to access it from its cross function).With some refactoring you could avoid the need for the algorithms to define their own
may_enter function (for topological_sort, pass the entered set as a parameter to Graph.dfs; for has_cycle, pass it as an argument to cross).-
Graph.dfs uses the Python stack to maintain its state. But the Python stack is fixed in size and so this will fail for large graphs. You should consider rewriting it to maintain its own stack.-
The implementation of
Edge.__repr__ seems very unclear, especially since this is just a convenience for debugging. A simple implementation like:return 'Edge(y={0.y!r}, weight={0.weight!r})'.format(self)would be perfectly adequate.
Code Snippets
may_enter = lambda v, entered=set(): v not in entered and not entered.add(v)entered = set()
def may_enter(v):
if v in entered:
return False
entered.add(v)
return Truecross is not None and cross(e, v)if cross is not None:
cross(e, v)return 'Edge(y={0.y!r}, weight={0.weight!r})'.format(self)Context
StackExchange Code Review Q#61107, answer score: 2
Revisions (0)
No revisions yet.