HiveBrain v1.2.0
Get Started
← Back to all entries
snippetpythonMinor

Topological sort

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
sorttopologicalstackoverflow

Problem

The algorithm works in steps:

-
Produce the graph in the form of an adjacency list

e.g. this graph

0 - - 2
    \
      1 -- 3


produces this adjacency list

{0: [], 1: [0], 2: [0], 3: [1, 2]}


0 depends on nothing, 1 depends on 0 etc..

-
Iterate through the graph and find nodes that does not have any dependencies

Questions:

-
What should the correct running time be for a well implemented topological sort. I am seeing different opinions:

Wikipedia says: \$O(log^2(n)\$)

Geeksforgeeks says: \$O(V+E)\$

-
My implementation is running at \$O(V*E)\$. Because at worst, I will need to loop through the graph V times and each time I will need to check E items. How do I make my implementation into linear time.

```
def produce_graph(prerequisites):
adj = {}
for course in prerequisites:
if course[0] in adj:
# append prequisites
adj[course[0]].append(course[1])
else:
adj[course[0]] = [course[1]]

# ensure that prerequisites are also in the graph
if course[1] not in adj:
adj[course[1]] = []

return adj

def toposort(graph):
sorted_courses = []
while graph:

# we mark this as False
# In acyclic graph, we should be able to resolve at least
# one node in each cycle
acyclic = False
for node, predecessors in graph.items():
# here, we check whether this node has predecessors
# if a node has no predecessors, it is already resolved,
# we can jump straight to adding the node into sorted
# else, mark resolved as False
resolved = len(predecessors) == 0
for predecessor in predecessors:
# this node has predecessor that is not yet resolved
if predecessor in graph:
resolved = False
break
else:
# this particular predecessor is resolved
resolved =

Solution

Documentation

Please add documentation to tell what the function is supposed to do and what input it is expecting.

Avoid repetition

In produce_graph, you repeat course[0] many times which smells like something wrong.

It can be rewritten :

def produce_graph(prerequisites):
    adj = {}
    for course in prerequisites:
        first, second = course[0], course[1]
        if first in adj:
            # append prequisites
            adj[first].append(second)
        else:
            adj[first] = [second]

    # ensure that prerequisites are also in the graph
    if second not in adj:
        adj[second] = []

    return adj


Using the right tool

What you are doing in produce_graph can be easily achieved using the set_default function :

adj = {}
for course in prerequisites:
    first, second = course[0], course[1]
    adj.setdefault(first, []).append(second)


(One might also suggest using defaultdict, I'll let you pick your favorite).

Using the right tool is also choosing the right data structure. It would probably make sense for produce_graph to take a list of tuples. Also, if you know that each elements contains 2 elements, you can use tuple unpacking like this :

for course in prerequisites:
    first, second = course


or

for first, second in prerequisites:
    adj.setdefault(first, []).append(second)


Avoid premature optimisation/keep it simple

Regarding :

resolved = len(predecessors) == 0
        for predecessor in predecessors:
            # this node has predecessor that is not yet resolved
            if predecessor in graph:
                resolved = False
                break
            else:
                # this particular predecessor is resolved
                resolved = True


At the beginning, you iterate resolved to True if predecessors is True, False otherwise. I think this does the same as :

resolved = True
        for predecessor in predecessors:
            # this node has predecessor that is not yet resolved
            if predecessor in graph:
                resolved = False
                break


As resolved will be False if and only if an element verifies the property in graph.

Also, this can be rewritten using builtin all/any :

if all(p not in graph for p in predecessors):

Code Snippets

def produce_graph(prerequisites):
    adj = {}
    for course in prerequisites:
        first, second = course[0], course[1]
        if first in adj:
            # append prequisites
            adj[first].append(second)
        else:
            adj[first] = [second]

    # ensure that prerequisites are also in the graph
    if second not in adj:
        adj[second] = []

    return adj
adj = {}
for course in prerequisites:
    first, second = course[0], course[1]
    adj.setdefault(first, []).append(second)
for course in prerequisites:
    first, second = course
for first, second in prerequisites:
    adj.setdefault(first, []).append(second)
resolved = len(predecessors) == 0
        for predecessor in predecessors:
            # this node has predecessor that is not yet resolved
            if predecessor in graph:
                resolved = False
                break
            else:
                # this particular predecessor is resolved
                resolved = True

Context

StackExchange Code Review Q#94499, answer score: 2

Revisions (0)

No revisions yet.