HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Topological sorting colored tree

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
sortingtopologicalcoloredtree

Problem

EDIT: The most general case I need is not a tree but any Directed Acyclic Graph.

I have a directed acyclic graph.

I need to sort it in a list so that in the list every node comes after any node it can walk to in the graph.

So far so good, just use a topological sorting algorithm.

However, I have an additional constraint: Each node in the graph is colored, and I need to find a sorting so that the list is as cohesive as possible. In other words, if you walk the list, you should see a change in colors as few times as possible.

Other thoughts:

-
I need an efficient algorithm. No iterating through all possible sortings.

-
The set of colors is typically much smaller than the set of nodes in the graph.

-
If no efficient exact algorithm is possible, I would be happy with a heuristic one that is likely to produce good results.

-
There is some information that one could potentially use for the heuristic: The colors are also partially ordered, and this ordering of the colors is correlated to the ordering of the nodes. I.e. a node is likely (but not guaranteed) to have a color that is "smaller than" the color of its parent node.

-
You can think of the problem this way: I have a set of jobs, and each job can depend on the output of several other jobs (if x depends on y there is an arrow from x to y). There is an overhead associated with setting up the data needed for each job, but this overhead is significantly reduced if the previous job had the same "color". The whole set of jobs is typically executed many thousand times, so if one can find an not too expensive way to improve the order of the jobs, that will pay off.

So far I have come up with this algorithm which works well most times, but not always:

first pass:

Define the order "<=" so that x <= y if y can walk to x in the graph.

find any topological sort of the nodes and put it in the list L
(L is sorted from smallest to largest w.r.t the partial ordering "<=" )

```
W = []
for x in L:
find the earlie

Solution

Suppose you know the list of colors that appear in contiguous parts of an optimal ordering. Then, you can easily find this optimal ordering: simply put as many nodes of the current color in your list as the parent-child order allows and proceed with the next color in your list. This greedy strategy is optimal, because once you have started a color, you will not 'regret' taking a node of the same color later as it adds no cost and only gives more options. If you put all 'free' nodes that already have their parents processed in separate lists, this can be done in $O(n)$, where $n$ is the number of nodes in your forest.

Of course, we do not know this list, so we should figure out how to find it. The number of color lists you would have to check in an uninformed search is $c(c-1)^{k}$, where $c$ is the number of colors you have and $k$ the minimal number of color switches in the final order. If you're lucky, $c$ and $k$ are sufficiently small for a brute-force search and we are done.

However, this will quickly become unfeasible, so I doubt you can afford to spend the $O(c(c-1)^k n)$ time required to do a brute-force search$^1$. Finding an optimal list 'looks hard'. I have no formal proof of e.g. NP-hardness, but it is clear that decisions made early on depend on the structure of the entire forest, so the most natural approach is to try to create subdivisions of by 'cutting' the forest to get subproblems and use e.g. dynamic programming. However, since the nodes can be ordered in many ways, I do not see how to represent these cuts with only polynomially many states. Another hint is that you can see this as a special case of the scheduling problem with partial order requirements.

So, since I think it likely we cannot find an efficient exact solution, you should try an heuristic approach or can perhaps even find an approximation algorithm. I will finish with some observations that may help you in developing such an approach:

-
If we decide on the last color in list first and build the ordering backwards, observe that we can only choose nodes where all descendants have the same color: if not, there is a node of another color that should be later in the list. Since we would select all children of the same color as a parent if we had chosen a front to back order in an optimal ordering, we should select only subtrees of the same color where the root has no parent of the same color. It is possible that for some colors, there are no such subtrees, which means that we have less options for the final color.

-
If there is a color for which you can select all remaining nodes, you should select that color. This may occur quite often if the correlation between the color ordering and node ordering is high.

-
One idea for an heuristic is to select the color that is 'blocking' the most nodes of other colors: you can look at the total number of 'chains' of nodes of the same color that are accessible after removing the color or at the number for a specific color you would like to remove first or the maximum number over all colors.

$^1$: You could be slightly more clever and do recursive backtracking where you can prune some searches, but this has the same worst case running time.

Context

StackExchange Computer Science Q#103603, answer score: 4

Revisions (0)

No revisions yet.