snippetcsharpMinor
How to improve this algorithm for building a graphs closures
Viewed 0 times
thisgraphsbuildingimprovealgorithmforhowclosures
Problem
We have a large directed acyclic graph and a requirement to filter it by allowing a user to select certain nodes and then we remove all nodes that are not selected nodes, ancestors of those nodes, or descendants of those nodes.
As you can see in the above image nodes 6 and 9 have been selected and so nodes which do not route to those nodes are removed.
We are trying to build the closures (transitive closures?) of the graph so we can quickly filter it. We used the below code which was very slow. Profiling it shows around half of the time is spent in garbage collection.
```
public static class NodeClosureAnalyser
{
public static NodeClosures Analyse(Node root)
{
var results = new NodeClosures
{
AncestorClosures = new Dictionary>(),
DescendantClosures = new Dictionary>()
};
//walk the tree and build the closures lists
var stack = new Stack();
stack.Push(root);
var route = new Stack();
while (stack.Count != 0)
{
var current = stack.Pop();
//keep track of the route to root
while (route.Any() && !route.Peek().Children.Any(n => n.Id == current.Id))
{
route.Pop();
}
route.Push(current);
AddNodeRelationships(results, current, route);
foreach (var child in current.Children)
{
stack.Push(child);
}
}
return results;
}
private static void AddNodeRelationships(NodeClosures results, Node current, IEnumerable route)
{
foreach (var node in route)
{
results.AncestorClosures.GetOrCreateValuesList(current.Id).Add(node.Id);
results.DescendantClosures.Get
As you can see in the above image nodes 6 and 9 have been selected and so nodes which do not route to those nodes are removed.
We are trying to build the closures (transitive closures?) of the graph so we can quickly filter it. We used the below code which was very slow. Profiling it shows around half of the time is spent in garbage collection.
```
public static class NodeClosureAnalyser
{
public static NodeClosures Analyse(Node root)
{
var results = new NodeClosures
{
AncestorClosures = new Dictionary>(),
DescendantClosures = new Dictionary>()
};
//walk the tree and build the closures lists
var stack = new Stack();
stack.Push(root);
var route = new Stack();
while (stack.Count != 0)
{
var current = stack.Pop();
//keep track of the route to root
while (route.Any() && !route.Peek().Children.Any(n => n.Id == current.Id))
{
route.Pop();
}
route.Push(current);
AddNodeRelationships(results, current, route);
foreach (var child in current.Children)
{
stack.Push(child);
}
}
return results;
}
private static void AddNodeRelationships(NodeClosures results, Node current, IEnumerable route)
{
foreach (var node in route)
{
results.AncestorClosures.GetOrCreateValuesList(current.Id).Add(node.Id);
results.DescendantClosures.Get
Solution
Since your nodes are double-linked (i.e. each has a list of all parents and children), creating the list of closures for all children of a certain node (especially the root node) seems like an overkill, especially if you're doing that whenever you start to filter. If you have a doubly linked graph, descendant and ancestor lists seem pretty cheap to get for each node, and for a large graph you will have to waste an awful lot of memory to keep them around.
If I am not mistaken, for each "chosen" node, you need to mark that node as safe, mark its children as safe, then traverse back to the root and mark all nodes you encounter as safe, and finally remove all non marked nodes. That should be much less work than you're doing now, if I didn't misunderstand the problem completely.
Some minor issues:
-
To merge two
-
Btw, by looking at your code, it seems that
I.e. wouldn't this work:
(
If I am not mistaken, for each "chosen" node, you need to mark that node as safe, mark its children as safe, then traverse back to the root and mark all nodes you encounter as safe, and finally remove all non marked nodes. That should be much less work than you're doing now, if I didn't misunderstand the problem completely.
Some minor issues:
-
To merge two
HashSets, it's much cheaper to use HashSet.UnionWith, than to create a new HastSet each time (like you're doing in your Filter method).-
Btw, by looking at your code, it seems that
Node.AnyDescendant should iterate through Children, not Parents.I.e. wouldn't this work:
public static Node Filter(Node startNode, IEnumerable includedNodes)
{
var explicitlyIncludedNodes = startNode
.Descendants()
.Where(n => includedNodes.Contains(n.Id));
var nodesToKeep = new HashSet();
foreach (var node in explicitlyIncludedNodes)
{
nodesToKeep.UnionWith(node.Ancestors().Select(n => n.Id));
nodesToKeep.UnionWith(node.Descendants().Select(n => n.Id));
}
return BreadthFirstDeletion(startNode, nodesToKeep, includedNodes.ToArray());
}Ancestors() should simply return all ancestors all the way to the root:public IEnumerable Ancestors()
{
var stack = new Stack();
stack.Push(this);
while (stack.Count != 0)
{
var current = stack.Pop();
yield return current;
foreach (var child in current.Parents)
stack.Push(child);
}
}(
Descendants the same way but for Children instead of Parents)Code Snippets
public static Node Filter(Node startNode, IEnumerable<int> includedNodes)
{
var explicitlyIncludedNodes = startNode
.Descendants()
.Where(n => includedNodes.Contains(n.Id));
var nodesToKeep = new HashSet<int>();
foreach (var node in explicitlyIncludedNodes)
{
nodesToKeep.UnionWith(node.Ancestors().Select(n => n.Id));
nodesToKeep.UnionWith(node.Descendants().Select(n => n.Id));
}
return BreadthFirstDeletion(startNode, nodesToKeep, includedNodes.ToArray());
}public IEnumerable<Node> Ancestors()
{
var stack = new Stack<Node>();
stack.Push(this);
while (stack.Count != 0)
{
var current = stack.Pop();
yield return current;
foreach (var child in current.Parents)
stack.Push(child);
}
}Context
StackExchange Code Review Q#36198, answer score: 2
Revisions (0)
No revisions yet.