HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Generating a non binary tree structure

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
nongeneratingbinarystructuretree

Problem

I am writing a method to create a folder tree. I am getting the data from a database and creating the list of nodes. Is there a way to get the method to execute quicker? Perhaps using parallel task library?

Each node looks like this:

public class Node : INode
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string DisplayText { get; set; }
    public bool DefaultState { get; set; }
    public List ChildItems { get; set; }
}


The fastest implementation I have been able to come up with has been just two for loops using recursion:

/// 
/// For loop incrementing: 00:00:28.2218157
/// 
/// 
/// 
private void GetChildFoldersIncrement(IRepository folderRepo, List folders)
{
    for (var i = 0; i  f.ParentFolderID == folderId).ToList();
        if (childFolders.Count == 0) continue;
        for (var j = 0; j ()
            };

            if (folders[i].ChildItems == null) folders[i].ChildItems = new List();
            folders[i].ChildItems.Add(node);
        }

        if (folders[i].ChildItems.Count > 0) GetChildFoldersIncrement(folderRepo, folders[i].ChildItems);
    }
}


I've attempted to optimize the loops by decrementing them but it was actually slightly slower:

```
///
/// For loop decrement: 00:00:28.3496396
///
///
///
private void GetChildFoldersDecrement(IRepository folderRepo, List folders)
{
if (folders == null) return;
var maxFolders = folders.Count();
for (var i = maxFolders - 1; i >= 0; --i)
{
var folderId = folders[i].Id;
var childFolders = folderRepo.FindByExp(f => f.ParentFolderID == folderId).ToList();
var maxChildFolders = childFolders.Count();
for (var j = maxChildFolders - 1; j >= 0; --j)
{
var node = new Node
{
Id = childFolders[j].FolderID,
ParentId = childFolders[j].ParentFolderID,
DisplayText = childFolders[j].FolderName,
DefaultState = false,

Solution

Let's start from your code. The recursion can be implemented in a much clearer way if the method returns the collection of all child folders, instead of updating parent folder's ChildItems collection. In this case it could look as simple as

private IEnumerable GetChildFolderNodes(IRepository folderRepo, int folderId)
{
    var childFolders = folderRepo.FindByExp(f => f.ParentFolderID == folderId).ToArray();

    return childFolders.Select(folder => new Node
    {
        Id = folder.FolderID,
        ParentId = folder.ParentFolderID,
        DisplayText = folder.FolderName,
        DefaultState = false,
        ChildItems = GetChildFolderNodes(folderRepo, folder.FolderID).ToList()
    });
}


There are a number of articles describing why a IRepository is quite a bad idea when defined on top of ORM framework. Main reasons are leaked abstractions and abstraction on top of abstraction. And your example shows why - any optimization requires the knowledge of underlying implementation (Entity Framework).

Running this process in multiple threads may help, but will put additional load on the database server and thus is not scalable. Note that Entity Framework's DbContext is not thread-safe, so you cannot share the IRepository in multiple threads unless you spin up a new DbContext per request there (which I wouldn't recommend doing).

Assuming that you're using Entity Framework 6.0, FindByExp method returns IQueryable, you can try asynchronous implementation of the data retrieval:

private async Task> GetChildFolderNodesAsync(Func> folderRepoFactory, int folderId)
{
    Folder[] childFolders;
    using (var folderRepo = folderRepoFactory())
    {
        childFolders = await folderRepo.FindByExp(f => f.ParentFolderID == folderId).ToArrayAsync();
    }

    return (await Task.WhenAll(childFolders.Select(async folder => new Node
    {
        Id = folder.FolderID,
        ParentId = folder.ParentFolderID,
        DisplayText = folder.FolderName,
        DefaultState = false,
        ChildItems = await GetChildFolderNodesAsync(folderRepoFactory, folder.FolderID)
    }))).ToList();
}


Note that in this case you would need to spin up a new DbContext per request (which I expressed as a repository factory).

And finally, assuming that this is a critical path in your application, the best way to optimize the code you're running is to reduce the number of round trips to database server, instead of trying to squeeze microseconds on a .NET layer.

For example, if your underlying DB engine is SQL Server, you can leverage the recursive CTE to load full folder hierarchy in one go. It can be optimized even further if CTE returns the depth level for folders, but I'll leave it for you as an exercise:

private List GetChildFolderNodesAsync(DbSet folderSet, int folderId)
{
    var allFolders = folderSet.SqlQuery(@"WITH cte (FolderID, ParentFolderID, FolderName) AS
    (
        SELECT  FolderID, ParentFolderID, FolderName
        FROM    Folders
        WHERE   ParentFolderID = @p0
    UNION ALL
        SELECT  F.FolderID, F.ParentFolderID, F.FolderName
        FROM    Folders F
        JOIN    cte
        ON      F.ParentFolderID = cte.FolderID
    )
    SELECT  *
    FROM cte", folderId).AsNoTracking();

    //Converting all folders into nodes, but skipping the ChildItems property for now...
    var nodesByParent = allFolders
        .GroupBy(folder => folder.ParentFolderID)
        .ToDictionary(folders => folders.Key, folders => folders.Select(folder => new Node
        {
            Id = folder.FolderID,
            ParentId = folder.ParentFolderID,
            DisplayText = folder.FolderName,
            DefaultState = false,
        }).ToList());

    List childNodes;

    //Now that we have all nodes grouped by parent, we can look up and assign ChildItems for all nodes
    foreach (var node in nodesByParent.Values.SelectMany(nodes => nodes))
    {
        node.ChildItems = nodesByParent.TryGetValue(node.Id, out childNodes)
            ? childNodes
            : new List();
    }

    return nodesByParent.TryGetValue(folderId, out childNodes)
        ? childNodes
        : new List();
}

Code Snippets

private IEnumerable<Node> GetChildFolderNodes(IRepository<Folder> folderRepo, int folderId)
{
    var childFolders = folderRepo.FindByExp(f => f.ParentFolderID == folderId).ToArray();

    return childFolders.Select(folder => new Node
    {
        Id = folder.FolderID,
        ParentId = folder.ParentFolderID,
        DisplayText = folder.FolderName,
        DefaultState = false,
        ChildItems = GetChildFolderNodes(folderRepo, folder.FolderID).ToList()
    });
}
private async Task<List<Node>> GetChildFolderNodesAsync(Func<IRepository<Folder>> folderRepoFactory, int folderId)
{
    Folder[] childFolders;
    using (var folderRepo = folderRepoFactory())
    {
        childFolders = await folderRepo.FindByExp(f => f.ParentFolderID == folderId).ToArrayAsync();
    }

    return (await Task.WhenAll(childFolders.Select(async folder => new Node
    {
        Id = folder.FolderID,
        ParentId = folder.ParentFolderID,
        DisplayText = folder.FolderName,
        DefaultState = false,
        ChildItems = await GetChildFolderNodesAsync(folderRepoFactory, folder.FolderID)
    }))).ToList();
}
private List<Node> GetChildFolderNodesAsync(DbSet<Folder> folderSet, int folderId)
{
    var allFolders = folderSet.SqlQuery(@"WITH cte (FolderID, ParentFolderID, FolderName) AS
    (
        SELECT  FolderID, ParentFolderID, FolderName
        FROM    Folders
        WHERE   ParentFolderID = @p0
    UNION ALL
        SELECT  F.FolderID, F.ParentFolderID, F.FolderName
        FROM    Folders F
        JOIN    cte
        ON      F.ParentFolderID = cte.FolderID
    )
    SELECT  *
    FROM cte", folderId).AsNoTracking();

    //Converting all folders into nodes, but skipping the ChildItems property for now...
    var nodesByParent = allFolders
        .GroupBy(folder => folder.ParentFolderID)
        .ToDictionary(folders => folders.Key, folders => folders.Select(folder => new Node
        {
            Id = folder.FolderID,
            ParentId = folder.ParentFolderID,
            DisplayText = folder.FolderName,
            DefaultState = false,
        }).ToList());

    List<Node> childNodes;

    //Now that we have all nodes grouped by parent, we can look up and assign ChildItems for all nodes
    foreach (var node in nodesByParent.Values.SelectMany(nodes => nodes))
    {
        node.ChildItems = nodesByParent.TryGetValue(node.Id, out childNodes)
            ? childNodes
            : new List<Node>();
    }

    return nodesByParent.TryGetValue(folderId, out childNodes)
        ? childNodes
        : new List<Node>();
}

Context

StackExchange Code Review Q#82504, answer score: 2

Revisions (0)

No revisions yet.