HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Grouping Similar Items into buckets

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
bucketsgroupingintoitemssimilar

Problem

I've written some code to group a list of items into arbitrarily sized buckets. If the items are all the same and the and the count is a multiple of the bucket size a single bucket is returned, if the items are different then they are grouped into buckets of a given size.

For example, with a bucket size of 3:

1,2         => [1,2]
1,2,3,4,5   => [1,2,3][4,5]
1,1,1,1,1,1 => [1,1,1,1,1,1]
1,1,1,1,1   => [1,1,1][1,1]


To achieve this I've written the following .Net extension method:

public static IEnumerable> GroupBySize(this IEnumerable source, int groupSize, Func comparer)
{
    var result = new List>();

    if (source.IsNullOrEmpty())
        return result;

    List currentGroup = null;

    for (var i = groupSize; i  comparer(x, possibleGroup.First())))
        {
            if (currentGroup == null)
            {
                currentGroup = new List();
                result.Add(currentGroup);
            }

            currentGroup.AddRange(possibleGroup);
        }
        else
        {
            currentGroup = new List();
            currentGroup.AddRange(possibleGroup);
            result.Add(currentGroup);
            currentGroup = null;
        }
    }

    var remainingItems = source.Count() % groupSize;
    if (remainingItems > 0)
    {
        result.Add(source.Slice(source.Count() - remainingItems, remainingItems).ToList());
    }

    return result;
}

public static IEnumerable Slice(this IEnumerable source, int start, int count)
{
    return source.Skip(start).Take(count);
}

public static bool IsNullOrEmpty(this IEnumerable items)
{
    // Taken from :http://haacked.com/archive/2010/06/10/checking-for-empty-enumerations.aspx
    return items == null || !items.Any();
}


The code passes the tests I've written, but it all (including the method name) feels a bit clunky.

How could this be improved?

Solution

I'm not sure you could make this code less clunky, it seems to me the code is reasonably simple, considering what it has to do.

But I can see other problems in your code:

  • The comparer parameter is probably not necessary in most cases, you can use EqualityComparer.Default for this. But optional comparer makes sense to me.



  • LINQ extension methods are usually lazy and that's what I would expect from this method too. For example, if I do something like collection.GroupBySize(3).First(), it's not necessary to create the whole (possibly very long) result, only to discard most of it. To do this, you can use yield return.



  • You iterate the source collection (using LINQ methods) a lot. This can have terrible performance, especially if the source collection is something like a result of a LINQ database query, because it queries the database many times, which is not necessary. To fix this, you should write your method around a single foreach (var item in source) and don't use methods like Count(), First(), or Skip().

Context

StackExchange Code Review Q#10926, answer score: 3

Revisions (0)

No revisions yet.