Divide list into batches
Problem
Three questions: is there a more performant way; is there a more succinct (strike that) a way of expressing this where, if you read just the body, it is immediately apparent what the algorithm does; and should I be returning an IEnumerable of an IEnumerable? I mean, what would be the point over an IEnumerable of IList?
public static IEnumerable<IEnumerable<T>> IntoBatches<T>(this IEnumerable<T> list, int size)
{
    if (size < 1)
    {
        yield return list;
    }
    else
    {
        var count = 0;
        var batch = new List<T>();
        foreach (var item in list)
        {
            batch.Add(item);
            if (size == ++count)
            {
                yield return batch;
                batch.Clear();
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}
Solution
Bug
You have two big bugs in your method. The first is that you never reset the count variable to 0, and the second is that you are yielding the same List instance every time.
If I call your method with a List containing 10000 ints and call ToList() on the result, I get 2 Lists, both containing 9997 ints (they are the same instance, which was cleared after the first yield and then refilled).
Although this is easy to fix like so
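public static IEnumerable<IEnumerable<T>> IntoBatches<T>(this IEnumerable<T> list, int size)
{
    if (size < 1)
    {
        yield return list;
    }
    else
    {
        var count = 0;
        var batch = new List<T>();
        foreach (var item in list)
        {
            batch.Add(item);
            if (size == ++count)
            {
                yield return batch;
                // Start a fresh list so batches already yielded are not mutated.
                batch = new List<T>();
                count = 0;
            }
        }
        if (batch.Count > 0) yield return batch;
    }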
}
For a List of 10000 items, this solution takes, by batch size:
3: 0.506 ms
13: 0.505 ms
113: 0.505 ms
whereas an array-based solution like this (taken from here)
public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IEnumerable<T> source, int size)
{
    using (var iter = source.GetEnumerator())
    {
        while (iter.MoveNext())
        {
            // Always allocates a full-size array, even for a final, shorter chunk.
            var chunk = new T[size];
            chunk[0] = iter.Current;
            for (int i = 1; i < size && iter.MoveNext(); i++)
            {
                chunk[i] = iter.Current;
            }
            yield return chunk;
        }
    }
}
takes
3: 0.270 ms
13: 0.270 ms
113: 0.270 ms
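The answer does not say how these timings were taken. As a rough sketch of one way to measure them, assuming a `Stopwatch`, a warm-up run, and an average over several repetitions: `BatchTiming` and `Measure` are hypothetical names introduced here for illustration, and `IntoBatches` is the fixed list-based version from above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class BatchTiming
{
    // Fixed list-based version from this answer.
    public static IEnumerable<IEnumerable<T>> IntoBatches<T>(this IEnumerable<T> list, int size)
    {
        if (size < 1)
        {
            yield return list;
        }
        else
        {
            var count = 0;
            var batch = new List<T>();
            foreach (var item in list)
            {
                batch.Add(item);
                if (size == ++count)
                {
                    yield return batch;
                    batch = new List<T>();
                    count = 0;
                }
            }
            if (batch.Count > 0) yield return batch;
        }
    }

    // Hypothetical helper (not part of the original answer): fully enumerates
    // the batches via ToList() and returns the batch count together with the
    // average elapsed time per run in milliseconds.
    public static (int Batches, double Milliseconds) Measure<T>(
        Func<IEnumerable<IEnumerable<T>>> run, int repetitions = 100)
    {
        // Warm-up run, so that JIT compilation is not part of the measurement.
        var batches = run().ToList().Count;

        var watch = Stopwatch.StartNew();
        for (var i = 0; i < repetitions; i++)
        {
            batches = run().ToList().Count;
        }
        watch.Stop();

        return (batches, watch.Elapsed.TotalMilliseconds / repetitions);
    }
}
```

With `data` being a List of 10000 ints, `BatchTiming.Measure(() => data.IntoBatches(3))` reports 3334 batches plus the average time. The same call works for any other batching method with the same return type. Absolute numbers depend on machine and runtime, so they will not necessarily match the figures quoted here.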
Edit
That Chunkify() method unfortunately has a bug: for a passed-in IEnumerable whose item count isn't divisible by the chunk size, it produces too many items.
E.g., passing in an int[] with values 1,2,3,4 and a size argument of 3 will produce 1,2,3,4,0,0.
Fixed version
public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IEnumerable<T> source, int size)
{
    int count = 0;
    using (var iter = source.GetEnumerator())
    {
        while (iter.MoveNext())
        {
            var chunk = new T[size];
            count = 1;
            chunk[0] = iter.Current;
            for (int i = 1; i < size && iter.MoveNext(); i++)
            {
                chunk[i] = iter.Current;
                count++;
            }
            // Trim the final chunk if the source ran out before it was full.
            if (count < size)
            {
                Array.Resize(ref chunk, count);
            }
            yield return chunk;
        }
    }
}
Code Snippets
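As a quick check of the fixed Chunkify() against the example from the edit (values 1,2,3,4 with a size of 3), here is a minimal sketch. `ChunkifyCheck` and `Render` are hypothetical names added only for illustration; the Chunkify() body is the fixed version from this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkifyCheck
{
    // Fixed Chunkify() as given in this answer: the last chunk is resized
    // to the number of items actually read from the source.
    public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IEnumerable<T> source, int size)
    {
        int count = 0;
        using (var iter = source.GetEnumerator())
        {
            while (iter.MoveNext())
            {
                var chunk = new T[size];
                count = 1;
                chunk[0] = iter.Current;
                for (int i = 1; i < size && iter.MoveNext(); i++)
                {
                    chunk[i] = iter.Current;
                    count++;
                }
                if (count < size)
                {
                    Array.Resize(ref chunk, count);
                }
                yield return chunk;
            }
        }
    }

    // Hypothetical helper: joins the items of each chunk with "," and the
    // chunks themselves with "|", so the result is easy to compare.
    public static string Render<T>(IEnumerable<IEnumerable<T>> chunks)
    {
        return string.Join("|", chunks.Select(c => string.Join(",", c)));
    }
}
```

`ChunkifyCheck.Render(new[] { 1, 2, 3, 4 }.Chunkify(3))` yields "1,2,3|4", with no trailing zeros. Note that this sketch, like the original, does not validate `size`; a value below 1 will throw when the array is allocated or first written to.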
public static IEnumerable<IEnumerable<T>> IntoBatches<T>(this IEnumerable<T> list, int size)
{
    if (size < 1)
    {
        yield return list;
    }
    else
    {
        var count = 0;
        var batch = new List<T>();
        foreach (var item in list)
        {
            batch.Add(item);
            if (size == ++count)
            {
                yield return batch;
                batch = new List<T>();
                count = 0;
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}

public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IEnumerable<T> source, int size)
{
    using (var iter = source.GetEnumerator())
    {
        while (iter.MoveNext())
        {
            var chunk = new T[size];
            chunk[0] = iter.Current;
            for (int i = 1; i < size && iter.MoveNext(); i++)
            {
                chunk[i] = iter.Current;
            }
            yield return chunk;
        }
    }
}

public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IEnumerable<T> source, int size)
{
    int count = 0;
    using (var iter = source.GetEnumerator())
    {
        while (iter.MoveNext())
        {
            var chunk = new T[size];
            count = 1;
            chunk[0] = iter.Current;
            for (int i = 1; i < size && iter.MoveNext(); i++)
            {
                chunk[i] = iter.Current;
                count++;
            }
            if (count < size)
            {
                Array.Resize(ref chunk, count);
            }
            yield return chunk;
        }
    }
}
Context
StackExchange Code Review Q#122471, answer score: 7