HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Split IEnumerable by predicate

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
splitpredicateienumerable

Problem

In order to divide an IEnumerable in C# by a predicate, I implemented the following (in a static class):

public static IEnumerable> SplitBeforeIf(this IEnumerable input, Func pred)
{
    //This exists to store the predicate satisfier, which I kept as
    //the first T in the inner IEnumerable (except, maybe the first one).
    T predSatis = default(T);
    while(input.Any())
    {
        var temp = input.TakeWhile(v => !pred(v));
        //I was told that the below is the best test for default-ness.
        //(I assume that pred(default(T)) is always false.)
        if(predSatis == null || predSatis.Equals(default(T)))
            yield return temp;
        else
            yield return (new List{predSatis}).Concat(temp);

        input = input.SkipWhile(v => !pred(v));
        predSatis = input.FirstOrDefault();
        input = input.Skip(1);
    }
    if(predSatis != null && !predSatis.Equals(default(T)))
        yield return new List{predSatis};
}


This returns as follows:

  • new List{1,2,3,4,5}.SplitBeforeIf(v => v % 2 == 0) gives {{1}, {2,3}, {4,5}}



  • new List{6,1,2,3,4,5}.SplitBeforeIf(v => v % 2 == 0) gives {{}, {6, 1}, {2,3}, {4, 5}}



  • new List{1,2,3,4,5,6}.SplitBeforeIf(v => v % 2 == 0) gives {{1}, {2,3}, {4,5}, {6}}.



I have 3 questions (besides any other comments on the code):

  • I see that, despite my best efforts, this is \$O(n^2)\$. Why? Can this be improved for a generic IEnumerable?



  • This does not support a stateful predicate or a stateful IEnumerable input. Can this be remedied?



  • This looks awfully like Clojure's partition-by. How does Clojure avoid the first issue (it may also have the second, but functional programming dislikes state anyway)?



Being a little pedantic about these things, I also would prefer solutions that don't accumulate values into another data structure (just to see if full laziness is possible).

Solution

So I thought I'd try this out with a test enumerable.

public class NumberValues : IEnumerable
{
    public NumberValues(int startValue, int endValue)
    {
        _endValue = endValue;
        _startValue = startValue;
        counter = 0;
    }
    public NumberValues(int endValue) : this(0, 10) { }

    public int counter { get; set; }

    public IEnumerator GetEnumerator()
    {

        var iterator = _startValue;
        while (iterator < _endValue)
        {
            Trace.Write(iterator + ",");
            iterator++;
            counter++;
            yield return iterator;
        }

    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}


followed closely by

[TestCase(0,5)]
    [TestCase(1,5)]
    [TestCase(0,6)]
    [TestCase(1,6)]
    [TestCase(0,10)]
    [TestCase(1,10)]
    public void TestSkipBeforeIf(int start, int end)
    {
        var numberValue = new NumberValues(start,end);
        numberValue.SplitBeforeIf(i => i%2 == 0).ToList();
    }


which results in the following output.

0,5 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,  - 25
1,5 > 1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,1,2,3,4,  - 19
0,6 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,  - 27
1,6 > 1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,5,1,2,3,4,5,  - 21
0,10 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,  - 65
1,10 >     1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,5,1,2,3,4,5,6,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,1,2,3,,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,  - 55

0,20 > 230
0,40 > 860
0,100 > 5150


But why?

while (input.Any())
        {
            var temp = input.TakeWhile(v => !pred(v));
            if (predSatis == null || predSatis.Equals(default(T)))
                yield return temp;
            else
                yield return (new List { predSatis }).Concat(temp);

            input = input.SkipWhile(v => !pred(v));
            predSatis = input.FirstOrDefault();
            input = input.Skip(1);
        }


if you evaluate your 2 lines

input = input.SkipWhile(v => !pred(v));
predSatis = input.FirstOrDefault();


over iterations of your while loop, it is necessary to expand the input variable for each iteration.

input.SkipWhile(pred).FirstOrDefault
input.SkipWhile(pred).SkipWhile(pred).FirstOrDefault
input.SkipWhile(pred).SkipWhile(pred).SkipWhile(pred).FirstOrDefault
input.SkipWhile(pred).SkipWhile(pred).SkipWhile(pred).SkipWhile(pred).FirstOrDefault


having seen this, we now consider the evaluation of input.SkipWhile(!pred)

so over 100 iterations

pred = x => x % 2 == 0 : 5150
pred = x => x % 3 == 0 : 3600
pred = x => x % 4 == 0 : 2625
pred = x => x % 5 == 0 : 2120
pred = x => x % 6 == 0 : 1849
pred = x => x % 15 == 0 : 837
pred = x => x % 20 == 0 : 605
pred = x => x % 25 == 0 : 504
pred = x => x % 30 == 0 : 564
pred = x => x % 30 == 0 : 413


Your functions behaviour is therefore dependent not just on the size of your data, but on the form of your predicate.

For your own analysis, this is what I used for evaluating the predicate

[TestCase(1,5000,10)]
   public void TestSkipBeforeIf(int start, int end, int mod)
   {
        var numberValue = new NumberValues(start,end);
        numberValue.SplitBeforeIf(i => i%mod == 0).ToList();
   }


I'm not clear what you meant by stateful predicate or IEnumerable, however in terms of the IEnumerable I have maintained a count as state. Similarly I could maintain a count for the pred, or add some bizarre checking behaviour.

Func predStateful = (y,x) =>
{
    if (x == y.Check(x))
    {
        return false;
    }
    return x%mod == 0;
};
Func pred = x => predStateful(MyCheckingObject,x);
numberValue.SplitBeforeIf(pred).ToList();


Thanks for the interesting question.

EDIT: Modified Chunk function using a predicate.
original

public static IEnumerable> Chunk(
                    this IEnumerable values,
                    Func pred)
    {
        using (var enumerator = values.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                yield return GetChunk(enumerator, pred).ToList();
            }
        }
    }
    private static IEnumerable GetChunk(
                     IEnumerator enumerator,
                     Func pred)
    {
        do
        {
            yield return enumerator.Current;
        } while (!pred(enumerator.Current) && enumerator.MoveNext());
    }

Code Snippets

public class NumberValues : IEnumerable<int>
{
    public NumberValues(int startValue, int endValue)
    {
        _endValue = endValue;
        _startValue = startValue;
        counter = 0;
    }
    public NumberValues(int endValue) : this(0, 10) { }

    public int counter { get; set; }

    public IEnumerator<int> GetEnumerator()
    {

        var iterator = _startValue;
        while (iterator < _endValue)
        {
            Trace.Write(iterator + ",");
            iterator++;
            counter++;
            yield return iterator;
        }

    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
[TestCase(0,5)]
    [TestCase(1,5)]
    [TestCase(0,6)]
    [TestCase(1,6)]
    [TestCase(0,10)]
    [TestCase(1,10)]
    public void TestSkipBeforeIf(int start, int end)
    {
        var numberValue = new NumberValues(start,end);
        numberValue.SplitBeforeIf(i => i%2 == 0).ToList();
    }
0,5 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,  - 25
1,5 > 1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,1,2,3,4,  - 19
0,6 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,  - 27
1,6 > 1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,5,1,2,3,4,5,  - 21
0,10 > 0,0,1,0,1,2,0,1,2,3,0,1,2,3,4,0,1,2,3,4,5,0,1,2,3,4,5,6,0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,  - 65
1,10 >     1,1,1,2,1,2,3,1,2,3,4,1,2,3,4,5,1,2,3,4,5,6,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,1,2,3,,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,  - 55

0,20 > 230
0,40 > 860
0,100 > 5150
while (input.Any())
        {
            var temp = input.TakeWhile(v => !pred(v));
            if (predSatis == null || predSatis.Equals(default(T)))
                yield return temp;
            else
                yield return (new List<T> { predSatis }).Concat(temp);

            input = input.SkipWhile(v => !pred(v));
            predSatis = input.FirstOrDefault();
            input = input.Skip(1);
        }
input = input.SkipWhile(v => !pred(v));
predSatis = input.FirstOrDefault();

Context

StackExchange Code Review Q#134363, answer score: 8

Revisions (0)

No revisions yet.