HiveBrain v1.2.0
Get Started
← Back to all entries
principleswiftMinor

Recursive flattening of Swift sequences - an overly complicated approach

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
swiftoverlyapproachcomplicatedrecursivesequencesflattening

Problem

I recently read and answered Martin R's Recursive flattening of Swift sequences and continued to play around with the code until I arrived at something that was both pretty cool and possibly an abomination. So I figured I'd come back to CR with it and ask others' opinions.

If you want a full context, you should refer to Martin's question, but I'll reiterate the essential parts here:

The function seqentialFlatten(_:_,children:_), takes a sequence and a function that maps an element of that sequence into a sequence of the same type, and lazily returns an AnySequence with all elements that can be generated by repeatedly mapping elements into sequences. One use case of this is getting a list of all subviews of a view:

let someView : UIView = ...
let views = sequentialFlatten([someView], children: { $0.subviews as! [UIView] })


What I've done in this approach is used a more functional style to implement this function, using map and reduce. To achieve this, I've implemented the + operator for Generators, made a generic struct to make it possible to have local lazy variables and wrapped a whole bunch of stuff into AnyGenerators.

And therein lies the problem, I feel perhaps this bit of code has introduce too many new concepts for not much gain. I'll first post the full code and then go through some of my concerns.

```
func +(lhs: G, rhs: G) -> AnyGenerator {
var leftGen = lhs
var leftEmpty = false
var rightGen = rhs
var rightEmpty = false

return AnyGenerator {
if !leftEmpty, let elem = leftGen.next() {
return elem
}
leftEmpty = true

if !rightEmpty, let elem = rightGen.next() {
return elem
}
rightEmpty = true
return nil
}
}

struct Lazy {
private let get: () -> E
lazy var val: E = self.get()

init(@autoclosure(escaping) _ getter: () -> E) {
get = getter
}
}

func sequentialFlatten(seq : S, children : S.Generator.Element -> S) -> A

Solution

That is an interesting approach. There are some simplifications and
performance improvements possible. To measure the performance I have used
the following simple code:

let d1 = NSDate()
let gen = sequentialFlatten([0], children: { n in n < 100_000 ? [2*n+1, 2*n+2] : []})
var c = 0
for _ in gen { c += 1 }
let d2 = NSDate()
print(c, d2.timeIntervalSinceDate(d1))


and with your sequentialFlatten method, that takes about 3.1 sec
on a 2.3 GHz MacBook Pro (compiled in Release mode).

First note that the explicit parameter list in the closure
which is passed to AnySequence { } is not needed, it can be
inferred:

func sequentialFlatten(seq : S, children : S.Generator.Element -> S) -> AnySequence {
    return AnySequence {
        seq.map {
            let firstGen = AnyGenerator([$0].generate())
            var secondGen = Lazy(sequentialFlatten(children($0), children: children).generate())
            return firstGen + AnyGenerator {
                secondGen.val.next()
            }
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}


Next, the Lazy wrapping seems unnecessary to me. Perhaps I am overlooking something, but it works just as well without.
This also makes one AnyGenerator wrapping obsolete:

func sequentialFlatten(seq : S, children : S.Generator.Element -> S) -> AnySequence {
    return AnySequence {
        seq.map {
            let firstGen = AnyGenerator([$0].generate())
            let secondGen = sequentialFlatten(children($0), children: children).generate()
            return firstGen + secondGen
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}


The execution time is now 2.5 sec.

Your more general

func +(lhs: G1, rhs: G2) -> AnyGenerator


operator can be used, but we have to assist the compiler with
an additional parameter list in the closure passed to seq.map { }:

func sequentialFlatten(seq : S, children : S.Generator.Element -> S) -> AnySequence {
    return AnySequence {
        seq.map { elem -> AnyGenerator in
            let firstGen = AnyGenerator([elem].generate())
            let secondGen = sequentialFlatten(children(elem), children: children).generate()
            return firstGen + secondGen
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}


This allows to use GeneratorOfOne() as the first generator
(and get rid of another AnyGenerator wrapping):

func sequentialFlatten(seq : S, children : S.Generator.Element -> S) -> AnySequence {
    return AnySequence {
        seq.map { elem -> AnyGenerator in
            let firstGen = GeneratorOfOne(elem)
            let secondGen = sequentialFlatten(children(elem), children: children).generate()
            return firstGen + secondGen
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}


which reduces the execution time to 2.1 sec.

Your approach

func +(lhs: G, rhs: G) -> AnyGenerator {
    var leftGen = lhs
    var rightGen = rhs

    return AnyGenerator {
        leftGen.next() ?? rightGen.next()
    }
}


is actually invalid, as the GeneratorType documentation states
for the next() method:

/// - Requires: `..., and no preceding call to `self.next()`
///   has returned `nil`.


So once the left generator is exhausted, we must not call its
next() method again. (Even if many generators tolerate that

and return nil again.)

However, using the ideas from https://stackoverflow.com/a/37665583/1187415
and http://ericasadun.com/2016/06/06/sneaky-swift-tricks-the-fake-boolean/
the + operator can be implemented as a sequence of
nil-coalescing operators, where the purpose of the middle closure
expression is to execute a statement if the previous expression "failed":

func +(lhs: G1, rhs: G2) -> AnyGenerator {
    var leftGen: G1? = lhs
    var rightGen = rhs

    return AnyGenerator {
        return leftGen?.next() 
            ?? { _ -> G1.Element? in leftGen = nil; return nil }()
            ?? rightGen.next()
    }
}


Interestingly, this is a bit faster: 1.9 sec.

More improvements might be possible but that is what I have so far.
However, the recursive method from your answer https://codereview.stackexchange.com/a/131527/35991
is still much faster, it takes only 0.9 sec.

Code Snippets

let d1 = NSDate()
let gen = sequentialFlatten([0], children: { n in n < 100_000 ? [2*n+1, 2*n+2] : []})
var c = 0
for _ in gen { c += 1 }
let d2 = NSDate()
print(c, d2.timeIntervalSinceDate(d1))
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
    return AnySequence {
        seq.map {
            let firstGen = AnyGenerator([$0].generate())
            var secondGen = Lazy(sequentialFlatten(children($0), children: children).generate())
            return firstGen + AnyGenerator {
                secondGen.val.next()
            }
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
    return AnySequence {
        seq.map {
            let firstGen = AnyGenerator([$0].generate())
            let secondGen = sequentialFlatten(children($0), children: children).generate()
            return firstGen + secondGen
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}
func +<G1: GeneratorType, G2: GeneratorType where G1.Element == G2.Element>(lhs: G1, rhs: G2) -> AnyGenerator<G1.Element>
func sequentialFlatten<S : SequenceType>(seq : S, children : S.Generator.Element -> S) -> AnySequence<S.Generator.Element> {
    return AnySequence {
        seq.map { elem -> AnyGenerator<S.Generator.Element> in
            let firstGen = AnyGenerator([elem].generate())
            let secondGen = sequentialFlatten(children(elem), children: children).generate()
            return firstGen + secondGen
        }.reduce(AnyGenerator { nil }, combine: +)
    }
}

Context

StackExchange Code Review Q#131618, answer score: 2

Revisions (0)

No revisions yet.