HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Python generator function that yields combinations of elements in a sequence sorted by subset order

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
orderelementscombinationsfunctiongeneratorsequenceyieldspythonthatsorted

Problem

In Python, itertools.combinations yields combinations of elements in a sequence sorted by lexicographical order. In the course of solving certain math problems, I found it useful to write a function, combinations_by_subset, that yields these combinations sorted by subset order (for lack of a better name).

For example, to list all 3-length combinations of the string 'abcde':

>>> [''.join(c) for c in itertools.combinations('abcde', 3)]
['abc', 'abd', 'abe', 'acd', 'ace', 'ade', 'bcd', 'bce', 'bde', 'cde']

>>> [''.join(c) for c in combinations_by_subset('abcde', 3)]
['abc', 'abd', 'acd', 'bcd', 'abe', 'ace', 'bce', 'ade', 'bde', 'cde']


Formally, for a sequence of length \$n\$, we have \$\binom{n}{r}\$ combinations of length \$r\$, where \$\binom{n}{r} = \frac{n!}{r! (n - r)!}\$

The function combinations_by_subset yields combinations in such an order that the first \$\binom{k}{r}\$ of them are the r-length combinations of the first k elements of the sequence.

In our example above, the first \$\binom{3}{3} = 1\$ combination is the 3-length combination of 'abc' (which is just 'abc'); the first \$\binom{4}{3} = 4\$ combinations are the 3-length combinations of 'abcd' (which are 'abc', 'abd', 'acd', 'bcd'); etc.

My first implementation is a simple generator function:

def combinations_by_subset(seq, r):
    if r:
        for i in xrange(r - 1, len(seq)):
            for cl in (list(c) for c in combinations_by_subset(seq[:i], r - 1)):
                cl.append(seq[i])
                yield tuple(cl)
    else:
        yield tuple()


For fun, I decided to write a second implementation as a generator expression and came up with the following:

def combinations_by_subset(seq, r):
    return (tuple(itertools.chain(c, (seq[i], ))) for i in xrange(r - 1, len(seq)) for c in combinations_by_subset(seq[:i], r - 1)) if r else (() for i in xrange(1))


My questions are:

  • Which function definition is preferable? (I prefer the generator function over

Solution

Rather converting from tuple to list and back again, construct a new tuple by adding to it.

def combinations_by_subset(seq, r):
    if r:
        for i in xrange(r - 1, len(seq)):
            for cl in combinations_by_subset(seq[:i], r - 1):
                yield cl + (seq[i],)
    else:
        yield tuple()

Code Snippets

def combinations_by_subset(seq, r):
    if r:
        for i in xrange(r - 1, len(seq)):
            for cl in combinations_by_subset(seq[:i], r - 1):
                yield cl + (seq[i],)
    else:
        yield tuple()

Context

StackExchange Code Review Q#1419, answer score: 7

Revisions (0)

No revisions yet.