Python list dictionary items round robin mixing


Problem

I have a list of dictionaries that I want to interleave round-robin by their 'source' value.

sample = [
    {'source': 'G', '"serial"': '0'},
    {'source': 'G', '"serial"': '1'},
    {'source': 'G', '"serial"': '2'},
    {'source': 'P', '"serial"': '30'},
    {'source': 'P', '"serial"': '0'},
    {'source': 'P', '"serial"': '1'},
    {'source': 'P', '"serial"': '2'},
    {'source': 'P', '"serial"': '3'},
    {'source': 'T', '"serial"': '2'},
    {'source': 'T', '"serial"': '3'}
]


I want the result below:

sample_solved = [
    {'source': 'G', '"serial"': '0'},
    {'source': 'P', '"serial"': '30'},
    {'source': 'T', '"serial"': '2'},
    {'source': 'G', '"serial"': '1'},
    {'source': 'P', '"serial"': '1'},
    {'source': 'T', '"serial"': '3'},
    {'source': 'G', '"serial"': '2'},
    {'source': 'P', '"serial"': '0'},
    {'source': 'P', '"serial"': '2'},
    {'source': 'P', '"serial"': '3'}
]


I solved it as follows:

import collections
from itertools import cycle, islice

def roundrobin(*iterables):
    # Taken from https://docs.python.org/3/library/itertools.html#itertools-recipes

    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis

    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # One iterable is exhausted: drop it and keep cycling over the rest.
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def solve():
    items_by_sources = collections.defaultdict(list)

    for item in sample:
        items_by_sources[item["source"]].append(item)

    # Dicts keep insertion order (Python 3.7+), so the values come out as G, P, T.
    g, p, t = items_by_sources.values()

    print(list(roundrobin(g, p, t)))


I use a defaultdict to separate the items by source, and then the roundrobin recipe from the Python docs to interleave them.

How can I make the solution more Pythonic, or otherwise improve it?

Solution

Using the roundrobin() recipe from itertools is already Pythonic. The solve() function could be replaced with more use of itertools.

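In particular, itertools.groupby() would do the same as your defaultdict and for loop, provided the items are already ordered by source, as they are in sample (groupby() only groups consecutive items with equal keys):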

>>> import itertools as it
>>> import operator as op
>>> g, p, t = [list(v) for k, v in it.groupby(sample, key=op.itemgetter('source'))]
>>> list(roundrobin(g, p, t))
[{'"serial"': '0', 'source': 'G'},
 {'"serial"': '30', 'source': 'P'},
 {'"serial"': '2', 'source': 'T'},
 {'"serial"': '1', 'source': 'G'},
 {'"serial"': '0', 'source': 'P'},
 {'"serial"': '3', 'source': 'T'},
 {'"serial"': '2', 'source': 'G'},
 {'"serial"': '1', 'source': 'P'},
 {'"serial"': '2', 'source': 'P'},
 {'"serial"': '3', 'source': 'P'}]
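
Because groupby() only merges adjacent items, a list that is not already ordered by source would need sorting first. A minimal sketch of that, assuming a hypothetical helper name (group_by_source) and a small made-up input that are not part of the original post:

```python
import itertools as it
import operator as op

def group_by_source(items):
    """Return one list per distinct 'source', in sorted key order.

    group_by_source is a hypothetical helper name used for illustration.
    """
    key = op.itemgetter('source')
    # sorted() is stable, so items keep their relative order within each source.
    return [list(group) for _, group in it.groupby(sorted(items, key=key), key=key)]

# Made-up, deliberately unordered input.
unordered = [
    {'source': 'P', 'serial': '0'},
    {'source': 'G', 'serial': '0'},
    {'source': 'P', 'serial': '1'},
    {'source': 'G', 'serial': '1'},
]

groups = group_by_source(unordered)

# Without sorting, groupby() yields one group per consecutive run instead.
runs = [list(g) for _, g in it.groupby(unordered, key=op.itemgetter('source'))]
```

Here groups holds two lists (G, then P), while runs holds four single-item lists. The defaultdict approach from the question does not need the sort, since it groups by key regardless of position.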


You don't really need to unpack, since you can pass the groups to roundrobin() with *, e.g.:

>>> x = [list(v) for k, v in it.groupby(sample, key=op.itemgetter('source'))]
>>> list(roundrobin(*x))
[{'"serial"': '0', 'source': 'G'},
 {'"serial"': '30', 'source': 'P'},
...


Note that roundrobin() could be rewritten using itertools.zip_longest(), which should be faster for nearly equal-sized iterables, e.g.:

def roundrobin(*iterables):
    # A fresh object() is a filler that cannot collide with any real item.
    sentinel = object()
    return (a for x in it.zip_longest(*iterables, fillvalue=sentinel)
            for a in x if a is not sentinel)
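
For reference, a self-contained sketch of the zip_longest() variant, checked against the example from the recipe's docstring. It uses the name roundrobin_ziplongest from the timings in this answer, and it tests the sentinel with `is not` rather than `!=`, on the assumption that an identity check is what is intended (it never calls the items' own comparison methods):

```python
from itertools import zip_longest

def roundrobin_ziplongest(*iterables):
    """roundrobin_ziplongest('ABC', 'D', 'EF') --> A D E B F C"""
    sentinel = object()  # unique filler for the shorter iterables
    return (item
            for batch in zip_longest(*iterables, fillvalue=sentinel)
            for item in batch
            if item is not sentinel)  # skip the padding, keep real items

result = list(roundrobin_ziplongest('ABC', 'D', 'EF'))
print(result)  # ['A', 'D', 'E', 'B', 'F', 'C']
```

zip_longest() pads every shorter iterable up to the length of the longest one, so this does the most wasted work when the lengths differ a lot.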


I did a quick run with 10000 random items in sample and found the recipe surprisingly slow (I still need to figure out why):

In [11]: sample = [{'source': random.choice('GPT'), 'serial': random.randrange(100)} for _ in range(10000)]
In [12]: x = [list(v) for k, v in it.groupby(sample, key=op.itemgetter('source'))]
In [13]: %timeit list(roundrobin_recipe(*x))
1 loop, best of 3: 1.48 s per loop
In [14]: %timeit list(roundrobin_ziplongest(*x))
100 loops, best of 3: 4.12 ms per loop
In [15]: %timeit TW_zip_longest(*x)
100 loops, best of 3: 6.36 ms per loop
In [16]: list(roundrobin_recipe(*x)) == list(roundrobin_ziplongest(*x))
True
In [17]: list(roundrobin_recipe(*x)) == TW_zip_longest(*x)
True
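
One thing worth checking in the run above: sample is not sorted by source, so groupby() returns one list per consecutive run rather than one per source. A minimal sketch that counts both (the seed is an arbitrary choice for repeatability, not from the original run):

```python
import itertools as it
import operator as op
import random

random.seed(0)  # arbitrary seed, only to make the sketch repeatable
key = op.itemgetter('source')
sample = [{'source': random.choice('GPT'), 'serial': random.randrange(100)}
          for _ in range(10000)]

# Same expression as in the timing run: groups consecutive runs only.
runs = [list(v) for k, v in it.groupby(sample, key=key)]

# Sorting first gives one group per distinct source.
groups = [list(v) for k, v in it.groupby(sorted(sample, key=key), key=key)]

print(len(runs))    # number of consecutive runs, typically several thousand
print(len(groups))  # one per distinct source
```

If x holds thousands of short lists, the recipe rebuilds its cycle every time one of them is exhausted, which may account for much of the gap; I have not measured this separately.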

Context

StackExchange Code Review Q#156729, answer score: 5
