Split a list using another list whose items are the split lengths


Problem

I want to split a list using another list which contains the lengths of each split.

E.g.:

>>> print(list(split_by_lengths(list('abcdefg'), [2, 1])))
[['a', 'b'], ['c'], ['d', 'e', 'f', 'g']]
>>> print(list(split_by_lengths(list('abcdefg'), [2, 2])))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print(list(split_by_lengths(list('abcdefg'), [2, 2, 6])))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print(list(split_by_lengths(list('abcdefg'), [1, 10])))
[['a'], ['b', 'c', 'd', 'e', 'f', 'g']]
>>> print(list(split_by_lengths(list('abcdefg'), [2, 2, 6, 5])))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]


As you can see, if the lengths list does not cover the whole list, the remaining elements are appended as an additional sublist. I also want to avoid empty lists at the end when the lengths add up to more elements than the list to split contains.
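The rules above can be written down as a small table of cases plus a checker. This is only a sketch: `SOURCE`, `CASES` and `check` are hypothetical names introduced here for illustration, and `check` accepts any candidate implementation with the signature `split_fn(list_, lens)`.

```python
# Expected behaviour taken from the examples above, as (lengths, expected) pairs.
SOURCE = list('abcdefg')
CASES = [
    ([2, 1], [['a', 'b'], ['c'], ['d', 'e', 'f', 'g']]),
    ([2, 2], [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]),
    ([2, 2, 6], [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]),
    ([1, 10], [['a'], ['b', 'c', 'd', 'e', 'f', 'g']]),
    ([2, 2, 6, 5], [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]),
]

def check(split_fn):
    """Return the failing cases for a candidate split_fn(list_, lens)."""
    failures = []
    for lens, expected in CASES:
        got = list(split_fn(SOURCE, lens))
        if got != expected:
            failures.append((lens, got, expected))
    return failures
```

An empty list from `check(split_by_lengths)` would mean the candidate reproduces all five examples.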

I already have a function that works as I want:

from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

def split_by_lengths(list_, lens):
    li = iter(list_)
    for l in lens:
        elems = take(l, li)
        if not elems:
            break
        yield elems
    else:
        remaining = list(li)
        if remaining:
            yield remaining


But I wonder whether there is a more Pythonic way to write such a function.

Note: I took take(n, iterable) from the itertools recipes.

Note: This question is a repost of a Stack Overflow question.

Solution

This is the first answer I gave on Stack Overflow:

from itertools import islice

def split_by_lengths(seq, num):
    it = iter(seq)
    for n in num:
        out = list(islice(it, n))
        if out:
            yield out
        else:
            return  # source exhausted: end the generator
    remain = list(it)
    if remain:
        yield remain


I am not using a for-else loop here because a plain return statement is enough to end the generator. In my opinion there is also no need to define a separate take function just to slice an iterator.
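To illustrate the point about `return`: inside a generator, a bare `return` ends iteration, so the code after the loop is never reached once the input runs dry. The sketch below puts both styles side by side; the names `with_for_else` and `with_return` are made up for this comparison.

```python
from itertools import islice

def with_for_else(seq, lens):
    it = iter(seq)
    for n in lens:
        chunk = list(islice(it, n))
        if not chunk:
            break       # leaving via break skips the else clause
        yield chunk
    else:
        rest = list(it)
        if rest:
            yield rest

def with_return(seq, lens):
    it = iter(seq)
    for n in lens:
        chunk = list(islice(it, n))
        if not chunk:
            return      # ends the generator immediately
        yield chunk
    rest = list(it)
    if rest:
        yield rest

# Both styles should agree on every input tried here.
for lens in ([2, 1], [2, 2, 6, 5], []):
    assert list(with_for_else('abcdefg', lens)) == list(with_return('abcdefg', lens))
```

The two differ only in control flow: `break` plus `else` keeps the leftover handling attached to the loop, while `return` lets it sit at the normal indentation level.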

Second answer:

This one differs slightly from the first: it does not short-circuit as soon as one of the lengths exhausts the iterator, and it returns a list instead of a generator. It is, however, more compact than my first answer.

def split_by_lengths(seq, num):
    it = iter(seq)
    out = [x for x in (list(islice(it, n)) for n in num) if x]
    remain = list(it)
    return out if not remain else out + [remain]
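The difference in short-circuiting can be made visible by recording which lengths each version actually requests. This is a rough sketch: `split_lazy` and `split_eager` are renamed copies of the two answers above, and `counted` is a hypothetical helper added only for this demonstration.

```python
from itertools import islice

def split_lazy(seq, num):
    """First answer: stops at the first empty chunk."""
    it = iter(seq)
    for n in num:
        out = list(islice(it, n))
        if out:
            yield out
        else:
            return
    remain = list(it)
    if remain:
        yield remain

def split_eager(seq, num):
    """Second answer: walks through every length, then filters empty chunks."""
    it = iter(seq)
    out = [x for x in (list(islice(it, n)) for n in num) if x]
    remain = list(it)
    return out if not remain else out + [remain]

def counted(lens, seen):
    """Yield each length and record it in `seen` when it is requested."""
    for n in lens:
        seen.append(n)
        yield n

lazy_seen, eager_seen = [], []
lazy_result = list(split_lazy('abc', counted([2, 2, 2, 2], lazy_seen)))
eager_result = split_eager('abc', counted([2, 2, 2, 2], eager_seen))

print(lazy_result == eager_result)       # same chunks either way
print(len(lazy_seen), len(eager_seen))   # lengths requested by each version
```

One consequence is that an endless source of lengths such as `itertools.repeat(2)` works with the first version but never finishes with the second. A length of 0 also separates the two: the first version treats the empty chunk as exhaustion and stops, while the second simply skips it.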

Context

StackExchange Code Review Q#46246, answer score: 5
