Rectangularize a list-of-lists structure

Problem

I have a nested "list of lists of ..." structure (up to 4 dimensions) and want to make the whole thing "rectangular", i.e. every sublist at a given depth should have the same length. For the 4-dimensional case I've written out a specific version:

x=[[[[4,3,4,5],[1],[2,2]],[[3,6,7],[2,3],[1]]],[[[1]]]]

length0 = len(x)
length1 = len(sorted(x,key=len, reverse=True)[0])
length2 = 0
length3 = 0
for xi in x:
    lengthi = len(sorted(xi,key=len, reverse=True)[0])
    for xii in xi:
        lengthii = len(sorted(xii,key=len, reverse=True)[0])
        length3 = max(lengthii, length3)
    length2 = max(lengthi, length2)
tlist3 = [None]
tlist2 = [tlist3 * length3]
tlist1 = [tlist2 * length2]

for xi in x:
    for xii in xi:
        for xiii in xii:
            xiii.extend(tlist3*(length3 - len(xiii) ))
        xii.extend(tlist2*(length2 - len(xii)))
    xi.extend(tlist1 * (length1 - len(xi)))

print(x)


It works, but that's all I can say for this code: it doesn't look clean, and it doesn't scale easily to more dimensions.

Ideally I would have a function that descends N dimensions into a given list and rectangularizes everything.

As a side note: this is for insertion into a NumPy array.

Solution

You can find a maximum without sorting. Instead of

len(sorted(x,key=len, reverse=True)[0])


you can use either of these:

len(max(x, key=len))
max(len(s) for s in x)
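
As a quick check, the sketch below runs all three expressions on a small sample list (the list and the variable names are made up for illustration) and shows that they agree. It also shows the `default` keyword of `max` (Python 3.4+), which may be useful if a level can be empty, since `max` otherwise raises `ValueError` on an empty iterable.

```python
x = [[1, 2, 3], [4], [5, 6]]  # illustrative sample data

by_sorting = len(sorted(x, key=len, reverse=True)[0])  # sorts everything: O(n log n)
by_max_key = len(max(x, key=len))                      # single pass, returns the longest sublist
by_max_len = max(len(s) for s in x)                    # single pass, returns the length directly

print(by_sorting, by_max_key, by_max_len)  # 3 3 3

# max() raises ValueError on an empty iterable; default= supplies a fallback.
empty_safe = max((len(s) for s in []), default=0)
print(empty_safe)  # 0
```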


For a more generic solution I propose this:

import itertools

def rectangularize(nested_list, dimensions, fill_value=None):
    flat = nested_list
    for dim in range(dimensions-2, -1, -1):
        size = max(len(s) for s in flat)
        for s in flat:
            s.extend([] if dim else fill_value for _ in range(size - len(s)))
        flat = list(itertools.chain.from_iterable(flat))

x=[[[[4,3,4,5],[1],[2,2]],[[3,6,7],[2,3],[1]]],[[[1]]]]
rectangularize(x, 4)
print(x)
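
Since the result is meant for a NumPy array, it may be worth confirming that the list really is rectangular before converting it. The following is a minimal sketch of such a check; `shape_of` is a hypothetical helper that is not part of the answer, and `rectangularize` is repeated only so that the snippet runs on its own.

```python
import itertools

def rectangularize(nested_list, dimensions, fill_value=None):
    # Same function as in the answer, repeated so this snippet is self-contained.
    flat = nested_list
    for dim in range(dimensions - 2, -1, -1):
        size = max(len(s) for s in flat)
        for s in flat:
            s.extend([] if dim else fill_value for _ in range(size - len(s)))
        flat = list(itertools.chain.from_iterable(flat))

def shape_of(nested_list, dimensions):
    # Hypothetical helper: return the shape as a tuple,
    # or raise ValueError if any level is ragged.
    shape = []
    level = [nested_list]
    for _ in range(dimensions):
        lengths = {len(s) for s in level}
        if len(lengths) != 1:
            raise ValueError("ragged at depth %d" % len(shape))
        shape.append(lengths.pop())
        level = list(itertools.chain.from_iterable(level))
    return tuple(shape)

x = [[[[4, 3, 4, 5], [1], [2, 2]], [[3, 6, 7], [2, 3], [1]]], [[[1]]]]
rectangularize(x, 4, fill_value=0)
print(shape_of(x, 4))  # (2, 2, 3, 4)
```

A numeric `fill_value` such as `0` is used here on the assumption that the array should have a numeric dtype; with the default `None`, NumPy would most likely fall back to an object array.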


To explain:

  • flat is a flattened view of one dimension of nested_list. It holds references to the sublists of nested_list, so extending a sublist through flat mutates nested_list.
  • flat = list(itertools.chain.from_iterable(flat)) advances flat to the next dimension by concatenating the sublists.
  • The main loop runs dimensions-1 times because the outermost list never needs to be extended.
  • range(dimensions-2, -1, -1) makes dim == 0 in the final iteration, where a different fill value is needed, so a clean if dim else can handle that case.
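
The points above can be traced by hand on a small 3-dimensional input. The sketch below unrolls the two loop passes of rectangularize for an illustrative list named `nested` (the data is made up) and prints the intermediate state, including an identity check showing that `flat` shares its sublists with `nested`.

```python
import itertools

nested = [[[1, 2], [3]], [[4]]]  # 3 dimensions, ragged (illustrative data)

# Pass 1 (dim == 1): flat is the outer list itself; pad with fresh empty lists.
flat = nested
size = max(len(s) for s in flat)
for s in flat:
    s.extend([] for _ in range(size - len(s)))
print(nested)  # [[[1, 2], [3]], [[4], []]]

# Advance one dimension: flat now lists every second-level sublist.
flat = list(itertools.chain.from_iterable(flat))
print(flat)                     # [[1, 2], [3], [4], []]
print(flat[3] is nested[1][1])  # True: same object, not a copy

# Pass 2 (dim == 0): innermost level, so pad with the fill value (None here).
size = max(len(s) for s in flat)
for s in flat:
    s.extend(None for _ in range(size - len(s)))
print(nested)  # [[[1, 2], [3, None]], [[4, None], [None, None]]]
```

Because the padding lists are created by a generator expression, each `[]` is a distinct object, so filling one padded sublist does not affect the others.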

Context

StackExchange Code Review Q#85038, answer score: 2
