HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Intersection of subset between two lists of dicts

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
dictsintersectiontwobetweenlistssubset

Problem

I want to find the intersection of two lists of dicts.

Note that equality—i.e.: what is to appear in the intersection—is given by which keys they must have equal. See the obj_equal_on function.

def obj_equal_on(obj0, obj1, keys):
    for key in keys:
        if obj0.get(key, False) != obj1.get(key, None):
            return False
    return True

def l_of_d_intersection(ld0, ld1, keys):
    """ Find intersection between two lists of dicts.
          At best, will return `ld0` in full. At worst: [].

    :param ld0 :type [DictType]
    :param ld1 :type [DictType]
    :param keys :type ListType

    :returns intersection of ld0 and ld1 where key is equal.
    """

    return [obj1 for obj0 in ld1 for obj1 in ld0
            if obj_equal_on(obj0, obj1, keys)]


Basic testing (self-contained for SE, usually I'd subclass unittest.TestCase):

def run_example(ld0, ld1, keys, expected_result):
    result = tuple(l_of_d_intersection(ld0, ld1, keys))
    assert result == expected_result, '{0} != {1}'.format(result,expected_result)

def main():
    run_example(
        ld0=[{'foo': 'bar', 'haz': 'more'},
             {'can': 'haz', 'more': 'haz'},
             {'foo': 'jar', 'more': 'fish'}],
        ld1=[{'foo': 'bar'},
             {'can': 'haz'},
             {'foo': 'foo'}],
        keys=('foo',),
        expected_result=({'foo': 'bar', 'haz': 'more'},)
    )

    run_example(
        ld0=[
            {'orange': 'black', 'blue': 'green', 'yellow': 'red'},
            {'blue': 'yellow'},
            {'orange': 'red', 'yellow': 'blue'}
        ],
        ld1=[
            {'orange': 'black', 'yellow': 'red'},
            {'blue': 'yellow'},
            {'orange': 'red', 'yellow': 'blue'}
        ],
        keys=('orange', 'yellow'),
        expected_result=({'orange': 'black', 'blue': 'green', 'yellow': 'red'},
                         {'orange': 'red', 'yellow': 'blue'})
    )


My example works, but my code is rather horribly inefficient.

Solution

Although a list comprehension is generally an efficient way to go about things, in this case it means you check every obj0 against every obj1, even if you've already found a match. To avoid that:

out = []
for obj0 in ld0:
    for obj1 in ld1:
        if obj_equal_on(obj0, obj1, keys):
            out.append(obj0)
            break
return out


Beyond that, you could consider using a custom immutable class rather than dictionaries, implementing __eq__ and __hash__ and using sets rather than lists to efficiently determine the intersection.

Code Snippets

out = []
for obj0 in ld0:
    for obj1 in ld1:
        if obj_equal_on(obj0, obj1, keys):
            out.append(obj0)
            break
return out

Context

StackExchange Code Review Q#92229, answer score: 2

Revisions (0)

No revisions yet.