patternpythonMinor

Group a list of `dict` by all keys except one

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

dictgroupallkeysonelistexcept

Problem

I have a function that I use to convert a list of dictionaries into a list of tuples, similar to itertools.groupby() would do in an ideal world. The goal is to make a list of {unique-dict} => [list of values]. I'm having a hard time explaining it, so I hope the example makes it clear.

I couldn't find a thing on google, so I wrote it pretty quickly, and I am wondering if there is a better way to do this (or a standard library even)

Naming things is hard... what is a better one?

This is a task I performed fairly often when receiving user input

The keys in each dict may not be consistent (some may be missing)

It is OK to destroy or mangle the original input list & items

Validation is done before calling, so 'key_field' is always present

It is expected that values can be duplicated in the list.

{ status: 1 } => [1,2,3,4,1,2] is ok, if the values are actually present twice.

Simple little python function:

def group_by_excluding_key(list_of_dicts, key_field):
    """
    Takes a list of `dict` items and groups by ALL KEYS in the dict EXCEPT the key_field.
    :param list_of_dicts: List of dicts to group
    :param key_field: key field in dict which should be excluded from the grouping
    """

    output = []

    for item in list_of_dicts:
        found = False
        item_key = item.pop(key_field)

        for existing_group, found_keys in output:
            if existing_group.viewitems() == item.viewitems():
                found_keys.append(item_key)
                found = True
                break

        if not found:
            output.append((item, [item_key]))

    return output

Example Input/Output

```
from pprint import pprint

data = [
{'id': 1, 'status': 1, 'product': 1},
{'id': 2, 'status': 1, 'product': 1},
{'id': 7, 'status': 1, 'product': 2},
{'id': 9, 'status': 1, 'product': 2},
{'id': 3, 'status': 1, 'product': 1},
{'id': 4, 'status': 1, 'product': 1},
{'id': 8, 'status': 1, 'p

Solution

Your code is pretty good, there is one thing that I would add,
the for-else keyword, as this gets rid of the found variable.
Which honestly is just noise.
This is as if a for loop runs completely without breaking then the else will run too. But if it breaks then it won't run the else.
This can leave you with:

def group_by_excluding_key(list_of_dicts, key_field):
    """
    Takes a list of `dict` items and groups by ALL KEYS in the dict EXCEPT the key_field.
    :param list_of_dicts: List of dicts to group
    :param key_field: key field in dict which should be excluded from the grouping
    """
    output = []
    for item in list_of_dicts:
        item_key = item.pop(key_field)
        for existing_group, found_keys in output:
            if existing_group.viewitems() == item.viewitems():
                found_keys.append(item_key)
                break
        else:
            output.append((item, [item_key]))
    return output

Other than that your code is good.

But if I were to were to write this, I'd prefer a very small solution.
Lets say dicts are hash able, what you want is a dictionary that has the modified item as the key, and the popped item_key as the value.
This obviously has two down-sides, it's not ordered, and dicts aren't hash able.
Both easily solved with collections.OrderedDict and tuple(dict.items()).
And so can result in:

from collections import OrderedDict

def group_by_excluding_key(list_of_dicts, key_field):
    """
    Takes a list of `dict` items and groups by ALL KEYS in the dict EXCEPT the key_field.
    :param list_of_dicts: List of dicts to group
    :param key_field: key field in dict which should be excluded from the grouping
    """
    output = OrderedDict()
    for item in list_of_dicts:
        key = item.pop(key_field)
        output.setdefault(tuple(item.items()), []).append(key)
    return [(dict(key), value) for key, value in output.items()]

This has the benefit of moving the for loop into the OrderedDict, and possibly getting \$O(1)\$ key lookup, but requires you to change the type of all the keys, twice.

I know you didn't ask for a performance review, but the performance difference between my code and your code can be tested with the following. The comments are my functions run time over yours as a percentage, so if mine took 0.8s and yours 3.3s then it'll be 24%, followed by how long it took my function to run.

from timeit import timeit
from itertools import count

# 240%, 0.1s
c = count(1)
l = [{'status': 1, 'product': i, 'id': next(c)} for i in range(10)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 25%, 0.8s
c = count(1)
l = [{'status': 1, 'product': i, 'id': next(c)} for i in range(100)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 28%, 0.8s
c = count(1)
l = [{'status': i, 'product': j, 'id': next(c)} for i in range(10) for j in range(10)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 0.4%, 0.9s
c = count(1)
l = [{'status': i, 'product': j, 'id': next(c)} for i in range(100) for j in range(100)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=10))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=10))

Code Snippets

def group_by_excluding_key(list_of_dicts, key_field):
    """
    Takes a list of `dict` items and groups by ALL KEYS in the dict EXCEPT the key_field.
    :param list_of_dicts: List of dicts to group
    :param key_field: key field in dict which should be excluded from the grouping
    """
    output = []
    for item in list_of_dicts:
        item_key = item.pop(key_field)
        for existing_group, found_keys in output:
            if existing_group.viewitems() == item.viewitems():
                found_keys.append(item_key)
                break
        else:
            output.append((item, [item_key]))
    return output

from collections import OrderedDict

def group_by_excluding_key(list_of_dicts, key_field):
    """
    Takes a list of `dict` items and groups by ALL KEYS in the dict EXCEPT the key_field.
    :param list_of_dicts: List of dicts to group
    :param key_field: key field in dict which should be excluded from the grouping
    """
    output = OrderedDict()
    for item in list_of_dicts:
        key = item.pop(key_field)
        output.setdefault(tuple(item.items()), []).append(key)
    return [(dict(key), value) for key, value in output.items()]

from timeit import timeit
from itertools import count

# 240%, 0.1s
c = count(1)
l = [{'status': 1, 'product': i, 'id': next(c)} for i in range(10)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 25%, 0.8s
c = count(1)
l = [{'status': 1, 'product': i, 'id': next(c)} for i in range(100)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 28%, 0.8s
c = count(1)
l = [{'status': i, 'product': j, 'id': next(c)} for i in range(10) for j in range(10)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=1000))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=1000))

# 0.4%, 0.9s
c = count(1)
l = [{'status': i, 'product': j, 'id': next(c)} for i in range(100) for j in range(100)]
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key_dict as fn', number=10))
print(timeit('fn({!r}, "id")'.format(l), 'from __main__ import group_by_excluding_key as fn', number=10))

Context

StackExchange Code Review Q#142134, answer score: 4

Revisions (0)

No revisions yet.