Functions to merge dictionaries with a comparison
Problem
I have several functions for merging dictionaries, but over time I created a more general function that would make all the others obsolete if it weren't slower.
The specialized functions (there are several like this one) look like this:

    def merge_keep_lowest(dict1, dict2, *dicts):
        """
        Merge an arbitrary number of :py:class:`dict`-like objects and keep the
        **lowest** encountered value for each key.

        Parameters
        ----------
        dict1, dict2, *dicts : dict-like
            The dictionaries to be merged. At least two must be given.

        Returns
        -------
        result : same type as dict1
            The merged dictionaries.
        """
        # Copy the first dict so the result's class and its properties are defined
        result = dict1.copy()
        # We only want to iterate once so combine the second and the other dicts.
        dicts = (dict2,) + dicts
        # Now iterate over each dictionary since there is no directly usable
        # dict method for this kind of operation
        for d in dicts:
            # Now iterate over each key of this dict.
            # This way is faster than "for kw in d.keys()".
            for kw in d:
                # One could also use "try ... except KeyError ..." here instead
                # of the "if kw in result". That would be a bit faster if all
                # dicts contained mostly the same keys, but containment checks
                # on dictionaries are relatively cheap, so it doesn't make a
                # huge difference.
                if kw in result:
                    # The key was already present in the result so compare the
                    # values and replace the stored one if the new one is smaller.
                    if d[kw] < result[kw]:
                        result[kw] = d[kw]
                else:
                    # If the key was not yet in the result dict just initialize it
                    result[kw] = d[kw]
        return result

There are some more for replacing the value if it is higher/shorter/longer/...; they all look exactly the same except for the `if d[kw] < result[kw]:` line.

Then I thoug
Solution
The version taking `func` is clearly the nicest; the version which always chooses the lowest is only really better if speed is actually needed.

There's no reason to force at least two parameters. Just use `*dicts` and default to `{}`. It's simpler and nicer.

`_by_func` is just `_by`. `*_by_method` is horrible and should be avoided - it's not even more general and it's all stringy.

Note that `lambda x, y: True if x < y else False` is just `lambda x, y: x < y`, which is just `operator.lt`.

Further, you should really have a `fold` function, not a comparator, so you can do stuff like

    def merge_keep_lowest(*dicts):
        return merge_dicts_by(*dicts, fold=min)

but then also

    def merge_counts(*dicts):
        # operator.add rather than sum: sum(a, b) expects an iterable first
        return merge_dicts_by(*dicts, fold=operator.add)

A `fold` would look like

    result[kw] = fold(result[kw], d[kw])

and any comparator `comp(new, old)` can be turned into a fold with

    lambda old, new: new if comp(new, old) else old

For example, for a comparator of

    lambda new, old: new <= old

one has

    lambda old, new: new if new <= old else old

or, simply stated, `min`.

Code Snippets
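The answer calls `merge_dicts_by` without showing its body. The following is a minimal sketch of what such a fold-based merge could look like, not the answerer's actual implementation. It assumes a keyword-only `fold` argument and always returns a plain `dict` (unlike the original function, which copies the first argument to preserve its class). The helper name `fold_from_comparator` is hypothetical and only illustrates the comparator-to-fold conversion described above. `operator.add` is used for the counting case because `sum(a, b)` does not add two numbers.

```python
import operator


def merge_dicts_by(*dicts, fold):
    """Merge dicts; on a key collision store fold(old_value, new_value)."""
    result = {}  # assumption: a plain dict, also returned when no dicts are given
    for d in dicts:
        for kw, value in d.items():
            if kw in result:
                # Key seen before: let the fold decide what to keep.
                result[kw] = fold(result[kw], value)
            else:
                # First occurrence of the key: just store the value.
                result[kw] = value
    return result


def fold_from_comparator(comp):
    """Hypothetical helper: turn comp(new, old) -> bool into fold(old, new)."""
    return lambda old, new: new if comp(new, old) else old


def merge_keep_lowest(*dicts):
    return merge_dicts_by(*dicts, fold=min)


def merge_counts(*dicts):
    return merge_dicts_by(*dicts, fold=operator.add)
```

With this sketch, `merge_keep_lowest({'a': 3}, {'a': 2})` keeps the smaller value for `'a'`, and passing `fold_from_comparator(operator.le)` as the fold behaves the same way as `fold=min` for ordinary comparable values.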

    def merge_keep_lowest(*dicts):
        return merge_dicts_by(*dicts, fold=min)

    def merge_counts(*dicts):
        # operator.add rather than sum: sum(a, b) expects an iterable first
        return merge_dicts_by(*dicts, fold=operator.add)

    result[kw] = fold(result[kw], d[kw])

    lambda old, new: new if comp(new, old) else old

    lambda new, old: new <= old

Context
StackExchange Code Review Q#122406, answer score: 4