Modifying nested dictionaries in Python

Submitted by: @import:stackexchange-codereview

Problem

My task was to take a configuration specified as a dictionary and run some function with it (the actual function is not relevant here). After implementing that, I was asked to allow some "sweeping" of values for the parameters.

After some discussion it was decided to keep the dictionary structure, but allow the user to specify "swept keys", which the code will have to "decompose" into separate "no-sweeping" dictionaries.

To clarify the task, an actual example can be seen in the doctest of the function detonate_sweeps below.
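
One possible reading of this decomposition can be sketched as follows. It assumes that keys ending in `_sweep` hold a list of candidate values and that every combination of candidates should produce one sweep-free dictionary. The name `expand_sweeps` and the Cartesian-product behaviour are assumptions made for illustration; this is not the code under review.

```python
import itertools

SUFFIX = "_sweep"  # naming convention taken from the doctests in the question


def expand_sweeps(d):
    """Yield every sweep-free dictionary obtainable from d (hypothetical helper)."""
    keys = []
    choices = []
    for k, v in d.items():
        if isinstance(v, dict):
            # A nested dictionary may itself contain swept keys.
            keys.append(k)
            choices.append(list(expand_sweeps(v)))
        elif k.endswith(SUFFIX):
            # Swept key: strip the suffix, keep every candidate value.
            keys.append(k[:-len(SUFFIX)])
            choices.append(list(v))
        else:
            # Ordinary key: a single choice.
            keys.append(k)
            choices.append([v])
    # One output dictionary per combination of choices.
    for combo in itertools.product(*choices):
        yield dict(zip(keys, combo))


config = {'k1_sweep': ['v1a', 'v1b'],
          'k2': {'k21_sweep': ['v21a', 'v21b'], 'k22': 'v22'}}
for expanded in expand_sweeps(config):
    print(expanded)
```

Under these assumptions the example configuration expands into four dictionaries, one per pairing of a `k1` value with a `k21` value.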

As I am relatively new to Python, I am looking for any comments, ideas, or improvements for my code, covering both complexity and style. I am eager to improve, so please feel free to comment.

One extra question concerns doctest. I am working in PyCharm, and although I have written the doctests I have no idea how to run them (hence they may contain typos...). Does anyone have any tips on how to do that?

```
import itertools
import copy

def dict_2_list_of_keys(d, l, loc):
    """ Return a list containing lists describing the dictionary nodes' paths

    >>> dict_2_list_of_keys({'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}})
    [['k1'], ['k2', 'k21'], ['k2', 'k22']]
    """
    for k in iter(d):
        loc.append(k)
        l.append(loc * 1)
        if isinstance(d[k], dict):
            dict_2_list_of_keys(d[k], l, loc)
        loc.pop()
    return l

def list_of_keys_2_dict_less_sweep(orig_d, swept_key):
    """ Reduce a SINGLE '_sweep' parameter

    >>> list_of_keys_2_dict_less_sweep({'k1_sweep': ['v1a', 'v1b'] ,
                                        'k2': {'k21_sweep': ['v21a', 'v21b'],
                                               'k22': 'v22'}}, 'k1_sweep')
    [{'k1': 'v1a' ,'k2': {'k21_sweep': ['v21a', 'v21b'],'k22': 'v22'}},
     {'k1': 'v1b' ,'k2': {'k21_sweep': ['v21a', 'v21b'],'k22': 'v22'}}]
    """
    #TODO: assert swept_key has a list as value
    swept_key_values
```

Solution


  1. General comments



-
To run the doctests in file.py, just run:

python -m doctest file.py


(See the documentation for the doctest module where this is explained.)
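
As an alternative to the command line, the doctests can be run from inside the module itself with `doctest.testmod()`. The sketch below uses a hypothetical function `double` only to carry a doctest; substitute your own module contents.

```python
import doctest


def double(x):
    """Return twice x (hypothetical function, present only to carry a doctest).

    >>> double(21)
    42
    """
    return 2 * x


if __name__ == "__main__":
    # testmod() runs every doctest found in this module's docstrings and
    # returns a (failed, attempted) named tuple.
    results = doctest.testmod()
    print(results)
```

With this guard in place, running the file as a script also runs its doctests, which is convenient when working from an IDE.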

-
I said that I didn't understand your question (and indeed I still don't) and you replied, "I really can't see what's wrong with my question." Well, obviously you understand it—you've discussed the problem with your colleague and you've written a solution. But take it from me, your description is not clear to someone like me who has no idea what you are on about.

For example, what is the "some function" in the first sentence and why do you mention it if it is not relevant? In one place you write about "decomposing" "swept keys" and in another you write about "detonating" them. Are these two different procedures or are they two names for the same procedure? And so on.

It's not easy to write clear descriptions of complex functions. But it helps if you listen to your readers.

  2. The function dict_2_list_of_keys



-
It took me a while to figure out that "2" here is not actually a number but a pun for "to". Write to instead of 2: it's only one character longer.

-
The function does not return a list of keys, as its name suggests: it returns a list of lists of keys.

-
The docstring is quite obscure. It returns a "list containing lists describing the dictionary nodes' paths". When you write containing, do you mean containing all of them? What is a dictionary node? (Dictionaries have keys and values but not, as far as I know, nodes.) What is a dictionary nodes' path?

-
There's no point writing:

for k in iter(d):


when you could just write:

for k in d:


-
But having written that, since you look up the value d[k], it would be better to iterate over the keys and values of the dictionary, and avoid the lookups:

for k, v in d.items():
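
A short sketch of the difference, using the dictionary from the doctest (the variable names are made up for illustration):

```python
config = {"k1": "v1", "k2": {"k21": "v21", "k22": "v22"}}

# Iterating over keys and looking each value up again.
pairs_by_lookup = [(k, config[k]) for k in config]

# Iterating over keys and values together: no second lookup needed.
pairs_by_items = [(k, v) for k, v in config.items()]

print(pairs_by_lookup == pairs_by_items)  # True
```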


-
Writing loc * 1 is a very obscure way to take a copy of a list. It would be clearer if you wrote loc.copy() (Python 3) or copy.copy(loc) (Python 2).
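
The copy matters because the function later mutates loc. A minimal sketch of what goes wrong without it (the list names are hypothetical):

```python
loc = ["k2", "k21"]

paths_aliased = []
paths_copied = []

paths_aliased.append(loc)        # stores a reference to the same list object
paths_copied.append(loc.copy())  # stores an independent snapshot

loc.pop()  # mutate the original, as the recursive function does

print(paths_aliased)  # [['k2']]
print(paths_copied)   # [['k2', 'k21']]
```

`loc * 1`, `loc.copy()`, `list(loc)` and `loc[:]` all make the same kind of shallow copy; the point is only that `loc.copy()` states the intent directly.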

-
The doctest fails:

Failed example:
    dict_2_list_of_keys({'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}})
Exception raised:
    Traceback (most recent call last):
      File "/python3.4/doctest.py", line 1324, in __run
        compileflags, 1), test.globs)
      File "<doctest cr55588.dict_2_list_of_keys[0]>", line 1, in <module>
        dict_2_list_of_keys({'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}})
    TypeError: dict_2_list_of_keys() missing 2 required positional arguments: 'l' and 'loc'


-
Even with the missing arguments added, the doctest still fails:

Failed example:
    dict_2_list_of_keys({'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}}, [], [])
Expected:
    [['k1'], ['k2', 'k21'], ['k2', 'k22']]
Got:
    [['k1'], ['k2'], ['k2', 'k21'], ['k2', 'k22']]


Is the doctest correct and the code wrong, or is it the other way round? The docstring is not written clearly enough for me to tell! So I'm going to assume that the doctest is right and the code is wrong. At least I understand the doctest.

So the way to fix this is to write the loop like this:

for k, v in d.items():
    loc.append(k)
    if isinstance(v, dict):
        dict_2_list_of_keys(v, l, loc)
    else:
        l.append(loc.copy())
    loc.pop()
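
For reference, here is that loop placed back into the original function as a runnable sketch. The signature is kept as in the question, and the result is sorted before printing so that the output does not depend on the order in which the dictionary is iterated.

```python
def dict_2_list_of_keys(d, l, loc):
    """Append to l the key path of every non-dictionary value in d."""
    for k, v in d.items():
        loc.append(k)
        if isinstance(v, dict):
            # Descend: only non-dictionary values contribute a path.
            dict_2_list_of_keys(v, l, loc)
        else:
            l.append(loc.copy())
        loc.pop()
    return l


paths = dict_2_list_of_keys(
    {'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}}, [], [])
print(sorted(paths))  # [['k1'], ['k2', 'k21'], ['k2', 'k22']]
```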


-
But even with that fix, the doctest still doesn't pass:

Failed example:
    dict_2_list_of_keys({'k1': 'v1', 'k2': {'k21': 'v21', 'k22': 'v22'}}, [], [])
Expected:
    [['k1'], ['k2', 'k21'], ['k2', 'k22']]
Got:
    [['k2', 'k22'], ['k2', 'k21'], ['k1']]


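That's because iteration over a dictionary is not guaranteed to happen in any particular order in the Python version used here (3.4). Since Python 3.7, dictionaries iterate in insertion order, but sorting the results still makes the doctest independent of how the dictionary was built. So in order to have a consistently reproducible doctest, you should sort the results.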
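
A small illustration of why sorting helps, assuming Python 3.7 or later: two equal dictionaries built in a different order iterate differently, but their sorted keys agree.

```python
a = {"k1": "v1", "k2": "v2"}
b = {"k2": "v2", "k1": "v1"}  # same contents, different insertion order

print(a == b)                  # True: equality ignores order
print(list(a), list(b))        # key order follows insertion order on Python 3.7+
print(sorted(a) == sorted(b))  # True: sorted output does not depend on order
```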

-
When producing a series of results in Python, it is usually more convenient to generate the results using the yield statement, rather than appending them to a list as you do. This avoids the need to pass in the l argument.

Putting all this together, I'd write the function like this:

def key_sequences(d):
    """Given a (possibly nested) dictionary d, generate tuples giving the
    sequence of keys needed to reach each non-dictionary value in d
    (and its nested sub-dictionaries, if any).

    For example, given the dictionary:

    >>> d = {1: 0, 2: {3: 0, 4: {5: 0}}, 6: 0}

    there are non-dictionary values at d[1], d[2][3], d[2][4][5], and
    d[6], and so:

    >>> sorted(key_sequences(d))
    [(1,), (2, 3), (2, 4, 5), (6,)]

    """
    for k, v in d.items():
        if isinstance(v, dict):
            for seq in key_sequences(v):
                yield (k,) + seq
        else:
            yield (k,)
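
As a usage sketch, each generated key sequence can be followed back into the dictionary to fetch the value it points at. The helper `get_nested` is hypothetical and not part of the original code; `key_sequences` is repeated here only so the example runs on its own.

```python
from functools import reduce
from operator import getitem


def key_sequences(d):
    """Generate the key tuples leading to each non-dictionary value in d."""
    for k, v in d.items():
        if isinstance(v, dict):
            for seq in key_sequences(v):
                yield (k,) + seq
        else:
            yield (k,)


def get_nested(d, seq):
    """Hypothetical helper: follow the keys in seq through nested dictionaries."""
    return reduce(getitem, seq, d)


d = {1: 'a', 2: {3: 'b', 4: {5: 'c'}}, 6: 'd'}
for seq in sorted(key_sequences(d)):
    print(seq, get_nested(d, seq))
```

Because the generator yields tuples, the sequences can be sorted, stored in sets, or used as dictionary keys, none of which works with lists.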

Context

StackExchange Code Review Q#55588, answer score: 6
