Numerate every item in dict
Problem
I have a .json file with a complex structure: dicts nested within dicts. The structure is not fixed and changes dynamically.
The goal is to build a new dict whose keys are hierarchy numbers and whose values are strings composed of all the preceding steps, like this:
```
2 drinks
2.1 drinks coffee
2.1.1 drinks coffee instant
2.1.2 drinks coffee real
2.2 drinks tea
2.3 drinks water
```
I wrote a script, and it works. But as I am new to Python and to programming, please check it and correct me so that the code looks and performs more elegantly and in a more Pythonic way.
Also, I would like to reduce the execution time, because the input .json file is going to be quite big.
Sample cat.json file:
```
{
  "communication": {
    "mobile": {
      "vodafone": {"subscr": "", "txt": "", "mms": "", "internet": "", "calls":
        {"in": {"home": "", "roaming": ""}, "out": {"home": "", "roaming": ""}}},
      "verizon": {"subscr": "", "txt": "", "mms": "", "internet": "1Gb", "calls":
        {"in": {"home": "", "roaming": ""}, "out": {"home": "500 min", "roaming": "Other country"}}}
    },
    "internet": "SomeProviderName"
  },
  "food": {"dairy": {"cheese": "Gauda", "milk": {"brand": "name", "origin": "place"}}},
  "drinks": {
    "water": "",
    "tea": "",
    "coffee": {
      "instant": "",
      "real": ""
    }
  }
}
```
Script:
```
#!/usr/bin/env python -tt
# -*- coding: utf-8 -*-
import json

filetoread='cat.json'

def load_existed(filetoread):
    try:
        data=json.loads(open(filetoread).read())
        return data
    except ValueError:
        print 'data loading error'

cat_data=load_existed(filetoread)

def walk_dict(d,mess,ln,new_dict,crumbs):
    inter=1
    lc=list(mess)
    last_crumb=crumbs.split( )
    for k,v in sorted(d.items(),key=lambda x: x[0]):
        if mess=='':
            mess=str(inter)
            lc=list(mess)
            last_crumb=crumbs.split( )
        if isinstance(v, dict) :
            ln=len(v)
            lc[len(lc)-1]=str(inter)
            # ... (script truncated here)
```
Solution
Some notes about your code:
- for readability, you should add spaces around operators
- you should use `with` to open files; also, instead of first reading the file into a string and then using `json.loads`, just use `json.load` on the file itself
- instead of running your own counter with `inter`, use `enumerate`
- using `key=lambda x: x[0]` is pointless, it does not affect the sort order at all
- you are constantly converting between strings and lists; stick to one representation, e.g. here: `crumbs = " ".join(last_crumb)` and right in the next line: `if len(crumbs.split()) > 0:`; instead, just test `if last_crumb:` or `if crumbs:`, whichever you prefer
- `lc[len(lc)-1]` is the same as `lc[-1]`
- there's some code duplication in the `if/else`; try to move that outside
- your function seems to miss some of the items, e.g. you never output `1.2.2.4 communication mobile vodafone subscr`; this is because you overwrite the `ln` parameter with `ln = len(v)`; not sure why you need that `ln` parameter anyway...
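Several of the notes above can be checked in isolation. The following is only a minimal sketch, written for Python 3; the helper name `load_json` and the small sample dict are assumptions made for illustration and are not part of the original script.

```python
import json


def load_json(path):
    """Hypothetical helper: parse JSON straight from the file object.

    `with` closes the file even if parsing raises an error.
    """
    with open(path) as f:
        return json.load(f)


# Hypothetical sample standing in for one level of cat.json.
d = {"water": "", "tea": "", "coffee": {"instant": "", "real": ""}}

# Dict keys are unique, so comparing (key, value) tuples is always decided
# by the key alone; the explicit key function does not change the order.
assert sorted(d.items()) == sorted(d.items(), key=lambda x: x[0])

# enumerate(..., 1) replaces a hand-maintained counter such as `inter`.
numbered = list(enumerate(sorted(d), 1))
print(numbered)  # [(1, 'coffee'), (2, 'tea'), (3, 'water')]

# Negative indices address elements from the end of a list.
lc = list("1.2")
assert lc[len(lc) - 1] == lc[-1]
```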
Taking those into account, your `walk_dict` function can be simplified significantly:
```
def walk_dict(d, mess, new_dict, crumbs):
    lc = list(mess) if mess else ["1"]
    for inter, (k, v) in enumerate(sorted(d.items()), 1):
        lc[-1] = str(inter)
        mess = "".join(lc)
        if isinstance(v, dict) :
            crumbs2 = (crumbs + " " + k) if crumbs else k
            walk_dict(v, mess + '.1', new_dict, crumbs2)
        else:
            crumbs2 = crumbs + " " + k + " " + v
        new_dict[mess] = crumbs2
    return new_dict
```
But the way you convert `mess` into a list `lc` and then replace the last element is still, quite fittingly, a "mess". The same goes for the way you are mixing "output parameters" and return values.
Here's how I would do it:
```
import json

def walk_dict(d, key=None, parent=None):
    res = {}
    for i, e in enumerate(sorted(d), 1):
        # k is the hierarchy number, p the path of keys leading to this item
        k = (key + "." + str(i)) if key else str(i)
        p = (parent + " " + e ) if parent else e
        if isinstance(d[e], dict):
            res.update(walk_dict(d[e], k, p))
            res[k] = p
        else:
            res[k] = p + " " + str(d[e])
    return res

with open('data.json') as f:
    cat_data = json.load(f)
new_data = walk_dict(cat_data)
for v in sorted(new_data):
    print v, new_data[v]
```
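For readers on Python 3, the same idea might be sketched as follows. This is an adaptation under stated assumptions, not the code from the answer: `print` becomes a function, the sample data is an inlined excerpt of cat.json, and the hypothetical helper `numeric_key` sorts hierarchy numbers by their integer parts, because plain string sorting places `1.10` before `1.2` once a level has ten or more entries.

```python
def walk_dict(d, key=None, parent=None):
    """Map hierarchy numbers such as '2.1.1' to the path of keys leading there."""
    res = {}
    for i, name in enumerate(sorted(d), 1):
        k = "{}.{}".format(key, i) if key else str(i)
        p = "{} {}".format(parent, name) if parent else name
        value = d[name]
        if isinstance(value, dict):
            res[k] = p
            res.update(walk_dict(value, k, p))
        else:
            # Leaf: append the value to the path, as the version above does.
            res[k] = "{} {}".format(p, value)
    return res


def numeric_key(number):
    """Hypothetical helper: '1.10' -> (1, 10), so 1.10 sorts after 1.2."""
    return tuple(int(part) for part in number.split("."))


# Excerpt of the sample cat.json, inlined so the sketch runs without a file.
cat_data = {
    "food": {"dairy": {"cheese": "Gauda"}},
    "drinks": {"water": "", "tea": "", "coffee": {"instant": "", "real": ""}},
}

new_data = walk_dict(cat_data)
for number in sorted(new_data, key=numeric_key):
    print(number, new_data[number])
```

Because leaf values are appended to the path, an empty value leaves a trailing space in the resulting string; calling `.rstrip()` on it is one way to drop that if it matters.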
Context
StackExchange Code Review Q#88557, answer score: 5