A python default dictionary which seamlessly saves to disk
Problem
I sometimes run experiments at work and separate the computation from the analysis, so that I can do the computation on a cluster and the analysis locally, sometimes in a Jupyter notebook. I wrote a class that lets me save results to a hidden file as if it were a dictionary. The idea is to create an object specifying the name of the experiment; from there you can use it like a dictionary, and it is saved to disk so you can access it from other Python files. I'd appreciate any thoughts, since I/O isn't my forte. I used Python 2.7, but I think it should work for Python 3.0.
import os
import cPickle as pickle

class FileDict():
    def __init__(self, name, default = None):
        self.fpath = '.{}.fd'.format(name)
        self.default = default

    def __getitem__(self, key):
        if os.path.isfile(self.fpath):
            d = pickle.load(open(self.fpath))
            if key in d:
                return d[key]
        else:
            return self.default

    def __setitem__(self, key, value):
        if os.path.isfile(self.fpath):
            d = pickle.load(open(self.fpath))
            d[key] = value
        else:
            d = {key : value}
        pickle.dump(d, open(self.fpath, 'w'))

if __name__ == '__main__':
    test = FileDict('test', 0)
    print(test[1])
    test[1] = 'thing'
    print(test[1])
    print(test[2])
Solution
- I think you should also return the default value when the file exists but the key is not found in it.
- Opening the file every time you want to get or set a value is expensive. Consider reading it in the `__init__` method, saving the data in a handler registered with the `atexit` module, and adding a `flush()` method in case you want to dump the data immediately. You could also add a `no_cache` option to `__init__` to force saving things right away.
- The `pickle` module is potentially insecure. Consider the famous example `import pickle; pickle.loads("cantigravity\n")`. If you open a maliciously prepared file with your class, weird things can happen. It should be used for internal purposes only, not for a general-purpose class that can read any input.
- What if multiple instances of `FileDict` use the same file as their source? Consider using the `tempfile` module or generating unique names with the `uuid` module, and then exposing the generated name through some method.
- Following the `collections.defaultdict` example, you might want to use a callable to generate the default value. You can always pass `lambda: 0` if you only want a single value returned.
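Several of the suggestions above can be combined into one rough sketch. This is only an illustration under assumptions, not the original class: the name `CachedFileDict`, the `default_factory` and `no_cache` parameters, and the `name` attribute are all hypothetical, and the code assumes Python 3 (plain `pickle` with binary file modes rather than `cPickle`). It still relies on `pickle`, so the security caveat still applies.

```python
import atexit
import os
import pickle
import uuid


class CachedFileDict(object):
    """Hypothetical variant of FileDict that keeps its data in memory."""

    def __init__(self, name=None, default_factory=None, no_cache=False):
        # Assumption: generate a unique name when none is given, and expose it
        # as an attribute so another process can be told which file to open.
        self.name = name if name is not None else uuid.uuid4().hex
        self.fpath = '.{}.fd'.format(self.name)
        self.default_factory = default_factory
        self.no_cache = no_cache
        # Read the file once here instead of on every access.
        if os.path.isfile(self.fpath):
            with open(self.fpath, 'rb') as f:
                self._data = pickle.load(f)
        else:
            self._data = {}
        # Write the data out when the interpreter exits normally.
        atexit.register(self.flush)

    def __getitem__(self, key):
        if key in self._data:
            return self._data[key]
        # Missing key: use the factory whether or not the file exists.
        if self.default_factory is None:
            return None
        return self.default_factory()

    def __setitem__(self, key, value):
        self._data[key] = value
        if self.no_cache:
            # Caller asked for every write to reach the disk immediately.
            self.flush()

    def flush(self):
        """Dump the in-memory data to disk right now."""
        with open(self.fpath, 'wb') as f:
            pickle.dump(self._data, f)
```

With this sketch, `CachedFileDict('test', default_factory=lambda: 0)` would return `0` for any missing key, and nothing is written until `flush()` is called, the interpreter exits, or `no_cache=True` is set. Note that `atexit` handlers do not run if the process is killed by a signal or crashes hard, so long cluster jobs may still want explicit `flush()` calls.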
Context
StackExchange Code Review Q#157809, answer score: 3