HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

A module to make JSON in Python easier

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
modulemakepythonjsoneasier

Problem

I recently wrote the livejson module as a way to make working with JSON in Python easier. The idea of the module is that you initialize a livejson.File, and then you can interact with your JSON file with the same interface as a Python dict, except your changes are reflected in the file in realtime.

I'd like to know how I can improve this.

  • Is my design good, in terms of things like naming, OOP design, etc.?



  • Are there ways I could improve performance without decreasing reliability? I implemented the context manager with a concept of "grouped writes" specifically so that repetitive operations could achieve the performance of the native objects.



  • Should I be concerned about thread safety?



The code

Here's the code. I've tried to document it well with comments and docstrings, and also tried to follow PEP8 so I hope it's readable:

Imports:

"""A module implementing a pseudo-dict class which is bound to a JSON file.

As you change the contents of the dict, the JSON file will be updated in
real-time. Magic.
"""

import collections
import os
import json


The code begins with some helper functions and generic base classes:

```
# MISC HELPERS

def _initfile(path, data="dict"):
"""Initialize an empty JSON file."""
data = {} if data.lower() == "dict" else []
# The file will need to be created if it doesn't exist
if not os.path.exists(path): # The file doesn't exist
# Raise exception if the directory that should contain the file doesn't
# exist
dirname = os.path.dirname(path)
if dirname and not os.path.exists(dirname):
raise IOError(
("Could not initialize empty JSON file in non-existant "
"directory '{}'").format(os.path.dirname(path))
)
# Write an empty file there
with open(path, "w") as f:
json.dump(data, f)
return True
elif len(open(path, "r").read()) == 0: # The file is empty
with open(path, "w") as

Solution

From a first look pretty clean code, and you also presented it in a nice
fashion, so off to a good start.

I'll go with your questions first:

  • Naming looks mostly good, though I'd wager that at some point generic


names with "Base" in it will make it harder to understand than
necessary - unfortunately I don't have a good suggestion as to what
other names would be more helpful.

  • Any unspecific question about performance should get the answer "take


a profiler and look" IMO. Really, there's more to worry about than
raw numbers, including the fact that this is Python. For starts the
regular json module also isn't the fastest. However correctness
comes first, so

  • not only is thread safety a concern, access from multiple processes is


a concern even more so. The one pattern that should really be used
here at least is to first create a temporary file with the
to-be-written content, then atomically moving that to the
destination. I'm not sure what's the canonical source for this, but
this blog post
is a nice start. The set_data method is particularly bad in that
respect.

Okay, so the code is almost PEP8 compatible, but there's still a lot of
camel case in there. IMO that's fine as long as it's consistent.

One note:

  • Replacing self.__class__ ... that works? Even if it did that looks


extremely fishy,
c.f. this SO post
for example. I share the sentiment that a different mechanism would
be better - if I see ...File() as a constructor call I definitely
don't expect a subclass of that to be the object in question. Not
doing "clever" things will save people minutes to hours of debugging
time, and this is definitely too clever. Just replace it with a
regular function like with livejson.mapped("test.json") as f: or so
and make it clear that you get a File subclass as a result would be
enough for me.

Context

StackExchange Code Review Q#133942, answer score: 3

Revisions (0)

No revisions yet.