Python lazy dict with iterator in constructor
Problem
This question began as an off-topic answer to this question, but the code here serves a different goal.
I wrote the following class for the purpose of populating a dict on demand from an iterator. The intent of the iterator passed to the constructor is that it could alternatively be passed to `dict`, which would consume the entire iterator in its constructor; an instance of this class consumes the iterator just far enough to locate a requested item. Such an iterator would be similar in spirit to a return value from `dict.items`.

```
class LazyDict(dict):
    """ A dict built on demand from an iterator """
    def __init__(self, iterator):
        super().__init__()
        self.iterator = iterator
    def __getitem__(self, item):
        while not self.get(item):
            try:
                (key, value) = next(self.iterator)
                self[key] = value
            except StopIteration:
                raise AttributeError
        return super().__getitem__(item)
    def __contains__(self, item):
        try:
            self[item] # pylint: disable=pointless-statement
            return True
        except AttributeError:
            return False
```

Here is my calling code (with other details of the `Directory` code omitted; if more of that is needed in this context I can provide it):

```
class Directory(object):
    @Lazy
    def hash(self):
        """ Lazy dict mapping entry names to entries """
        return LazyDict((self.name(entry), entry) for entry in self.readdir())
    def __contains__(self, name):
        return name in self.hash # pylint: disable=unsupported-membership-test
```
Solution
Design

I think that it's a mistake to inherit from `dict`. My reasoning is as follows:

- `LazyDict` is effectively read-only: that is, setting an item does not update the underlying iterator. (In the described use case, you can't update the compressed file image through this dictionary.) So it is misleading to offer (as you do) a `__setitem__` method (and similarly for `update`, `pop`, `setdefault` and other mutating methods).
- Programmers are used to the equivalence of dictionary methods: for example, they "know" that `d.get(k)` behaves just the same as `d[k] if k in d else None`. But in your implementation it does not: calling `d.get(k)` does not consult the iterator but calling `k in d` or `d[k]` does. This seems like a recipe for confusion and bugs.

So I think the better approach is not to inherit from `dict`, but to have a dictionary as an attribute. This means that users will only be able to call the methods that you choose to implement, instead of accidentally calling through to methods on the underlying `dict`. This approach also makes the code clearer, because you can distinguish `key in self` from `key in self._dict` without needing to use `super()` or have pylint annotations.

Other review comments
- The parameter to the `__getitem__` and `__contains__` methods would be better named `key`.
- The exception raised from a failed key lookup should be `KeyError`, not `AttributeError`, for consistency with `dict`.
- The exception raised on a failed key lookup should include the failed key, for consistency with `dict` and to help programmers track down errors.
- It's a good idea to keep `try: ... except: ...` blocks as small as possible, so that you don't accidentally capture exceptions that you weren't expecting. In this case you are expecting a `StopIteration` from the call to `next` so that's the only line that needs to be protected.
- It's conventional to give attributes that are not intended to be used outside of the class (like `iterator`) the prefix `_`.

Revised code
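The `get` inconsistency described under Design can be reproduced with a short sketch. It repeats the question's `LazyDict` unchanged and feeds it a made-up list of pairs; the sample data and the variable names `before`, `found`, and `after` are assumptions for illustration, not part of the original post.

```python
class LazyDict(dict):
    """The question's dict subclass, repeated unchanged for the demonstration."""

    def __init__(self, iterator):
        super().__init__()
        self.iterator = iterator

    def __getitem__(self, item):
        while not self.get(item):
            try:
                (key, value) = next(self.iterator)
                self[key] = value
            except StopIteration:
                raise AttributeError
        return super().__getitem__(item)

    def __contains__(self, item):
        try:
            self[item]
            return True
        except AttributeError:
            return False


# Hypothetical sample data: any iterator of (key, value) pairs would do.
d = LazyDict(iter([("a", 1), ("b", 2)]))

before = d.get("b")  # inherited dict.get never consults the iterator
found = "b" in d     # __contains__ pulls pairs until "b" turns up
after = d.get("b")   # the same call now sees the cached entry

print(before, found, after)  # None True 2
```

The same `get` call returns different results depending on whether a membership test happened first, which is the kind of surprise that keeping the dictionary as a private attribute is meant to rule out.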
```
class LazyDict:
    """A dictionary built on demand from an iterator."""
    def __init__(self, iterator):
        self._dict = {}
        self._iterator = iterator
    def __getitem__(self, key):
        if key in self:
            return self._dict[key]
        else:
            raise KeyError(key)
    def __contains__(self, key):
        while key not in self._dict:
            try:
                k, v = next(self._iterator)
            except StopIteration:
                return False
            self._dict[k] = v
        return True
```

Now someone who tries to call the `get` method will get an exception instead of silently getting the wrong result.

Code Snippets
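A usage sketch may help show how the revised class behaves. It repeats the answer's `LazyDict` so that it runs on its own; the `pairs` generator, the `consumed` list, and the sample keys are hypothetical helpers added only to make the lazy consumption visible.

```python
class LazyDict:
    """The revised class from the answer, repeated so the sketch is self-contained."""

    def __init__(self, iterator):
        self._dict = {}
        self._iterator = iterator

    def __getitem__(self, key):
        if key in self:
            return self._dict[key]
        else:
            raise KeyError(key)

    def __contains__(self, key):
        while key not in self._dict:
            try:
                k, v = next(self._iterator)
            except StopIteration:
                return False
            self._dict[k] = v
        return True


consumed = []  # records which keys have been pulled from the source


def pairs():
    """Hypothetical source of (key, value) pairs."""
    for key, value in [("a", 1), ("b", 2), ("c", 3)]:
        consumed.append(key)
        yield key, value


d = LazyDict(pairs())

first = d["b"]                       # pulls "a" and "b", leaves "c" unread
seen_after_lookup = list(consumed)
print(first, seen_after_lookup)      # 2 ['a', 'b']

missing = "z" in d                   # exhausts the iterator, then gives up
print(missing, consumed)             # False ['a', 'b', 'c']

try:
    d["z"]
except KeyError as exc:
    print(repr(exc))                 # KeyError('z')

print(hasattr(d, "get"))             # False: no inherited dict.get to misuse
```

Looking up `"b"` consumes only the first two pairs, while a failed lookup has to drain the iterator before it can report the key as missing.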
```
class LazyDict:
    """A dictionary built on demand from an iterator."""
    def __init__(self, iterator):
        self._dict = {}
        self._iterator = iterator
    def __getitem__(self, key):
        if key in self:
            return self._dict[key]
        else:
            raise KeyError(key)
    def __contains__(self, key):
        while key not in self._dict:
            try:
                k, v = next(self._iterator)
            except StopIteration:
                return False
            self._dict[k] = v
        return True
```

Context
StackExchange Code Review Q#132469, answer score: 3