Python lazy dict with iterator in constructor

Submitted by: @import:stackexchange-codereview

Problem

This question began as an off-topic answer to this question, but the code here serves a different goal.

I wrote the following class for the purpose of populating a dict on demand from an iterator. The intent of the iterator passed to the constructor is that it could alternatively be passed to dict, which would consume the entire iterator in its constructor; an instance of this class consumes the iterator just far enough to locate a requested item. Such an iterator would be similar in spirit to a return value from dict.items.

class LazyDict(dict):
    """ A dict built on demand from an iterator """

    def __init__(self, iterator):
        super().__init__()
        self.iterator = iterator

    def __getitem__(self, item):
        while not self.get(item):
            try:
                (key, value) = next(self.iterator)
                self[key] = value
            except StopIteration:
                raise AttributeError
        return super().__getitem__(item)

    def __contains__(self, item):
        try:
            self[item]  # pylint: disable=pointless-statement
            return True
        except AttributeError:
            return False


Here is my calling code (with other details of the Directory code omitted; if more of that is needed in this context I can provide it):

```
class Directory(object):
    @Lazy
    def hash(self):
        """ Lazy dict mapping entry names to entries """
        return LazyDict((self.name(entry), entry) for entry in self.readdir())

    def __contains__(self, name):
        return name in self.hash  # pylint: disable=unsupported-membership-test
```

Solution


  1. Design



I think that it's a mistake to inherit from dict. My reasoning is as follows:

- LazyDict is effectively read-only: that is, setting an item does not update the underlying iterator. (In the described use case, you can't update the compressed file image through this dictionary.) So it is misleading to offer (as you do) a __setitem__ method (and similarly for update, pop, setdefault and other mutating methods).

- Programmers are used to the equivalence of dictionary methods: for example, they "know" that d.get(k) behaves just the same as d[k] if k in d else None. But in your implementation it does not: calling d.get(k) does not consult the iterator, but calling k in d or d[k] does. This seems like a recipe for confusion and bugs.
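A small sketch of the second point, reusing the class body from the question. The name `InheritingLazyDict` and the sample pairs are made up for illustration; the behaviour shown relies on `dict.get` not calling an overridden `__getitem__`.

```python
class InheritingLazyDict(dict):
    """The question's dict subclass, reproduced to show the inconsistency."""

    def __init__(self, iterator):
        super().__init__()
        self.iterator = iterator

    def __getitem__(self, item):
        while not self.get(item):
            try:
                (key, value) = next(self.iterator)
                self[key] = value
            except StopIteration:
                raise AttributeError
        return super().__getitem__(item)


d = InheritingLazyDict(iter([("a", 1), ("b", 2)]))

# get() is inherited from dict and never touches the iterator,
# so it misses a key that the iterator would still supply.
before = d.get("b")

# Subscripting drives the iterator until "b" turns up.
value = d["b"]

# Only now does get() agree with subscripting.
after = d.get("b")

print(before, value, after)  # None 2 2
```

The same key gives two different answers depending on which method is called first, which is exactly the kind of surprise a dict subclass invites.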

So I think the better approach is not to inherit from dict, but to have a dictionary as an attribute. This means that users will only be able to call the methods that you choose to implement, instead of accidentally calling through to methods on the underlying dict. This approach also makes the code clearer, because you can distinguish key in self from key in self._dict without needing to use super() or have pylint annotations.

  2. Other review comments



- The parameter to the __getitem__ and __contains__ methods would be better named key.

- The exception raised from a failed key lookup should be KeyError, not AttributeError, for consistency with dict.

- The exception raised on a failed key lookup should include the failed key, for consistency with dict and to help programmers track down errors.

- It's a good idea to keep try: ... except: ... blocks as small as possible, so that you don't accidentally capture exceptions that you weren't expecting. In this case you are expecting a StopIteration from the call to next so that's the only line that needs to be protected.

- It's conventional to give attributes that are not intended to be used outside of the class (like iterator) the prefix _.
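To illustrate why the exception type matters, here is a rough sketch of typical caller code that treats KeyError as "key absent". The function `lookup` and the two stand-in classes are hypothetical and exist only for this example.

```python
def lookup(mapping, key, default=None):
    """Typical caller code: treats KeyError as 'key absent'."""
    try:
        return mapping[key]
    except KeyError:
        return default


class RaisesAttributeError:
    """Hypothetical stand-in that signals a miss the way the question's class does."""

    def __getitem__(self, key):
        raise AttributeError


class RaisesKeyError:
    """Hypothetical stand-in that reports the missing key, as dict does."""

    def __getitem__(self, key):
        raise KeyError(key)


# The KeyError is handled by the caller, which falls back to the default.
ok = lookup(RaisesKeyError(), "x", default="fallback")

# The AttributeError is not what the caller expects, so it escapes.
try:
    lookup(RaisesAttributeError(), "x", default="fallback")
    escaped = False
except AttributeError:
    escaped = True

# Including the key makes the error message name what was missing.
message = str(KeyError("x"))

print(ok, escaped, message)  # fallback True 'x'
```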

  3. Revised code



class LazyDict:
    """A dictionary built on demand from an iterator."""
    def __init__(self, iterator):
        self._dict = {}
        self._iterator = iterator

    def __getitem__(self, key):
        if key in self:
            return self._dict[key]
        else:
            raise KeyError(key)

    def __contains__(self, key):
        while key not in self._dict:
            try:
                k, v = next(self._iterator)
            except StopIteration:
                return False
            self._dict[k] = v
        return True


Now someone who tries to call the get method will get an AttributeError instead of silently getting the wrong result.
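A short usage sketch of the revised class, under the assumption that the source is a generator of (key, value) pairs. The `pairs` generator and the `consumed` list are hypothetical helpers added only to make the laziness visible.

```python
class LazyDict:
    """A dictionary built on demand from an iterator."""
    def __init__(self, iterator):
        self._dict = {}
        self._iterator = iterator

    def __getitem__(self, key):
        if key in self:
            return self._dict[key]
        else:
            raise KeyError(key)

    def __contains__(self, key):
        while key not in self._dict:
            try:
                k, v = next(self._iterator)
            except StopIteration:
                return False
            self._dict[k] = v
        return True


consumed = []  # records which keys the generator has produced so far


def pairs():
    """Hypothetical source of (key, value) pairs."""
    for key in "abc":
        consumed.append(key)
        yield key, key.upper()


d = LazyDict(pairs())

value = d["b"]               # reads "a" and "b" from the generator, then stops
snapshot = list(consumed)    # "c" has not been read yet

missing = "z" in d           # exhausts the generator and reports False
has_get = hasattr(d, "get")  # no inherited dict methods to call by accident

try:
    d["z"]
except KeyError as exc:
    failed_key = exc.args[0]  # the KeyError carries the key that was not found

print(value, snapshot, missing, has_get, failed_key)
# B ['a', 'b'] False False z
```

Note that a lookup for an absent key consumes the whole iterator, since there is no other way to know the key will never appear.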

Context

StackExchange Code Review Q#132469, answer score: 3
