
Managing and searching objects using tags

Submitted by: @import:stackexchange-codereview

Problem

I wonder:

- Is it appropriate to hide imported classes (collections and UserDict in this case) from a Python IDE (e.g. IPython)?
- Is there a more efficient algorithm/implementation?

Please feel free to comment on how you would improve this class.

```
import collections as _collections
import UserDict as _UserDict

class _IdDict(_UserDict.IterableUserDict):
    def __missing__(self,key):
        raise KeyError("The item requested is not in the TagDict. "+\
                       "Perhaps more than one item were requested.")

class TagDict(object):
    '''
    TagDict is similar to a dictionary except

    Keys are unique tags/attributes of all the items
    Each key can be mapped onto multiple items that have
    that key as a tag

    TagDict[*tags] returns a list of items that share the same tags
    TagDict["*"] returns all the items
    '''
    def __init__(self):
        # Keys are tags. Values are sets of ids
        self.data = _collections.defaultdict(set)
        # Keys are ids of the objects. Values are (object,tags)
        self._ids = _IdDict()

    def add(self,item,tags):
        ''' Add an item with a list of tags
        if tags is empty, the item will not be added to
        the TagDict
        Input:
        item - an object
        tags - a string or a list of strings
        '''
        if type(tags) is str: tags = [tags,]
        tags = set(tags)
        self._ids[id(item)] = (item,tags)
        for tag in tags:
            self.data[tag].add(id(item))

    def __getitem__(self,tags):
        ''' Get the items that share the tags

        Return:
        A list of object
        If the list contains only one object, return the object

        Example:
        TagDict["a","b"] returns all items that have both "a"
        and "b" as tags
        TagDict["*"] returns all the items in the TagDict

        '''
        if tags[0] == '*':
            return [ value[0] for value in self._ids.values() ]
        if type(tags)
```

Solution

Sean Perry has made several good points, which I won't duplicate. I do think, though, that your imports with leading underscores are fine! A leading underscore in a global name suggests that the item is not part of the module's public interface.
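To illustrate the underscore convention, here is a small sketch. The in-memory module name `tagdemo` and the function `make_index` are made up for the demonstration. It shows that a star-import (with no `__all__` defined) skips names that start with an underscore:

```python
import sys
import types

# Source of a throwaway module; "tagdemo" and make_index are hypothetical names.
source = '''
import collections as _collections

def make_index():
    return _collections.defaultdict(set)
'''

# Build the module in memory and register it so it can be imported.
module = types.ModuleType("tagdemo")
exec(source, module.__dict__)
sys.modules["tagdemo"] = module

# Star-import into a fresh namespace and list what came across.
namespace = {}
exec("from tagdemo import *", namespace)
public_names = sorted(name for name in namespace if not name.startswith("__"))
print(public_names)  # ['make_index']
```

The aliased import `_collections` stays inside the module, so tab completion on a star-imported namespace only offers the public function.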

First off, there is usually no need to use UserDict in new code. As the docs for that module say:


The need for this class has been largely supplanted by the ability to subclass directly from dict (a feature that became available starting with Python version 2.2). Prior to the introduction of dict, the UserDict class was used to create dictionary-like sub-classes that obtained new behaviors by overriding existing methods or adding new ones.

So, your _IdDict class should probably inherit from dict directly rather than from UserDict, unless you need to support Python versions older than 2.2! This will also improve your forward compatibility, as the UserDict module has been removed in Python 3.
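A minimal sketch of what that could look like, assuming the only customisation you need is the error message. `dict` subclasses honour `__missing__` on `d[key]` lookups, so the body of the class barely changes:

```python
class _IdDict(dict):
    """Plain dict subclass; __missing__ only customises the error message."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        raise KeyError("The item requested is not in the TagDict.")


ids = _IdDict()
ids[1] = ("item", {"a"})

try:
    ids[2]
except KeyError as exc:
    print(exc)
```

Note that `__missing__` is only consulted by subscript access; `ids.get(2)` and `2 in ids` behave as they do on an ordinary dict.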

Or, you could probably do without the special dict subclass entirely, and handle the exception raising in the TagDict yourself. Just catch whatever exception gets raised by a normal dictionary, and raise your own (in Python 3, you'd want to use raise Whatever() from None to suppress the previous exception context, but in Python 2 that's neither possible nor necessary).
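As a rough Python 3 sketch of that alternative: the class `IdRegistry`, its methods, and the exception `TagLookupError` are hypothetical names used only for illustration, not part of your code.

```python
class TagLookupError(KeyError):
    """Hypothetical exception type introduced for this example."""


class IdRegistry(object):
    """Stand-in for the id bookkeeping part of TagDict, using a plain dict."""

    def __init__(self):
        self._ids = {}  # id -> (item, tags)

    def register(self, item, tags):
        self._ids[id(item)] = (item, set(tags))

    def lookup(self, item_id):
        try:
            return self._ids[item_id]
        except KeyError:
            # "from None" hides the original KeyError from the traceback
            # (Python 3 only).
            raise TagLookupError("no item with id %r" % (item_id,)) from None
```

In Python 2 you would simply drop the `from None` part, since there is no implicit exception chaining to suppress.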

I see a few things that could be improved in your TagDict class itself.

You should probably add a check in add to make sure the item being added isn't in the dictionary already. If it is and the tags it's being added under are not the same as the ones it was under previously, you may end up with inconsistent information in your data and _ids dicts.
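One possible policy, sketched below, is to treat a repeated add as a re-tagging and remove the stale tag entries first. Raising an exception instead would be just as reasonable. `TagIndex` is a hypothetical, stripped-down stand-in for your class that only shows `add`:

```python
import collections


class TagIndex(object):
    """Hypothetical stand-in for TagDict, showing only add()."""

    def __init__(self):
        self.data = collections.defaultdict(set)  # tag -> set of ids
        self._ids = {}                            # id -> (item, tags)

    def add(self, item, tags):
        if isinstance(tags, str):
            tags = [tags]
        tags = set(tags)
        key = id(item)
        if key in self._ids:
            # Item was added before: drop it from tags it no longer carries.
            _, old_tags = self._ids[key]
            for tag in old_tags - tags:
                self.data[tag].discard(key)
                if not self.data[tag]:
                    del self.data[tag]  # do not keep empty tag entries
        self._ids[key] = (item, tags)
        for tag in tags:
            self.data[tag].add(key)


index = TagIndex()
item = object()
index.add(item, ["a", "b"])
index.add(item, "b")  # re-tag: "a" is removed, "b" is kept
```

After the second call, both dictionaries agree that the item carries only the tag "b".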

In __getitem__ you have the expression self.data[tag] if tag in self.data else set() in your list comprehension. You can write this more concisely as self.data.get(tag, set()).
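A quick side-by-side check of the two spellings, using made-up tags and ids:

```python
import collections

data = collections.defaultdict(set)
data["red"].update({1, 2})
data["round"].update({2, 3})

tags = ("red", "round", "missing")

# Conditional expression, as in the original comprehension.
verbose = [data[tag] if tag in data else set() for tag in tags]
# Equivalent lookup with a default.
concise = [data.get(tag, set()) for tag in tags]

print(verbose == concise)  # True
```

Both forms also avoid a side effect of `defaultdict`: a bare `data[tag]` on an unknown tag would insert an empty set under that key, while `get` leaves the dictionary untouched.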

But you might need to think about whether that is what you actually want to happen. If a requested tag is not found in the data dictionary, the intersection of the sets is going to be empty. This means you'll end up returning an empty tuple. Perhaps you should raise an exception instead?
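If you decide an unknown tag should be an error, the lookup might be sketched like this. The helper name `ids_for_tags` and the choice of `KeyError` are assumptions for illustration:

```python
def ids_for_tags(data, tags):
    """Return the ids carrying every tag; raise KeyError for unknown tags."""
    if not tags:
        raise ValueError("at least one tag is required")
    unknown = sorted(tag for tag in tags if tag not in data)
    if unknown:
        raise KeyError("unknown tag(s): %s" % ", ".join(unknown))
    # Every tag is known, so an empty result now means "no item has all
    # of these tags" rather than "one of the tags does not exist".
    return set.intersection(*(data[tag] for tag in tags))


data = {"red": {1, 2}, "round": {2, 3}, "blue": {4}}
shared = ids_for_tags(data, ("red", "round"))
disjoint = ids_for_tags(data, ("red", "blue"))
```

This keeps the two situations distinguishable for the caller: a typo in a tag name fails loudly, while a valid but unsatisfied combination still returns an empty result.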

Context

StackExchange Code Review Q#49555, answer score: 3
