HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonModerate

A dictionary that allows multiple keys for one value

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
allowsvaluekeysonethatformultipledictionary

Problem

I'm trying to learn to program on my own with the help of books and the Internet and thought I'd start a project (a sub-project for a project).

I want to build a dictionary that allows multiple keys for one value:

d = mkdict()
d['what', 'ever'] = 'testing'
d['what'] is d['ever'] # True


I thought it would be good to get some advice from other programmers to better myself, and so I would like it if you were to give your honest opinion on how I've implemented it (the dos and don'ts) and possibly how you would approach things.

```
# -- coding: utf-8 --

class mkdict(object):
""" A dictionary that allows multiple keys for one value """

class _Container(object):
""" This is used to wrap an object to avoid infinite
recursion when calling my own methods from the inside.

If a method sees this container, it assumes it has been
called from the inside and not the user.
"""

def __init__(self, _object):
self.object = _object

class dict(object):
""" Interface for mkdict.dict for dict-like behaviour """

def __init__(self, d={}, **kwargs):
""" Using an mkdict._Container to avoid infinite
recursion when allowing:

>>> d = mkdict({'what': 'ever'})
>>> d = mkdict.dict({'what': 'ever'})
"""
if isinstance(d, mkdict._Container):
self.mkdict = d.object
else:
self.mkdict = mkdict(mkdict._Container(self))
self.update(d, **kwargs)

def __str__(self):
return str(self.mkdict._dict)

def __repr__(self):
return str(self)

def __len__(self):
return len(self.mkdict._dict)

def __setitem__(self, key, value):
""" Desired behaviour:

>>> d = mkdict()
>>>
>>> d['what', 'ever'] = 'testing'
>>> d
{'what': 'testing', 'ever': 'testing

Solution

I want to build a dictionary that allows multiple keys for one value:

This is a very basic relational database.

So the first piece of feedback is that new programmers often start coding with poorly defined test-cases or non-normalised test-cases whereby they get lost developing their test portfolio. If you would have started with clarifying your case, I'm sure you would have cracked this problem.

Let's start defining our operations from examples

Sometimes we will have the database maintaining links:

  • from 1 key to 1 value



  • from 1 key to N values



  • from N keys to 1 value



  • from N keys to N values



A dictionary can hold 1 key to N values, but not N keys to 1 value.

Fortunately the reverse operation of N keys to 1 value is the reverse of the dictionary, so we can trick around this by creating a class:

class Database(object):
    """ A dictionary that allows multiple keys for one value """

    def __init__(self):
        self.keys = {}
        self.values = {}


Next we need figure out how to get data into the database. This is similar to the SQL INSERT STATEMENT, where we maintain both our keys and values in single transactions:

def __setitem__(self, key, value):


Lets first insert a new key, whilst keeping in mind that the value might already exist:

if key not in self.keys:  
        if value not in self.values:  # it's a new value
            self.keys[key] = set()  # a new set
            self.keys[key].add(value)
            self.values[value] = set()  # a new set
            self.values[value].add(key)
        elif value in self.values:
            self.keys[key] = set()  # a new set
            self.keys[key].add(value)
            self.values[value].add(key) # (1)


(1) a new key to a known value only means an update to the values.

Next we worry about if it's a new relationship with a new value:

elif key in self.keys:  # 
        self.keys[key].add(value)
        if value not in self.values:
            self.values[value] = set()
            self.values[value].add(key)
        elif value in self.values:
            self.values[value].add(key)


We have now implemented the INSERT STATEMENT in our database.

Now it's time to worry about how to delete records and relationships. For this we hack the __delitem__ function to both take a key and a value. Why? Because otherwise we won't know whether the user wants to delete a single relationship only, OR all entries associated with the key

I thereby choose that:

  • if the value is None I delete every relationship associated with the key.



  • if the value is a valid relationship between the key and value, then I only delete that specific relationship.



This is how it works:

def __delitem__(self, key, value=None): 
    if value is None:
        # All the keys relations are to be deleted.
        try:
            value_set = self.keys[key]
            for value in value_set:
                self.values[value].remove(key)
                if not self.values[value]:
                    del self.values[value]
            del self.keys[key]  # then we delete the key.
        except KeyError:
            raise KeyError("key not found")
    else:  # then only a single relationships is being removed.
        try:
            if value in self.keys[key]:  # this is a set.
                self.keys[key].remove(value)
                self.values[value].remove(key)
            if not self.keys[key]:  # if the set is empty, we remove the key
                del self.keys[key]
            if not self.values[value]:  # if the set is empty, we remove the value
                del self.values[value]
        except KeyError:
            raise KeyError("key not found")


Now we can add and delete. But how about bulk updates? They constitute a special case because __setitem__ can't see that you want to propagate your update onto multiple values. So we need to tell it WHAT to update. Here is a proposal that uses the relationship between the key, the old value and the new value that should be accessible to all keys that other would have a relationship with the old value:

def update(self, key, old_value, new_value):
    if old_value in self.keys[key]:
        affected_keys = self.values[old_value]
        for key in affected_keys:
            self.__setitem__(key, new_value)
            self.keys[key].remove(old_value)
        del self.values[old_value]
    else:
        raise KeyError("key: {} does not have value: {}".format(key,old_value))


So far so good. A slightly annoying thing is that we can't really read what is in the database even though we can get things in and delete them. This calls for the SELECT STATEMENT - or python's __getitem__ method. But(!) we need to be careful, as our database stores data internally as a sets that are accessible from keys. So we need to unpack them onto something useful. As I like working with lists, I have chosen to provide lists, unless its a single value, whereby I only re

Code Snippets

class Database(object):
    """ A dictionary that allows multiple keys for one value """

    def __init__(self):
        self.keys = {}
        self.values = {}
def __setitem__(self, key, value):
if key not in self.keys:  
        if value not in self.values:  # it's a new value
            self.keys[key] = set()  # a new set
            self.keys[key].add(value)
            self.values[value] = set()  # a new set
            self.values[value].add(key)
        elif value in self.values:
            self.keys[key] = set()  # a new set
            self.keys[key].add(value)
            self.values[value].add(key) # (1)
elif key in self.keys:  # 
        self.keys[key].add(value)
        if value not in self.values:
            self.values[value] = set()
            self.values[value].add(key)
        elif value in self.values:
            self.values[value].add(key)
def __delitem__(self, key, value=None): 
    if value is None:
        # All the keys relations are to be deleted.
        try:
            value_set = self.keys[key]
            for value in value_set:
                self.values[value].remove(key)
                if not self.values[value]:
                    del self.values[value]
            del self.keys[key]  # then we delete the key.
        except KeyError:
            raise KeyError("key not found")
    else:  # then only a single relationships is being removed.
        try:
            if value in self.keys[key]:  # this is a set.
                self.keys[key].remove(value)
                self.values[value].remove(key)
            if not self.keys[key]:  # if the set is empty, we remove the key
                del self.keys[key]
            if not self.values[value]:  # if the set is empty, we remove the value
                del self.values[value]
        except KeyError:
            raise KeyError("key not found")

Context

StackExchange Code Review Q#85842, answer score: 10

Revisions (0)

No revisions yet.