HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Python Weighted Object Picker

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
weightedobjectpickerpython

Problem

I've designed a class ObjectPicker that can be given objects and weights. Objects can be picked randomly out of the ObjectPicker, with heavier weighted objects having a higher chance of being selected.

The pick() method is designed to be recursive and provide \$O(\log n)\$ speed. What I'm not sure about is the defaulting that's currently in place. I'm wondering if there is a better way to handle it. However, I would appreciate any feedback you are willing to provide beyond the specific.

```
class ObjectPicker:
"""
An object that stores other objects with weights, and is able to
pick out one stored object, with a higher chance for higher weighted
objects.
"""

def __init__(self):
"""
Initialize the Object Picker object. Create the instance variable
bucket for storing objects.
"""
self.bucket = []

def add(self, item, weight=1):
"""
Add an item to the bucket, with given weight.
:param item: Object to be stored.
:param weight: Weight given to the object to determine odds.
"""
if self.bucket: # If there is anything in the bucket.
total = self.bucket[-1][-1] # Get the end number of the last item.
else: # If the bucket is empty.
total = 0 # End number is 0.

# Object stored in Tuple.
# (Object, Weight, Starting Num, Ending Num)
self.bucket.append((item, weight, total, total + weight))

def pick(self, choice=-1, storage=None):
"""
Pick an object from the bucket recursively,
taking weight into account.
:param choice: Number of choice.
:param storage: Storage Object to choose from.
"""
if not storage: # If no storage is given.
if not self.bucket: # If bucket is empty.
return None # Return None.
else:
storage = self.bucket # Storage is bucket
total = storage[-1]

Solution

doc strings

Good. I like seeing doc strings. But, let's take a closer look.

class ObjectPicker:
    """
    An object that stores other objects with weights, and is able to
    pick out one stored object, with a higher chance for higher weighted
    objects.
    """


Ok, that is a lot of text. Can we separate this out into two sections with a short title?

class ObjectPicker:
    """
    Choose a random element taking weights into account.

    Elements with a higher weight have a higher change of being chosen.
    """


On to the init.

def __init__(self):
        """
        Initialize the Object Picker object. Create the instance variable
        bucket for storing objects.
        """
        ...


There are two things wrong with it. It is telling me the following:

  • It initialises the Object Picker object: That's what __init__ is supposed to do. Good. But that does not really warrant mentioning. So you can leave that portion out.



  • It creates an instance variable bucket for storing objects: That's an implementation detail. Where and when the bucket variable is initialized does not really matter. That it creates a variable bucket instead of storage or something else does not matter. Again, leave that out.



So we move to

def __init__(self):
        ...


(That is, remove the doc string.)

add and pick can be given similar treatments. Importantly, add an extra empty line between the documentation, and the parameter description.

Comments

Look at the following code

if self.bucket:  # If there is anything in the bucket.
    total = self.bucket[-1][-1]  # Get the end number of the last item.
else:  # If the bucket is empty.
    total = 0  # End number is 0.


The comments for the if and else are really trivial. You can easily drop them. Please do.

Consider what your comments add to the code. If they detail how it's implemented, try very hard to find a way to remove them without reducing the clarity. The following is just as clear.

if self.bucket:
    total = self.bucket[-1][-1]  # Get the end number of the last item.
else:
    total = 0


You notice I left 1 comment in. That's because of the end number. If you follow the namedtuple advice below, you can replace that line with

total = self.bucket[-1].end


(without the comment). The code is just as clear.

Excessive commenting in general

Another comment I'd just like to point out. Somewhere in the code I see

return None  # Return None.


As an outsider, comments like this make me believe that you are a beginner in Python, or even programming in general. Probably even following a class where the lecturer dictates comments like that.

Commenting like this screams 'I know nothing'. However, when I read the code without the comments, it's very readable, and only a few comments remain that are actually necessary. Actually, just two:

# Get the end number of the last item.


and

# Start binary search in middle of storage object.


To remove them, you'd need to refactor your code. And later on I'll give you the ingredients needed in a bit more detail.

You might want to leave in some more comments, depending on how comfortable you are, but make sure that the comments add value. Commonly, that's done by explaining why a piece of code is written, not what it does.

Let me state the following a bit more clearly. It is the excessive commenting that makes me doubt your experience. Reading your code (and explanation) tells me a different story: You know how to handle a binary search (and from what I can see, correctly), you know something about complexity theory. This is somebody to be reckoned with.

Algorithm

Storing of elements.

You use a tuple to store the elements, leading to code such as

total = self.bucket[-1][-1]


and

start, end = storage[index][2:]


The tuple is always the same size, and the same form. So, why not make it a namedtuple? Put the following import at the top of your file:

from collections import namedtuple

RangedElement = namedtuple('RangedElement', ['value', 'length', 'start', 'end'])


(I cheated a bit, I moved from weights to lengths, because it somewhat makes more sense when talking about start and end.)

Then, instead of

self.bucket.append((item, weight, total, total + weight))


write

self.bucket.append(RangedElement(item, weight, total, total + weight))


Yes, it's more typing. But, instead of

total = storage[-1][-1]


you can now write

total = storage[-1].end


Which is much better.

Getting the (random) element.

Defaulting

You already mentioned you were not sure about the defaulting. Important should be the question 'why the defaulting'?

From what I can see, the only reason for the defaulting is the recursive algorithm you use, which is a recursive bisection algorithm (or binary search algorithm). It would be good to separate the bisecting from the choosing.

```
def _bisect

Code Snippets

class ObjectPicker:
    """
    An object that stores other objects with weights, and is able to
    pick out one stored object, with a higher chance for higher weighted
    objects.
    """
class ObjectPicker:
    """
    Choose a random element taking weights into account.

    Elements with a higher weight have a higher change of being chosen.
    """
def __init__(self):
        """
        Initialize the Object Picker object. Create the instance variable
        bucket for storing objects.
        """
        ...
def __init__(self):
        ...
if self.bucket:  # If there is anything in the bucket.
    total = self.bucket[-1][-1]  # Get the end number of the last item.
else:  # If the bucket is empty.
    total = 0  # End number is 0.

Context

StackExchange Code Review Q#121907, answer score: 6

Revisions (0)

No revisions yet.