HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Checking if a random number exist within an interval efficiently

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
randomefficientlynumbercheckingintervalexistwithin

Problem

I have a problem where you have a bunch of probabilities associated with a number.

In this case:

  • any number between 0 - 0.3 should return 0,



  • any number between 0.3 - 0.7 should return 1,



  • any number between 0.7 - 1.0 should return 2



Is there a better way I can achieve this?

import random

class ProbChecker(object):

    def __init__(self, nums, prob):
        self._nums = nums
        self._prob = prob

    def next_val(self):
        if sum(self._prob) != 1.0:
            raise ValueError("Probabilities does not sum up to 1.0")

        accumulate = [sum(self._prob[: idx+1]) for idx in range(len(self._prob))]
        randomVal = random.random()

        for idx, x in enumerate(accumulate):
            if randomVal < x :
                return randomVal, self._nums[idx]

if  __name__ =='__main__':
    random_nums = [0, 1, 2]
    probabilities = [0.3, 0.4, 0.3]
    gen = ProbChecker(random_nums, probabilities)
    for i in xrange(100):
        print gen.next_val()

# EXAMPLE OUTPUT
#(0.9771170760903612, 2)
#(0.738653579103415, 2)
#(0.33448331523062824, 1)
#(0.6458688675258738, 1)
#(0.7062504910283919, 2)
#(0.4159743894291702, 1)
#(0.2636825176117218, 0)

Solution

What you are basically asking is for a way to generate values with different weights.

In this case, you want 0 to be generated 30% of the time, 1 to be generated 40% of the time and 2 to be generated 30% of the time. Rather than checking intervals, it is much easier to generate a sequence that has elements that correspond to the probabilities.

For example, we could replicate the probabilities needed by having a 10-element list like so: [0, 0, 0, 1, 1, 1, 1, 2, 2, 2] and then pick 1 random element from it. How did I get 10? 10 is the minimal number of elements that evenly divides 30, 40 and 30 (the probabilities) multiplied by the sum of the weights (30 + 40 + 30)/10 = 10.

Another example is if we wanted 0 to occur 1% of the time and 1 to occur 99% of the time, then we could pick one element from [0, 1, ..., 1] where there are 99 1's. The 100 elements comes from (1 + 99 (sum of weights))/ 1 (divisor, or gcd to be precise) = 100.

Now assuming that all your probabilities are integral percentages (1%, 2%, ... 100%). Then you can always use 100 elements. Then for 5% you repeat the element 5x, 12x for 12% and so on.

Now let's do some code:

Lets say you build something like this: weighted_choices = [(0, 30), (1, 40), (2, 30)], where the 0, 1, 2 are the values and 30, 40, 30 are the percentage weights.

Then the code to select a item based on the weight is like so:

sample = [val for val, cnt in choices for i in range(cnt)]
random.choice(sample)


Note: This is code is taken straight from the random module documentation

If you want a simple, but slower version:

sample = []
for value, weight in choices:
    for i in range(0, weight):
        sample.append(value)
random.choice(sample)


If you want to read up on an even more general variation, go look at random's documentation

Code Snippets

sample = [val for val, cnt in choices for i in range(cnt)]
random.choice(sample)
sample = []
for value, weight in choices:
    for i in range(0, weight):
        sample.append(value)
random.choice(sample)

Context

StackExchange Code Review Q#58945, answer score: 7

Revisions (0)

No revisions yet.