patternpythonMinor

Probabilistically pick a candidate

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

pickprobabilisticallycandidate

Problem

I have a list of candidates which looks like this:

candidates = [
    {
        'name': 'John',
        'rank': 13
    },
    {
        'name': 'Joe',
        'rank': 8
    },
    {
        'name': 'Averell',
        'rank': 5
    },
    {
        'name': 'William',
        'rank': 2
    }
]

What I want to do is semi-randomly pick one of these candidates, based on its squared rank. So that a candidate A having a rank twice as big as B, will have 4 times more chances to be picked than B.

Here is a naive implementation of the idea I had to solve this problem:

def pick_candidate(candidates):
    # initiates a global probability space
    prob_space = []

    # bring the max rank to 20
    # to avoid useless CPU burning
    # while keeping a good granularity -
    # basically, just bring the top rank to 20
    # and the rest will be divided proportionally
    rate = candidates[0]['rank'] / 20

    # fills the probability space with the indexes
    # of 'candidates', so that prob_space looks like:
    # [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
    # if candidates[0] has rank 3 and candidates[1] has rank 2
    for i, c in enumerate(candidates):
        rank = c['rank'] / rate
        for j in range(int(rank*rank)):
            prob_space.append(i)

    # picks a random index from the probability space
    picked_prob_space_index = random.randint(0, len(prob_space)-1)

    # retrieves the matching candidate
    picked_candidate_index = prob_space[picked_prob_space_index]

    return candidates[picked_candidate_index]

The questions I'm thinking of, concerning the above code, are:

Concept: Is the core principle of the algorithm (the idea of building prob_space, etc) overkill and solvable more easily in other ways or with builtins

Implementation: This core principle put aside, what do you think of the implementation? Would you think of a better, cleaner way to write it?

Solution

Concept

imho you don't need build prob_space or normalize by max rank == 20

-
you only need

John: 1 ... 13^2
Joe:  13^2+1 ... 13^2+8^2
Averell: 13^2+8^2+1 ... 13^2+8^2+5^2
William: 13^2+8^2+5^2 ... 13^2+8^2+5^2+2^2
and choice a man having ranint(1, 13^2+8^2+5^2+2^2)

transform list [132, 82, 52, ...] to [132, 132+82, 132+82+5**2, ...] we can by accumulate

searching in this sorted list [132, 132+82, 132+82+52, ...] we can by binary search algorithm implemented in bisect module

-
the algorithm:

from collections import namedtuple
from itertools import accumulate
from random import randint
from bisect import bisect_left

Candidate = namedtuple('Candidate', 'name rank')

def pick_random_candidate(candidates):
    assert (len(candidates) > 0) # note: probably we want return None with empty list?
    squared_ranks_accumulated = list(accumulate(c.rank ** 2 for c in candidates))
    random_pick = randint(1, squared_ranks_accumulated[-1])
    return candidates[bisect_left(squared_ranks_accumulated, random_pick)]

if __name__ == "__main__":
    candidates = [Candidate(*t) for t in [('John', 13), ('Joe', 8), ('Averell', 5), ('William', 2)]]
    random_candidate = pick_random_candidate(candidates)
    print(random_candidate)

Implementation

better use from random import randint instead of import random

if you have randint(0, x - 1) you can replace it by randrange(x)

constants like 20 should be defined as variable, e.g. top_rank = 20 or even def pick_candidate(candidates, top_rank=20). Reason: if someone change 20 to 21 in next three months there are high probability that don't change comment (so better when comment explains top_rank not 20)

you can often define structure Candidate as namedtuple (dict is better only sometimes if have some dynamic fields to store)

what's with the empty list of candidates (exception KeyError is expected or something else should be?)

Code Snippets

John: 1 ... 13^2
Joe:  13^2+1 ... 13^2+8^2
Averell: 13^2+8^2+1 ... 13^2+8^2+5^2
William: 13^2+8^2+5^2 ... 13^2+8^2+5^2+2^2
and choice a man having ranint(1, 13^2+8^2+5^2+2^2)

from collections import namedtuple
from itertools import accumulate
from random import randint
from bisect import bisect_left

Candidate = namedtuple('Candidate', 'name rank')


def pick_random_candidate(candidates):
    assert (len(candidates) > 0) # note: probably we want return None with empty list?
    squared_ranks_accumulated = list(accumulate(c.rank ** 2 for c in candidates))
    random_pick = randint(1, squared_ranks_accumulated[-1])
    return candidates[bisect_left(squared_ranks_accumulated, random_pick)]


if __name__ == "__main__":
    candidates = [Candidate(*t) for t in [('John', 13), ('Joe', 8), ('Averell', 5), ('William', 2)]]
    random_candidate = pick_random_candidate(candidates)
    print(random_candidate)

Context

StackExchange Code Review Q#127805, answer score: 4

Revisions (0)

No revisions yet.