patternpythonMinor
Probabilistically pick a candidate
Viewed 0 times
pickprobabilisticallycandidate
Problem
I have a list of candidates which looks like this:
What I want to do is semi-randomly pick one of these candidates, based on its squared rank. So that a candidate
Here is a naive implementation of the idea I had to solve this problem:
The questions I'm thinking of, concerning the above code, are:
candidates = [
{
'name': 'John',
'rank': 13
},
{
'name': 'Joe',
'rank': 8
},
{
'name': 'Averell',
'rank': 5
},
{
'name': 'William',
'rank': 2
}
]What I want to do is semi-randomly pick one of these candidates, based on its squared rank. So that a candidate
A having a rank twice as big as B, will have 4 times more chances to be picked than B.Here is a naive implementation of the idea I had to solve this problem:
def pick_candidate(candidates):
# initiates a global probability space
prob_space = []
# bring the max rank to 20
# to avoid useless CPU burning
# while keeping a good granularity -
# basically, just bring the top rank to 20
# and the rest will be divided proportionally
rate = candidates[0]['rank'] / 20
# fills the probability space with the indexes
# of 'candidates', so that prob_space looks like:
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# if candidates[0] has rank 3 and candidates[1] has rank 2
for i, c in enumerate(candidates):
rank = c['rank'] / rate
for j in range(int(rank*rank)):
prob_space.append(i)
# picks a random index from the probability space
picked_prob_space_index = random.randint(0, len(prob_space)-1)
# retrieves the matching candidate
picked_candidate_index = prob_space[picked_prob_space_index]
return candidates[picked_candidate_index]The questions I'm thinking of, concerning the above code, are:
- Concept: Is the core principle of the algorithm (the idea of building
prob_space, etc) overkill and solvable more easily in other ways or with builtins
- Implementation: This core principle put aside, what do you think of the implementation? Would you think of a better, cleaner way to write it?
Solution
Concept
-
you only need
-
the algorithm:
Implementation
- imho you don't need build
prob_spaceor normalize by max rank == 20
-
you only need
John: 1 ... 13^2
Joe: 13^2+1 ... 13^2+8^2
Averell: 13^2+8^2+1 ... 13^2+8^2+5^2
William: 13^2+8^2+5^2 ... 13^2+8^2+5^2+2^2
and choice a man having ranint(1, 13^2+8^2+5^2+2^2)- transform list
[132, 82, 52, ...]to[132, 132+82, 132+82+5**2, ...]we can by accumulate
- searching in this sorted list
[132, 132+82, 132+82+52, ...]we can by binary search algorithm implemented in bisect module
-
the algorithm:
from collections import namedtuple
from itertools import accumulate
from random import randint
from bisect import bisect_left
Candidate = namedtuple('Candidate', 'name rank')
def pick_random_candidate(candidates):
assert (len(candidates) > 0) # note: probably we want return None with empty list?
squared_ranks_accumulated = list(accumulate(c.rank ** 2 for c in candidates))
random_pick = randint(1, squared_ranks_accumulated[-1])
return candidates[bisect_left(squared_ranks_accumulated, random_pick)]
if __name__ == "__main__":
candidates = [Candidate(*t) for t in [('John', 13), ('Joe', 8), ('Averell', 5), ('William', 2)]]
random_candidate = pick_random_candidate(candidates)
print(random_candidate)Implementation
- better use
from random import randintinstead ofimport random
- if you have
randint(0, x - 1)you can replace it byrandrange(x)
- constants like 20 should be defined as variable, e.g.
top_rank = 20or evendef pick_candidate(candidates, top_rank=20). Reason: if someone change 20 to 21 in next three months there are high probability that don't change comment (so better when comment explainstop_ranknot20)
- you can often define structure Candidate as
namedtuple(dictis better only sometimes if have some dynamic fields to store)
- what's with the empty list of
candidates(exceptionKeyErroris expected or something else should be?)
Code Snippets
John: 1 ... 13^2
Joe: 13^2+1 ... 13^2+8^2
Averell: 13^2+8^2+1 ... 13^2+8^2+5^2
William: 13^2+8^2+5^2 ... 13^2+8^2+5^2+2^2
and choice a man having ranint(1, 13^2+8^2+5^2+2^2)from collections import namedtuple
from itertools import accumulate
from random import randint
from bisect import bisect_left
Candidate = namedtuple('Candidate', 'name rank')
def pick_random_candidate(candidates):
assert (len(candidates) > 0) # note: probably we want return None with empty list?
squared_ranks_accumulated = list(accumulate(c.rank ** 2 for c in candidates))
random_pick = randint(1, squared_ranks_accumulated[-1])
return candidates[bisect_left(squared_ranks_accumulated, random_pick)]
if __name__ == "__main__":
candidates = [Candidate(*t) for t in [('John', 13), ('Joe', 8), ('Averell', 5), ('William', 2)]]
random_candidate = pick_random_candidate(candidates)
print(random_candidate)Context
StackExchange Code Review Q#127805, answer score: 4
Revisions (0)
No revisions yet.