I have an array of element probabilities, let's say [0.1, 0.2, 0.5, 0.2]. The array sums up to 1.0.
Using plain Python or numpy, I want to draw elements in proportion to their probability: the first element about 10% of the time, the second 20%, the third 50%, etc. The "draw" should return the index of the element drawn.
I came up with this:
def draw(probs):
    cumsum = numpy.cumsum(probs / sum(probs))  # sum up to 1.0, just in case
    return len(numpy.where(numpy.random.rand() >= cumsum)[0])
It works, but it's too convoluted; there must be a better way. Thanks.
import numpy as np

def random_pick(choices, probs):
    '''
    >>> a = ['Hit', 'Out']
    >>> b = [.3, .7]
    >>> random_pick(a,b)
    '''
    cutoffs = np.cumsum(probs)
    idx = cutoffs.searchsorted(np.random.uniform(0, cutoffs[-1]))
    return choices[idx]
How it works:
In [22]: import numpy as np
In [23]: probs = [0.1, 0.2, 0.5, 0.2]
Compute the cumulative sum:
In [24]: cutoffs = np.cumsum(probs)
In [25]: cutoffs
Out[25]: array([ 0.1, 0.3, 0.8, 1. ])
Compute a uniformly distributed random number in the half-open interval [0, cutoffs[-1]):
In [26]: np.random.uniform(0, cutoffs[-1])
Out[26]: 0.9723114393023948
Use searchsorted to find the index where the random number would be inserted into cutoffs:
In [27]: cutoffs.searchsorted(0.9723114393023948)
Out[27]: 3
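To see how searchsorted maps random numbers onto indices, here is a small sketch that looks up several values at once. The values in `samples` are hand-picked for illustration (they are not from the session above), one per probability band:

```python
import numpy as np

cutoffs = np.cumsum([0.1, 0.2, 0.5, 0.2])  # roughly [0.1, 0.3, 0.8, 1.0]

# Hypothetical sample values, one inside each band between cutoffs.
samples = np.array([0.05, 0.25, 0.5, 0.97])

# searchsorted accepts an array and returns one insertion index per value.
indices = cutoffs.searchsorted(samples)
print(indices)  # [0 1 2 3]
```

Each band's width equals the corresponding probability, so a uniform random number lands in band i about probs[i] of the time.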
Return choices[idx], where idx is the index returned by searchsorted.
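The steps above can also be applied to many draws at once. The following is a minimal sketch, not part of the original answer: the name `draw_indices`, the `size` and `rng` parameters, and the use of `side='right'` with a final clip are my own assumptions, chosen so that zero-probability entries are not picked and a rounding edge case cannot produce an out-of-range index.

```python
import numpy as np

def draw_indices(probs, size, rng=None):
    """Draw `size` indices; index i is drawn with probability probs[i] / sum(probs)."""
    rng = np.random.default_rng() if rng is None else rng
    cutoffs = np.cumsum(np.asarray(probs, dtype=float))
    # One uniform number per draw, in [0, total); no normalization needed.
    r = rng.uniform(0.0, cutoffs[-1], size=size)
    # side='right' returns the first index whose cutoff is strictly greater than r.
    idx = np.searchsorted(cutoffs, r, side='right')
    # Guard against r rounding up to the total, which would give len(cutoffs).
    return np.minimum(idx, len(cutoffs) - 1)

rng = np.random.default_rng(0)
idx = draw_indices([0.1, 0.2, 0.5, 0.2], size=100_000, rng=rng)

# Empirical frequencies should be close to the requested probabilities.
freqs = np.bincount(idx, minlength=4) / idx.size
print(freqs)
```

If you don't need to see the mechanics, NumPy's `Generator.choice` (and the older `np.random.choice`) accepts a `p` argument for the same purpose, and the standard library has `random.choices` with a `weights` argument.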