Python: Selecting numbers with associated probabilities [duplicate]

2024/10/18 14:48:43

Possible Duplicates:
Random weighted choice
Generate random numbers with a given (numerical) distribution

I have a list of lists which contains a series of numbers and their associated probabilities.

prob_list = [[1, 0.5], [2, 0.25], [3, 0.05], [4, 0.01], [5, 0.09], [6, 0.1]]

For example, in prob_list[0] the number 1 has a probability of 0.5 associated with it, so you would expect 1 to show up 50% of the time.

How do I add weight to the numbers when I select them?

NOTE: the number of entries in the list can vary from 6 to 100.


EDIT

In the list I have 6 numbers with their associated probabilities. I want to select two numbers based on their probability.

No number can be selected twice. If "2" is selected, it cannot be selected again.

Answer

I'm going to assume the probabilities all add up to 1. If they don't, you're going to have to scale them accordingly so that they do.
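A minimal sketch of that scaling step, assuming the same [number, probability] layout as prob_list in the question. The helper name normalize and the sample weights are hypothetical and not part of the original answer.

```python
def normalize(prob_list):
    """Return a copy of prob_list whose probabilities sum to 1."""
    total = sum(p for _, p in prob_list)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return [[num, p / total] for num, p in prob_list]

# Hypothetical weights that sum to 4 rather than 1.
weights = [[1, 2], [2, 1], [3, 1]]
scaled = normalize(weights)
print(scaled)  # [[1, 0.5], [2, 0.25], [3, 0.25]]
```

The original list is left unchanged, so the unscaled weights stay available if they are needed later.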

First generate a uniform random number in [0, 1) using random.random(). Then pass through the list, keeping a running sum of the probabilities. The first time the sum reaches or exceeds the random number, return the associated number. This way, if the random number falls within the range (0.5, 0.75] in your example, 2 will be returned, giving it the required 0.25 probability of being returned.

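import random
import sys

def pick_random(prob_list):
    # Draw r uniformly from [0, 1) and walk the running sum of the probabilities.
    r, s = random.random(), 0
    for num in prob_list:
        s += num[1]
        if s >= r:
            return num[0]
    # Reached only if the probabilities sum to less than r.
    print("Error: shouldn't get here", file=sys.stderr)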

Here's a test showing it works:

import collections

count = collections.defaultdict(int)
# Tally 10,000 draws, then print each number's observed frequency.
for i in range(10000):
    count[pick_random(prob_list)] += 1
for n in sorted(count):
    print(n, count[n] / 10000.0)

which outputs:

1 0.498
2 0.25
3 0.0515
4 0.0099
5 0.0899
6 0.1007

EDIT: Just saw the edit in the question. If you want to select two distinct numbers, you can repeat the above until your second number chosen is distinct. But this will be terribly slow if one number has a very high (e.g. 0.99999999) probability associated with it. In this case, you could remove the first number from the list and rescale the probabilities so that they sum to 1 before selecting the second number.
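The remove-and-rescale approach could be sketched as follows. This is an illustration under stated assumptions, not the answer's own code: the name pick_two_distinct is hypothetical, the list is assumed to contain at least two numbers with nonzero probability, and pick_random is restated here in Python 3 with a fallback to the last entry in case floating-point rounding leaves the running sum just below the random number.

```python
import random

def pick_random(prob_list):
    """Pick one number; the probabilities are assumed to sum to 1."""
    r, s = random.random(), 0.0
    for num, p in prob_list:
        s += p
        if s >= r:
            return num
    # Rounding can leave s slightly below r; fall back to the last entry.
    return prob_list[-1][0]

def pick_two_distinct(prob_list):
    """Pick two different numbers (hypothetical helper, not from the answer)."""
    first = pick_random(prob_list)
    # Remove the first pick, then rescale what is left so it sums to 1 again.
    remaining = [[num, p] for num, p in prob_list if num != first]
    total = sum(p for _, p in remaining)
    if total <= 0:
        raise ValueError("need at least two numbers with nonzero probability")
    rescaled = [[num, p / total] for num, p in remaining]
    second = pick_random(rescaled)
    return first, second

prob_list = [[1, 0.5], [2, 0.25], [3, 0.05], [4, 0.01], [5, 0.09], [6, 0.1]]
print(pick_two_distinct(prob_list))
```

Because the first number is removed before the second draw, the function always needs exactly two draws, even when one number has a probability close to 1.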
