Is random.sample truly random?

2024/7/6 22:18:47

I have a list of 155k files. When I call random.sample(list, 100), the results are not identical to the previous sample, but they look similar.

Is there a better alternative to random.sample that returns a new list of 100 random files?

import glob
import random

folders = get_all_folders('/data/gazette-txt-files')

# get all files from all folders
def get_all_files():
    files = []
    for folder in folders:
        files.append(glob.glob("/data/gazette-txt-files/" + folder + "/*.txt"))
    # convert 2D list into 1D
    formatted_list = []
    for file in files:
        for f in file:
            formatted_list.append(f)
    # 200 random text files
    return random.sample(formatted_list, 200)

Answer

For purposes like randomly selecting elements from a list, random.sample suffices. It does not provide true randomness, and I'm unaware whether that is even theoretically possible.
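As a rough check of the claim above, the sketch below draws two independent samples from a made-up list of 155,000 paths and counts how many items they share. The path pattern and the overlap helper are hypothetical, invented only for illustration. With 100 picks out of 155,000, the expected overlap is about 100 * 100 / 155000, or roughly 0.06 items, so samples that "look similar" more likely reflect similar file names than repeated picks.

```python
import random

# Hypothetical stand-in for the 155k file paths from the question.
paths = [
    f"/data/gazette-txt-files/folder{i % 50}/file{i}.txt"
    for i in range(155_000)
]

def overlap(population, k):
    """Count the items shared by two independent samples of size k."""
    first = set(random.sample(population, k))
    second = set(random.sample(population, k))
    return len(first & second)

shared = overlap(paths, 100)
print(f"items shared by two samples of 100: {shared}")
```

Running this a few times should almost always print 0, and occasionally 1.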

By default, the random module uses a pseudorandom number generator (PRNG) called the Mersenne Twister (MT). It is suitable for applications such as simulations (and minor things like picking from a list of paths), but it shouldn't be used where security is a concern, because it is deterministic: anyone who knows its internal state can predict every value it will produce.
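A minimal sketch of what "deterministic" means here: two generators created with the same seed return the same sample. The seed value 42 and the items list are arbitrary choices for illustration.

```python
import random

items = list(range(1000))

# Two independent generator objects started from the same seed.
rng_a = random.Random(42)
rng_b = random.Random(42)

sample_a = rng_a.sample(items, 5)
sample_b = rng_b.sample(items, 5)

# Same seed, same internal state, same output.
print(sample_a == sample_b)  # True
```

Without an explicit seed, the module seeds itself from an operating system source (or the current time as a fallback), which is why ordinary calls to random.sample differ from run to run.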

This is why Python 3.6 introduced the secrets module with PEP 506. It uses random.SystemRandom (backed by os.urandom) by default and can produce cryptographically secure pseudorandom numbers.
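If a sample drawn from the operating system's randomness source is wanted, one option is to call sample on a SystemRandom instance. The sketch below assumes a hypothetical helper named secure_sample and a made-up list of file names. For picking files to inspect this is unnecessary, and it is slower than random.sample.

```python
import secrets

def secure_sample(population, k):
    """Return k unique items chosen with the OS randomness source."""
    # secrets.SystemRandom is the same class as random.SystemRandom.
    rng = secrets.SystemRandom()
    return rng.sample(population, k)

# Hypothetical file names, for illustration only.
files = [f"file{i}.txt" for i in range(1000)]
picked = secure_sample(files, 10)
print(picked)
```

A SystemRandom instance cannot be seeded, so its results are not reproducible.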

Of course, the bottom line is that whether you use a PRNG or a CSPRNG to generate your numbers, they are still going to be pseudorandom.
