How to do a simple Gaussian mixture sampling and PDF plotting with NumPy/SciPy?

2024/10/1 5:39:50

I add three normal distributions to obtain a new distribution, as shown below. How can I sample from this distribution in Python?

import matplotlib.pyplot as plt
import scipy.stats as ss
import numpy as np

x = np.linspace(0, 10, 1000)
y1 = [ss.norm.pdf(v, loc=5, scale=1) for v in x]
y2 = [ss.norm.pdf(v, loc=1, scale=1.3) for v in x]
y3 = [ss.norm.pdf(v, loc=9, scale=1.3) for v in x]
y = np.sum([y1, y2, y3], axis=0)/3

plt.plot(x, y, '-')
plt.xlabel('$x$')
plt.ylabel('$P(x)$')

BTW, is there a better way to plot such a probability distribution?

Answer

It seems that you're asking two questions: how do I sample from a distribution and how do I plot the PDF?

Assuming you're trying to sample from the mixture of the three normal distributions shown in your code, the following code snippet performs this kind of sampling in a naïve, straightforward way as a proof of concept.

Basically, the idea is to

  1. Choose an index i from the component indices, i.e. 0, 1, 2, ..., according to the components' probability weights.
  2. Having chosen i, select the corresponding distribution and obtain a sample point from it.
  3. Continue from 1 until enough sample points are collected.
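
The three steps above can also be sketched in vectorized form. This is only a minimal sketch, assuming the component parameters from the question; the helper name sample_mixture and the use of np.random.default_rng are illustrative choices, not part of the original answer.

```python
import numpy as np

def sample_mixture(means, sds, weights, n, seed=None):
    """Draw n points from a 1-D Gaussian mixture (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    # Step 1: pick a component index for every sample, according to the weights.
    idx = rng.choice(len(weights), size=n, p=weights)
    # Step 2: draw each sample from the component chosen for it.
    # Step 3 (repeat until enough points) is covered by drawing all n at once.
    return rng.normal(loc=means[idx], scale=sds[idx])

# Parameters taken from the question: N(5, 1), N(1, 1.3), N(9, 1.3), equal weights.
samples = sample_mixture([5, 1, 9], [1, 1.3, 1.3], [1/3, 1/3, 1/3], n=10000, seed=0)
```

Drawing all indices first and then indexing the parameter arrays avoids a Python-level loop over the samples, which tends to matter once n gets large.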

However, to plot the PDF, you don't really need a sample in this case, because the theoretical solution is quite easy. In the more general case, the PDF can be approximated by a histogram from the sample.
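
As a rough illustration of the histogram approach, the sketch below builds a normalized histogram from a sample and compares it with the theoretical mixture PDF at the bin centers. The bin count, sample size, and the normal_pdf helper are assumptions made for this example; ss.norm.pdf from SciPy would serve equally well for the density.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2), written out so the sketch needs only NumPy.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
# Same three components as in the question: (mean, standard deviation).
params = np.array([[5, 1], [1, 1.3], [9, 1.3]])
idx = rng.choice(len(params), size=50000)  # equal weights
sample = rng.normal(params[idx, 0], params[idx, 1])

# density=True scales the bars so that their total area is 1.
density, edges = np.histogram(sample, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Theoretical mixture PDF at the bin centers, for comparison.
pdf = sum(normal_pdf(centers, mu, sigma) for mu, sigma in params) / len(params)
max_abs_error = np.max(np.abs(density - pdf))
```

The gap between the histogram and the exact PDF shrinks as the sample grows, but it also depends on the bin width, so the histogram is best treated as an approximation.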

The code below performs both sampling and PDF-plotting using the theoretical PDF.

import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt

# Set-up.
n = 10000
np.random.seed(0x5eed)
# Parameters of the mixture components: (mean, standard deviation)
norm_params = np.array([[5, 1], [1, 1.3], [9, 1.3]])
n_components = norm_params.shape[0]
# Weight of each component, in this case all of them are 1/3
weights = np.ones(n_components, dtype=np.float64) / 3.0
# A stream of indices from which to choose the component
mixture_idx = np.random.choice(len(weights), size=n, replace=True, p=weights)
# y is the mixture sample
y = np.fromiter((ss.norm.rvs(*(norm_params[i])) for i in mixture_idx),
                dtype=np.float64)

# Theoretical PDF plotting -- generate the x and y plotting positions
xs = np.linspace(y.min(), y.max(), 200)
ys = np.zeros_like(xs)

for (loc, scale), w in zip(norm_params, weights):
    ys += ss.norm.pdf(xs, loc=loc, scale=scale) * w

plt.plot(xs, ys)
# density=True replaces the removed normed=True argument
plt.hist(y, density=True, bins="fd")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.show()

Overlaid image of two PDFs

