Question 1

I'm having trouble doing something as relatively simple as:

Draw N samples from a Gaussian with some mean and variance
Exponentiate those N samples (np.exp in the code below), so that their logs are Gaussian
Fit a lognormal (using stats.lognorm.fit)
Spit out a nice and smooth lognormal pdf without inf values (using stats.lognorm.pdf)

Here's a small working example of the output I'm getting:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math
%matplotlib inline

def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    shape, loc, scale = stats.lognorm.fit(logData, floc=mu)
    return stats.lognorm.pdf(logData, shape, loc, scale)

plt.plot(lognormDrive(37,0.8))

And as you might notice, the plot makes absolutely no sense.

Any ideas?

I've followed these posts: POST1 POST2

Thanks in advance!

Elaboration: I am building a small script that will

Take raw data and fit a kernel distribution (empirical dist.)
Assume different distributions given the mean and variance of the data. This would be a Gaussian and a lognormal
Plot those distributions together with the empirical dist. using interact
Calculate the Kullback-Leibler divergence between the different distributions when one turns the knob for the mean and variance (and skew eventually)
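
A rough sketch of the last step, under stated assumptions: the data here is synthetic (mu=1.0, sigma=0.5 are illustrative values only), the grid of 400 points is an arbitrary choice, and the divergence is a discretized approximation on that grid rather than an exact integral. It uses stats.gaussian_kde for the empirical density and stats.entropy for the divergence.

```python
import numpy as np
from scipy import stats

np.random.seed(1)
# Illustrative lognormal data (assumed values, not the real raw data)
data = np.exp(stats.norm.rvs(loc=1.0, scale=0.5, size=1000))

kde = stats.gaussian_kde(data)                       # empirical (kernel) density
shape, loc, scale = stats.lognorm.fit(data, floc=0)  # lognormal candidate
norm_mu, norm_sigma = stats.norm.fit(data)           # Gaussian candidate

# Evaluate all three densities on a common grid
grid = np.linspace(data.min(), data.max(), 400)
p = kde(grid)
q_lognorm = stats.lognorm.pdf(grid, shape, loc, scale)
q_norm = stats.norm.pdf(grid, norm_mu, norm_sigma)

# stats.entropy(p, q) normalizes both arrays and returns sum(p * log(p / q)),
# i.e. a discretized Kullback-Leibler divergence on this grid.
kl_lognorm = stats.entropy(p, q_lognorm)
kl_norm = stats.entropy(p, q_norm)
```

For lognormal data like this, kl_lognorm should come out smaller than kl_norm, since the Gaussian cannot follow the right-hand tail.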

Answer

In the call to lognorm.fit(), use floc=0, not floc=mu.

(The location parameter of the lognorm distribution simply translates the distribution. You almost never want to do that with the log-normal distribution.)
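
To make the role of the parameters concrete, here is a small check using illustrative values of mu and sigma (not taken from the question). With loc=0, the shape parameter is sigma and scale is exp(mu), which matches the textbook lognormal density; a nonzero loc only shifts the same curve along the x axis.

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, 0.5            # illustrative values, chosen for this sketch
x = np.linspace(0.1, 10, 50)

# scipy's parameterization: s = sigma, scale = exp(mu), loc = 0
pdf_scipy = stats.lognorm.pdf(x, sigma, loc=0, scale=np.exp(mu))

# Textbook lognormal density written out by hand
pdf_manual = (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
              / (x * sigma * np.sqrt(2 * np.pi)))

# A nonzero loc translates the distribution: the density at x + shift
# with loc=shift equals the density at x with loc=0.
shift = 2.0
pdf_shifted = stats.lognorm.pdf(x + shift, sigma, loc=shift, scale=np.exp(mu))
```

This is why fixing floc=mu in the fit makes little sense: it asks for a lognormal shifted to start at x=mu instead of one whose log has mean mu.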

See A lognormal distribution in python

By the way, you are plotting the PDF of the unsorted sample values, so the plot in the corrected script won't look much different. You might find it more useful to plot the PDF against the sorted values. Here's a modification of your script that creates a plot of the PDF using the sorted samples:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math

def lognormDrive(mu, variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    # Fix the location at 0 so that only shape (sigma) and scale (exp(mu)) are fitted.
    shape, loc, scale = stats.lognorm.fit(logData, floc=0)
    print("Estimated mu:", np.log(scale))
    print("Estimated var: ", shape**2)
    # Sort in place so the PDF is plotted against increasing x values.
    logData.sort()
    return logData, stats.lognorm.pdf(logData, shape, loc, scale)

x, y = lognormDrive(37, 0.8)
plt.plot(x, y)
plt.grid()
plt.show()

The script prints:

Estimated mu: 37.0347152587
Estimated var:  0.769897988163

and creates the following plot:

[plot: fitted lognormal PDF against the sorted sample values]
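
If a smoother curve is wanted, one option (a sketch, not part of the script above) is to evaluate the fitted PDF on an evenly spaced grid instead of at the sample points. The 1% to 99% quantile range and the 500 grid points are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

np.random.seed(1)
samples = np.exp(stats.norm.rvs(loc=37, scale=np.sqrt(0.8), size=1000))

# Same fit as in the script: location fixed at 0
shape, loc, scale = stats.lognorm.fit(samples, floc=0)

# Evenly spaced grid covering the central 98% of the fitted distribution
# (assumed range; widen or narrow the quantiles as needed)
lo = stats.lognorm.ppf(0.01, shape, loc, scale)
hi = stats.lognorm.ppf(0.99, shape, loc, scale)
grid = np.linspace(lo, hi, 500)

# PDF evaluated on the grid; finite and positive everywhere on this range
pdf = stats.lognorm.pdf(grid, shape, loc, scale)
```

Passing grid and pdf to plt.plot then gives a curve whose smoothness does not depend on where the random samples happened to fall.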

Fitting and Plotting Lognormal
