Python - SciPy Kernel Estimation Example - Density 1

2024/10/16 0:20:47

I'm currently working through this SciPy example on Kernel Estimation, in particular the one labelled "Univariate estimation". Instead of generating random data, I am using asset returns. However, my 2nd estimation (and even the simple normal PDF I plot for comparison) shows a density that peaks at 20, which makes no sense... My code is as follows:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# `data` is assumed to be a DataFrame with 'actual' and 'log_moves' columns
x1 = np.array(data['actual'].values)[1:]
xs1 = np.linspace(x1.min()-1,x1.max()+1,len(x1))
std1 = x1.std()
mean1 = x1.mean()

x2 = np.array(data['log_moves'].values)[1:]
xs2 = np.linspace(x2.min()-.01,x2.max()+.01,len(x2))
#xs2 = np.linspace(x2.min()-1,x2.max()+2,len(x2))
std2 = x2.std()
mean2 = x2.mean()

kde1 = stats.gaussian_kde(x1)  # actuals
kde2 = stats.gaussian_kde(x1, bw_method='silverman')

kde3 = stats.gaussian_kde(x2)  # log returns
kde4 = stats.gaussian_kde(x2, bw_method='silverman')

fig = plt.figure(figsize=(10,8))
ax1 = fig.add_subplot(211)
ax1.plot(x1, np.zeros(x1.shape), 'b+', ms=12)  # rug plot
ax1.plot(xs1, kde1(xs1), 'k-', label="Scott's Rule")
ax1.plot(xs1, kde2(xs1), 'b-', label="Silverman's Rule")
ax1.plot(xs1, stats.norm.pdf(xs1,mean1,std1), 'r--', label="Normal PDF")

ax1.set_xlabel('x')
ax1.set_ylabel('Density')
ax1.set_title("Absolute (top) and Returns (bottom) distributions")
ax1.legend(loc=1)

ax2 = fig.add_subplot(212)
ax2.plot(x2, np.zeros(x2.shape), 'b+', ms=12)  # rug plot
ax2.plot(xs2, kde3(xs2), 'k-', label="Scott's Rule")
ax2.plot(xs2, kde4(xs2), 'b-', label="Silverman's Rule")
ax2.plot(xs2, stats.norm.pdf(xs2,mean2,std2), 'r--', label="Normal PDF")

ax2.set_xlabel('x')
ax2.set_ylabel('Density')

plt.show()

My result: [image: results]

And for reference, the first and second moments of the data going in:

print(std1)
print(mean1)
print(std2)
print(mean2)

4.66416718334
0.0561365678347
0.0219996729055
0.00027330546845

Further, if I change the 2nd chart to plot a lognormal PDF, I get a flat line (which, if the y-axis were scaled like the top chart, I'm sure would show the distribution I'd expect).

Answer

The result of a kernel density estimate is a probability density. While probability can't be larger than 1, a density can.
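As a rough check on the numbers in the question: the peak of a normal density is 1/(sigma*sqrt(2*pi)), so a small standard deviation forces a tall curve. The sketch below plugs in the two standard deviations printed in the question; the function and variable names are only for illustration.

```python
import math
from scipy import stats

# Standard deviations as printed in the question.
std_actual = 4.66416718334
std_log = 0.0219996729055

def normal_peak(sigma):
    """Height of a normal PDF at its mean: 1 / (sigma * sqrt(2 * pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

peak_actual = normal_peak(std_actual)  # well below 1
peak_log = normal_peak(std_log)        # roughly 18

# Cross-check against SciPy's normal PDF evaluated at the mean.
scipy_peak_log = stats.norm.pdf(0.0, loc=0.0, scale=std_log)

print(peak_actual, peak_log, scipy_peak_log)
```

A peak of about 18 for the log returns is therefore in line with the "peaks at 20" plot, since the KDE curves and the fitted normal PDF should have broadly similar heights.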

Given a probability density curve, you can find the probability within a range (x_1, x_2) by integrating the probability density in that range. Judging by eye, the integral under both your curves is approximately 1, so the output appears to be correct.
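If you would rather not judge by eye, you can integrate the estimate numerically. This is a minimal sketch, assuming synthetic normally distributed data as a stand-in for your log returns (the seed, sample size, and grid padding are arbitrary choices, not taken from the question). It compares a plain Riemann sum with `gaussian_kde.integrate_box_1d`.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the log returns (assumption: scale ~0.022 as in the question).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0003, scale=0.022, size=1000)

kde = stats.gaussian_kde(returns)

# Evaluate the density on a grid that extends a little past the data.
grid = np.linspace(returns.min() - 0.05, returns.max() + 0.05, 2001)
density = kde(grid)

peak = density.max()  # much larger than 1 for such narrow data

# Riemann-sum approximation of the area under the curve.
area_riemann = float(np.sum(density) * (grid[1] - grid[0]))

# The KDE object can also integrate itself over an interval.
area_box = kde.integrate_box_1d(-1.0, 1.0)

print(peak, area_riemann, area_box)
```

Both areas should come out very close to 1 even though the peak is far above 1. The same `integrate_box_1d` call with narrower bounds gives the probability of a return falling in that range.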
