weighted covariance matrix in numpy

2024/9/21 3:15:52

I want to compute the covariance C of n measurements of p quantities, where each individual quantity measurement is given its own weight. That is, my weight array W has the same shape as my quantity array Q (n by p). The native np.cov() function only supports weights given to individual measurements (i.e., a vector of length n).

I can initialize a p by p matrix and iterate, but if p is large, then it's a very slow process.

Since Q is known to have mean zero for each quantity (column of Q), the explicit formula for each element of C is

C[i,j] = np.sum(Q[:, i] * Q[:, j] * W[:, i] * W[:, j]) / np.sum(W[:, i] * W[:, j])

If I rearrange the numerator to be Q[:, i] * W[:, i] * Q[:, j] * W[:, j], it seems like I should be able to multiply and sum columns of Q * W, and then do the denominator similarly (except using W * W).

Is there a way to do this with np.einsum()?

For testing, let's define the following:

C = array([[ 1.  ,  0.1 ,  0.2 ], # set this beforehand, to test whether [ 0.1 ,  0.5 ,  0.15], # we get the correct result[ 0.2 ,  0.15,  0.75]])Q = array([[-0.6084634 ,  0.16656143, -1.04490324],[-1.51164337, -0.96403094, -2.37051952],[-0.32781346, -0.19616374, -1.32591578],[-0.88371729,  0.20877833, -0.52074272],[-0.67987913, -0.84458226,  0.02897935],[-2.01924756, -0.51877396, -0.68483981],[ 1.64600477,  0.67620595,  1.24559591],[ 0.82554885,  0.14884613, -0.15211434],[-0.88119527,  0.11663335, -0.31522598],[-0.14830668,  1.26906561, -0.49686309]])W = array([[ 1.01133857,  0.91962164,  1.01897898],[ 1.09467975,  0.91191381,  0.90150961],[ 0.96334661,  1.00759046,  1.01638749],[ 1.04827001,  0.95861001,  1.01248969],[ 0.91572506,  1.09388218,  1.03616461],[ 0.9418178 ,  1.07210878,  0.90431879],[ 1.0093642 ,  1.00408472,  1.07570172],[ 0.92203074,  1.00022631,  1.09705542],[ 0.99775598,  0.01000000,  0.94996408],[ 1.02996389,  1.01224303,  1.00331465]])
Answer

You can use the very efficient matrix-multiplication with np.dot -

QW = Q*W
C = QW.T.dot(QW)/W.T.dot(W)
https://en.xdnf.cn/q/72093.html

Related Q&A

How to implement biased random function?

My question is about random.choice function. As we know, when we run random.choice([apple,banana]), it will return either apple or banana with equal probabilities, what if I want to return biased resul…

Finding minimum value for each level of a multi-index dataframe

I have a DataFrame that looks like this:data a b 1 1 0.12 0.23 0.3 2 1 0.52 0.63 0.7and I want to find the minimum value for each level of a ignoring the b level, so as an output Im l…

Count occurrences of a list of substrings in a pyspark df column

I want to count the occurrences of list of substrings and create a column based on a column in the pyspark df which contains a long string.Input: ID History1 USA|UK|IND|DEN|MAL|SWE|AUS2…

What are screen units in tkinter?

I was reading the response in the link below and ran into screen units but I couldnt find what exactly was referred to by screen units in Jim Denneys response. I know they are not pixels. How do I use …

Python SUMPRODUCT of elements in nested list

I have two nested lists: a = [[1,2,3],[2,4,2]] b = [[5,5,5],[1,1,1]]I want to multiply and SUMPRODUCT each group of elements to get c = [[30],[8]]Which result from = [[1*5+2*5+3*5],[2*1,4*1,2*1]] Ive t…

Modifying viridis colormap (replacing some colors)

Ive searched around and found things that came close to working but nothing exactly suiting what I need. Basically, I really like the viridis colormap as a starting point. However, I would like to repl…

gtk minimum size

Is there an easy way to request that a GTK widget have a minimum width/height? I know you can do it on the column of a TreeView, but is it available for general widgets?

How do I convert a json file to a python class?

Consider this json file named h.json I want to convert this into a python dataclass. {"acc1":{"email":"[email protected]","password":"acc1","name&…

PyTorch how to compute second order Jacobian?

I have a neural network thats computing a vector quantity u. Id like to compute first and second-order jacobians with respect to the input x, a single element. Would anybody know how to do that in PyTo…

Tensorflow setup on RStudio/ R | CentOS

For the last 5 days, I am trying to make Keras/Tensorflow packages work in R. I am using RStudio for installation and have used conda, miniconda, virtualenv but it crashes each time in the end. Install…