An efficient way to calculate the mean of each column or row of non-zero elements

2024/9/29 19:15:11

I have a NumPy array of ratings that users have given to movies. Each rating is between 1 and 5, and 0 means the user has not rated that movie. I want to calculate the average rating of each movie and the average rating of each user. In other words, I need the mean of each column or row, counting only the non-zero elements.

Is there an efficient NumPy function to handle this case? I know I could solve the problem by manually iterating over the ratings by column or row.

Thanks in advance!

Answer

Since the values to discard are 0, they contribute nothing to a sum. You can therefore compute the mean manually by summing along an axis and then dividing by the number of non-zero elements along the same axis:

import numpy as np
a = np.array([[8., 9, 7, 0], [0, 0, 5, 6]])
# row sums divided by the count of non-zero entries in each row
a.sum(1) / (a != 0).sum(1)

results in:

array([ 8. ,  5.5])

As you can see, the zeros are not counted in the mean. Use axis 0 instead of 1 to get the mean of each column.
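One case the expression above does not cover is a row or column with no ratings at all, where the count of non-zero elements is 0 and the division produces a warning. The following is a minimal sketch of one way to handle that, not part of the original answer: the helper name `nonzero_mean` and the sample `ratings` array are hypothetical, and returning NaN for empty rows or columns is an assumption about what is wanted.

```python
import numpy as np

def nonzero_mean(ratings, axis):
    """Mean of the non-zero entries along `axis`; NaN where there are none.

    Hypothetical helper for illustration only.
    """
    ratings = np.asarray(ratings, dtype=float)
    totals = ratings.sum(axis=axis)                # zeros add nothing to the sum
    counts = np.count_nonzero(ratings, axis=axis)  # number of actual ratings
    out = np.full(totals.shape, np.nan)            # default for "no ratings"
    # Divide only where at least one rating exists; other slots stay NaN.
    np.divide(totals, counts, out=out, where=counts > 0)
    return out

# Rows are users, columns are movies; 0 means "not rated" (made-up data).
ratings = np.array([[5, 0, 3],
                    [4, 0, 0],
                    [0, 0, 1]])

user_means = nonzero_mean(ratings, axis=1)   # one value per row (user)
movie_means = nonzero_mean(ratings, axis=0)  # one value per column (movie)
```

Here the second movie has no ratings, so its entry in `movie_means` is NaN rather than the result of a division by zero.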

https://en.xdnf.cn/q/71169.html
