Get the number of nonzero elements in a numpy array?

2024/11/15 0:01:00

Is it possible to get the number of nonzero elements in a numpy array without iterating over the array or masking it? Speed is the main goal.

Essentially, something like len(array).where(array != 0).

If it changes the answer, each row will begin with zeros. The array is filled on the diagonal with zeros.

Answer

Assuming you mean total number of nonzero elements (and not total number of nonzero rows):

In [12]: a = np.random.randint(0, 3, size=(100,100))

In [13]: timeit len(a.nonzero()[0])
1000 loops, best of 3: 306 us per loop

In [14]: timeit (a != 0).sum()
10000 loops, best of 3: 46 us per loop

or even better:

In [22]: timeit np.count_nonzero(a)
10000 loops, best of 3: 39 us per loop
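
As a quick sanity check, the three expressions timed above should all return the same count. The sketch below assumes a reasonably recent NumPy (the `default_rng` generator and the `axis` argument of `count_nonzero` are not used in the original answer) and also shows one possible way to get per-row counts and the number of rows that contain any nonzero value, in case that is what is wanted instead.

```python
import numpy as np

# Seeded generator so the example is reproducible (an assumption; the
# original answer used np.random.randint without a seed).
rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=(100, 100))

# Three ways to count nonzero elements over the whole array.
n_via_nonzero = len(a.nonzero()[0])   # builds index arrays, then takes their length
n_via_sum = int((a != 0).sum())       # builds a boolean mask, then sums it
n_via_count = int(np.count_nonzero(a))  # counts directly

assert n_via_nonzero == n_via_sum == n_via_count

# Per-row counts of nonzero elements.
per_row = np.count_nonzero(a, axis=1)

# Number of rows that contain at least one nonzero element.
nonzero_rows = int(np.count_nonzero(a.any(axis=1)))

print(n_via_count, per_row.shape, nonzero_rows)
```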

np.count_nonzero also behaves well when the array is small, whereas the sum trick is the slowest of the three there:

In [33]: a = np.random.randint(0, 3, size=(10,10))

In [34]: timeit len(a.nonzero()[0])
100000 loops, best of 3: 6.18 us per loop

In [35]: timeit (a != 0).sum()
100000 loops, best of 3: 13.5 us per loop

In [36]: timeit np.count_nonzero(a)
1000000 loops, best of 3: 686 ns per loop
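
The timings above come from one machine and one NumPy version, so the absolute numbers will differ elsewhere. A minimal sketch for repeating the comparison outside IPython with the standard `timeit` module might look like this; the `benchmark` helper and its parameters are hypothetical names introduced for illustration, not part of the original answer.

```python
import timeit

import numpy as np


def benchmark(shape, repeat=5, number=200):
    """Return the best per-call time in microseconds for each approach.

    Hypothetical helper: `repeat` and `number` are passed to timeit.repeat.
    """
    rng = np.random.default_rng(0)
    a = rng.integers(0, 3, size=shape)

    candidates = {
        "len(a.nonzero()[0])": lambda: len(a.nonzero()[0]),
        "(a != 0).sum()": lambda: (a != 0).sum(),
        "np.count_nonzero(a)": lambda: np.count_nonzero(a),
    }

    results = {}
    for label, func in candidates.items():
        # Take the minimum over several runs, as IPython's timeit does.
        best = min(timeit.repeat(func, repeat=repeat, number=number))
        results[label] = best / number * 1e6
    return results


for shape in [(10, 10), (100, 100)]:
    for label, us in benchmark(shape).items():
        print(f"{shape} {label:<22} {us:8.2f} us per call")
```

Taking the minimum of several repeats reduces noise from other processes, but the ranking can still shift with array size and dtype, so it is worth measuring on the actual data.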