Performance difference between filling existing numpy array and creating a new one

2024/10/18 15:43:16

In iterative algorithms, it is common to reuse large numpy arrays many times. Frequently the arrays need to be manually "reset" on each iteration. Is there a performance difference between filling an existing array (with NaNs or 0s) and creating a new array? If so, why?

Answer

The answer depends on the size of your arrays. While allocating a new memory region takes a nearly fixed amount of time, the time to fill that region grows linearly with its size. However, filling newly allocated memory with numpy.zeros is nearly twice as fast as filling an existing array with ndarray.fill, and about three times faster than item setting (x[:] = 0).

So on my machine, filling vectors with fewer than roughly 800 elements is faster than creating new vectors; beyond roughly 800 elements, creating new vectors becomes faster.
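A minimal timeit sketch of that comparison (an illustrative benchmark, not the answerer's original script; the ~800-element crossover and the exact ratios will vary with machine, NumPy version, and dtype):

    import timeit
    import numpy as np

    def bench(n, repeats=1000):
        """Time three ways to obtain a length-n array of zeros."""
        x = np.empty(n)

        t_zeros = timeit.timeit(lambda: np.zeros(n), number=repeats)  # allocate a fresh array
        t_fill = timeit.timeit(lambda: x.fill(0.0), number=repeats)   # in-place ndarray.fill

        def set_slice():
            x[:] = 0.0                                                # in-place item setting
        t_slice = timeit.timeit(set_slice, number=repeats)

        print(f"n={n:>9}  zeros: {t_zeros:.4f}s  fill: {t_fill:.4f}s  x[:]=0: {t_slice:.4f}s")

    for n in (100, 800, 10_000, 1_000_000):
        bench(n)

One caveat when reading such numbers: for large arrays, np.zeros can look deceptively cheap because the allocator may hand back lazily zeroed pages whose cost is only paid on first write.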

https://en.xdnf.cn/q/72430.html

Related Q&A

Set space between boxplots in Python graphs generated as nested box plots with Seaborn?

I am trying to set a space between the boxplots (between the green and orange boxes) created with the Python Seaborn module's sns.boxplot(). Please see the attached graph, where the green and orange subplot …

Geocoding using Geopy and Python

I am trying to geocode a CSV file that contains the name of the location and a parsed-out address, which includes address number, street name, city, zip, and country. I want to use GEOPY and ArcGIS Geocodes…

Making Python scripts work with xargs

What would be the process of making my Python scripts work well with xargs? For instance, I would like the following command to work through each line of a text file and execute an arbitrary command: c…

TypeError: expected string or buffer in Google App Engine's Python

I want to show the content of an object using the following code: def get(self):url="https://www.googleapis.com/language/translate/v2?key=MY-BILLING-KEY&q=hello&source=en&target=ja&quo…

Returning a row from a CSV, if specified value within the row matches condition

Ahoy, I'm writing a Python script to filter some large CSV files. I only want to keep rows which meet my criteria. My input is a CSV file in the following format: Locus Total_Depth Average_Depth_sa…

Python multiprocessing pool: dynamically set number of processes during execution of tasks

We submit large CPU intensive jobs in Python 2.7 (that consist of many independent parallel processes) on our development machine which last for days at a time. The responsiveness of the machine slows …

TypeError: can't escape psycopg2.extensions.Binary to binary

I am trying to store a binary file into PostgreSQL through SQLAlchemy; the file is uploaded from the client. A bit of googling on the error message brings me to this source file: "wrapped object is not bytes or a…

Keras: Cannot Import Name np_utils [duplicate]

This question already has answers here: ImportError: cannot import name np_utils (19 answers). Closed 6 years ago. I'm using Python 2.7 and a Jupyter notebook to do some basic machine learning. I'm following…

Python 3 string index lookup is O(1)?

Short story: Is Python 3 unicode string lookup O(1) or O(n)? Long story: Index lookup of a character in a C char array is constant time O(1) because we can with certainty jump to a contiguous memory loca…

Using PIL to detect a scan of a blank page

So I often run huge double-sided scan jobs on an unintelligent Canon multifunction, which leaves me with a huge folder of JPEGs. Am I insane to consider using PIL to analyze a folder of images to detec…