NumPy vectorisation of Python object array

2024/10/7 2:32:46

Just a short question that I can't find the answer to before I head off for the day.

When I do something like this:

v1 = float_list_python = ... # <some list of floats>
v2 = float_array_NumPy = ... # <some numpy.ndarray of floats>
# I guess they don't have to be floats -
# but some object that also has a native
# object in C, so that numpy can just use
# that

If I want to multiply these vectors by a scalar, my understanding has always been that the Python list is a list of object references, so looping through the list to do the multiplication must first fetch the location of each float and then fetch the float itself, which is one of the reasons it's slow.

If I do the same thing in NumPy, then, well, I'm not sure what happens. There are a number of things I imagine could happen:

  1. It splits the multiplication up across the cores.
  2. It vectorises the multiplications (as well?)

The documentation I've found suggests that many of the primitives in NumPy take advantage of the first option whenever they can (I don't have a computer on hand at the moment to test it on). And my intuition tells me that number 2 should happen whenever it's possible.

So my question is: if I create a NumPy array of Python objects, will it still at least perform operations on the array in parallel? I know that if you create an array of objects that have native C types, NumPy creates a contiguous block in memory holding the actual values, and that if you create a NumPy array of Python objects it creates an array of references. But I don't see why this would rule out parallel operations on that array, and I cannot find anywhere that explicitly states it.

EDIT: I feel there's a bit of confusion over what I'm asking. I understand what vectorisation is, I understand that it is a compiler optimisation, and not something you necessarily program in (though laying out the data so that it's contiguous in memory is important). On the grounds of vectorisation, all I wanted to know was whether or not NumPy uses it. If I do something like np_array1 * np_array2, does the underlying library call use vectorisation (presuming that dtype is a compatible type)?

For the splitting up over the cores, all I mean is: if I again do something like np_array1 * np_array2, but this time with dtype=object, would it divide that work up amongst the cores?

Answer

numpy is fast because it performs numeric operations like this in fast compiled C code. In contrast, the list operation runs at the interpreted Python level (streamlined as much as possible with Python bytecodes etc.).

A numpy array of numeric type stores those numbers in a data buffer. At least in the simple cases this is just a block of bytes that C code can step through efficiently. The array also has shape and strides information that allows multidimensional access.
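As a small sketch of the buffer, shape and strides description above (the array `a` and its values are just an illustration, not something from the question):

```python
import numpy as np

# A 2x3 array of float64: six 8-byte numbers stored in one flat block.
a = np.arange(6, dtype=np.float64).reshape(2, 3)

print(a.dtype)    # float64
print(a.shape)    # (2, 3)
print(a.nbytes)   # 48 bytes in the data buffer (6 values * 8 bytes)

# Strides say how many bytes to step to reach the next element along
# each axis: 24 bytes to the next row, 8 bytes to the next column.
print(a.strides)  # (24, 8)

# The transpose is a view on the same buffer; only shape and strides change.
t = a.T
print(t.strides)                 # (8, 24)
print(np.shares_memory(a, t))    # True
```

Nothing is copied when taking the transpose, which is why strides (rather than a rearranged buffer) are what give multidimensional access.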

When you multiply the array by a scalar, it, in effect, calls a C function titled something like 'multiply_array_by_scalar', which does the multiplication in fast compiled code. So this kind of numpy operation is fast (compared to Python list code) regardless of the number of cores or any other multi-processing/threading enhancements.
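A rough way to see the difference on your own machine is a timing sketch like the one below. The names `scale_list` and `scale_array`, the size `n` and the repeat count are arbitrary choices for illustration; actual timings depend on hardware and NumPy version, so none are shown here.

```python
import timeit

import numpy as np

n = 100_000
float_list = [float(i) for i in range(n)]
float_array = np.array(float_list)  # float64 array with the same values


def scale_list(values, k):
    # Interpreted loop: each step dereferences a float object and
    # creates a new float object for the result.
    return [k * v for v in values]


def scale_array(values, k):
    # A single call into compiled code that walks the data buffer.
    return values * k


list_time = timeit.timeit(lambda: scale_list(float_list, 2.0), number=20)
array_time = timeit.timeit(lambda: scale_array(float_array, 2.0), number=20)

print(f"list comprehension: {list_time:.4f} s")
print(f"numpy array:        {array_time:.4f} s")
```

Both functions produce the same numbers; only the place where the loop runs differs.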

Arrays of objects do not have any special speed advantage (compared to lists), at least not at this time.
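One way to see why is to watch what an object array does when multiplied. In this sketch `Counted` is a hypothetical class made up for the illustration; it counts how often its Python-level `__mul__` is called.

```python
import numpy as np


class Counted:
    """Hypothetical wrapper that counts Python-level multiplications."""

    calls = 0

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        Counted.calls += 1
        return Counted(self.value * other)


# Fill an object array by iteration; each slot holds a reference.
items = [Counted(float(i)) for i in range(3)]
obj_array = np.empty(3, dtype=object)
for i, item in enumerate(items):
    obj_array[i] = item

print(obj_array[0] is items[0])  # True: the array stores the same object

result = obj_array * 2

print(result.dtype)                # object
print(Counted.calls)               # 3: one Python __mul__ call per element
print([r.value for r in result])   # [0.0, 2.0, 4.0]
```

The multiplication still goes through each object's own Python method one element at a time, so the work per element is much the same as in a list comprehension.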

Look at my answer to a question about creating an array of arrays: https://stackoverflow.com/a/28284526/901925. I had to use iteration to initialize the values.

Have you done any time experiments? For example, construct an array, say (1000,2). Use tolist() to create an equivalent list of lists. And make a similar array of objects, with each object being a (2,) array or list (how much work did that take?). Now do something simple like len(x) for each of those sublists.
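A minimal sketch of that experiment might look like this. The variable and function names and the repeat count are arbitrary choices, and the timings will vary by machine, so none are quoted.

```python
import timeit

import numpy as np

arr = np.arange(2000).reshape(1000, 2)   # numeric (1000, 2) array
nested = arr.tolist()                    # equivalent list of lists

# Object array whose elements are the 2-item lists; it has to be
# filled by iteration, one reference per slot.
obj = np.empty(1000, dtype=object)
for i, row in enumerate(nested):
    obj[i] = row


def lens_of_list():
    return [len(x) for x in nested]


def lens_of_object_array():
    return [len(x) for x in obj]


def lens_of_numeric_rows():
    # Iterating a 2-D array creates a new 1-D view for every row.
    return [len(x) for x in arr]


for fn in (lens_of_list, lens_of_object_array, lens_of_numeric_rows):
    elapsed = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {elapsed:.4f} s")
```

All three loops return the same lengths, so any difference in the printed times comes from how each container hands out its elements.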
