Reasons for slowness in numpy.dot() and how to mitigate them when custom classes are used

2024/10/14 3:18:44

I am profiling a numpy dot product call.

numpy.dot(pseudo,pseudo)

pseudo is a numpy array of custom objects, defined as:

pseudo = numpy.array([[PseudoBinary(1), PseudoBinary(0), PseudoBinary(1)],[PseudoBinary(1), PseudoBinary(0), PseudoBinary(0)],[PseudoBinary(1), PseudoBinary(0), PseudoBinary(1)]])

PseudoBinary is a class with a custom multiply function: it ORs its operands instead of multiplying them. See below for the complete definition of PseudoBinary.

Type:

(Pdb) pseudo.dtype
dtype('O')

According to my profiling, the pseudo dot product is about 500 times slower than a dot product using matrices with integer values. A pointer to the profiling code is given below.
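For readers who want to reproduce this kind of comparison, here is a minimal timing sketch. It is not the profiling code referred to in this question; the matrix size n, the random seed, and the trimmed-down PseudoBinary stand-in are assumptions made only for illustration, and the measured ratio will vary by machine and NumPy version.

```python
import timeit

import numpy as np


class PseudoBinary:
    """Trimmed stand-in for the class in this question: * is OR, + is integer addition."""

    def __init__(self, i):
        self.i = i

    def __mul__(self, rhs):
        return PseudoBinary(self.i or rhs.i)

    def __add__(self, rhs):
        return PseudoBinary(self.i + rhs.i)


n = 30  # hypothetical size, larger than the 3x3 example to reduce timer noise
rng = np.random.default_rng(0)
base = rng.integers(0, 2, size=(n, n))

# Wrap every integer in a PseudoBinary; the resulting array has dtype('O').
pseudo = np.empty((n, n), dtype=object)
for idx, value in np.ndenumerate(base):
    pseudo[idx] = PseudoBinary(int(value))

# Take the best of a few repeats to dampen scheduling noise.
t_int = min(timeit.repeat(lambda: np.dot(base, base), number=5, repeat=3))
t_obj = min(timeit.repeat(lambda: np.dot(pseudo, pseudo), number=5, repeat=3))

print("int dot:    %.6f s" % t_int)
print("object dot: %.6f s" % t_obj)
print("ratio:      %.0fx" % (t_obj / t_int))
```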

I am interested in the reasons for the slowness and whether there are ways to mitigate them.

Some of the reasons for the slowness may be:

  • The memory layout of pseudo may not be contiguous. According to this, numpy stores pointers for object types, so matrix multiplication may incur many pointer dereferences instead of reading directly from contiguous memory.

  • Numpy multiplication may not use the optimized compiled implementations (BLAS, ATLAS, etc.). According to this, several conditions must hold for numpy to use the optimized implementation, and custom objects may break them.

Are there other factors in play? Any recommendations for improvement?

The starting point of all this was this question. There, the OP is looking for a "custom dot product": an operation that visits the elements of two matrices the way the dot product does, but does something other than multiply the corresponding elements of rows and columns. In an answer, I recommended a custom object that overrides the __mul__ method. But numpy.dot performance is very slow with that approach. The code that does the performance measurement can be seen in that answer too.


Code showing the PseudoBinary class and dot product execution.

#!/usr/bin/env python
from __future__ import absolute_import
from __future__ import print_function
import numpy

class PseudoBinary(object):
    def __init__(self,i):
        self.i = i

    def __mul__(self,rhs):
        return PseudoBinary(self.i or rhs.i)

    __rmul__ = __mul__
    __imul__ = __mul__

    def __add__(self,rhs):
        return PseudoBinary(self.i + rhs.i)

    __radd__ = __add__
    __iadd__ = __add__

    def __str__(self):
        return "P"+str(self.i)

    __repr__ = __str__

base = numpy.array([[1, 0, 1],[1, 0, 0],[1, 0, 1]])
pseudo = numpy.array([[PseudoBinary(1), PseudoBinary(0), PseudoBinary(1)],
                      [PseudoBinary(1), PseudoBinary(0), PseudoBinary(0)],
                      [PseudoBinary(1), PseudoBinary(0), PseudoBinary(1)]])

baseRes = numpy.dot(base,base)
pseudoRes = numpy.dot(pseudo,pseudo)

print("baseRes\n",baseRes)
print("pseudoRes\n",pseudoRes)

Prints:

baseRes
 [[2 0 2]
 [1 0 1]
 [2 0 2]]
pseudoRes
 [[P3 P2 P2]
 [P3 P1 P2]
 [P3 P2 P2]]

Answer

Pretty much anything you do with object arrays is going to be slow. None of the reasons NumPy is usually fast apply to object arrays.

  • Object arrays cannot store their elements contiguously. They must store and dereference pointers.
    • They don't know how much space they would have to allocate for their elements.
    • Their elements may not all be the same size.
    • The elements you insert into an object array have already been allocated outside the array, and they cannot be copied.
  • Object arrays must perform dynamic dispatch on all element operations. Every time they add or multiply two elements, they have to figure out how to do that all over again.
  • Object arrays have no way to accelerate the implementation of their elements, such as your slow, interpreted __add__ and __mul__.
  • Object arrays cannot avoid the memory allocation associated with their element operations, such as the allocation of a new PseudoBinary object and a new __dict__ for that object on every element __add__ or __mul__.
  • Object arrays cannot parallelize operations, as all operations on their elements will require the GIL to be held.
  • Object arrays cannot use LAPACK or BLAS, as there are no LAPACK or BLAS functions for arbitrary Python datatypes.
  • Etc.

Basically, every reason doing Python math without NumPy is slow also applies to doing anything with object arrays.
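To make the dynamic-dispatch and allocation points concrete, here is a small sketch that counts how often the Python-level methods run during one dot product. The Counting class and the calls dictionary are hypothetical names introduced only for this illustration; the class mirrors the OR-for-multiply behaviour of PseudoBinary.

```python
import numpy as np

# Hypothetical counters, used only to observe the Python-level calls.
calls = {"mul": 0, "add": 0}


class Counting:
    """Same arithmetic as PseudoBinary, plus call counting."""

    def __init__(self, i):
        self.i = i

    def __mul__(self, rhs):
        calls["mul"] += 1
        return Counting(self.i or rhs.i)  # allocates a new object per product

    def __add__(self, rhs):
        calls["add"] += 1
        return Counting(self.i + rhs.i)  # allocates a new object per sum


values = [[1, 0, 1], [1, 0, 0], [1, 0, 1]]
arr = np.empty((3, 3), dtype=object)
for r, row in enumerate(values):
    for c, v in enumerate(row):
        arr[r, c] = Counting(v)

result = np.dot(arr, arr)

# Every scalar multiplication went through the interpreter.
print("Python-level __mul__ calls:", calls["mul"])
print("Python-level __add__ calls:", calls["add"])
```

Each of those calls runs interpreted bytecode and creates a fresh object, which is work an integer array never does.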


As for how to improve your performance? Don't use object arrays. Use regular arrays, and either find a way to implement the thing you want in terms of the operations NumPy provides, or write out the loops explicitly and use something like Numba or Cython to compile your code.
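As one possible sketch of that advice for this particular "OR instead of multiply, then add" product, assuming the inputs are plain integer arrays containing only 0 and 1: the function names below are made up for illustration. The first two versions use only ordinary NumPy operations; the third writes the loops out explicitly, which is the shape of code one would hand to Numba or Cython (neither is applied here).

```python
import numpy as np


def or_dot_broadcast(a, b):
    """OR every a[i, k] with b[k, j], then sum over k.

    Builds an (n, k, m) temporary, so memory grows with n * k * m.
    """
    return (a[:, :, None] | b[None, :, :]).sum(axis=1)


def or_dot_algebraic(a, b):
    """For 0/1 values, (x or y) == x + y - x*y, so the sum over k
    splits into row sums, column sums, and an ordinary matrix product."""
    return a.sum(axis=1)[:, None] + b.sum(axis=0)[None, :] - a @ b


def or_dot_loops(a, b):
    """Explicit loops: slow in pure Python, but a candidate for compilation."""
    n, k = a.shape
    m = b.shape[1]
    out = np.zeros((n, m), dtype=np.int64)
    for i in range(n):
        for j in range(m):
            total = 0
            for p in range(k):
                total += a[i, p] | b[p, j]
            out[i, j] = total
    return out


base = np.array([[1, 0, 1], [1, 0, 0], [1, 0, 1]])
print(or_dot_broadcast(base, base))
print(or_dot_algebraic(base, base))
print(or_dot_loops(base, base))
```

For the 3x3 example in the question, all three return the integers that pseudoRes holds (3 2 2 / 3 1 2 / 3 2 2), without creating any Python objects per element in the first two versions.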

https://en.xdnf.cn/q/118002.html
