When is numba effective?

2024/10/18 15:13:00

I know numba adds some overhead, and in some situations (non-intensive computation) it becomes slower than pure Python. What I don't know is where to draw the line. Is it possible to use the order of algorithmic complexity to figure out where?

For example, for adding two arrays (~O(n)) shorter than 5, pure Python is faster in this code:

import numpy as np
import numba

def sum_1(a, b):
    result = 0.0
    for i, j in zip(a, b):
        result += (i + j)
    return result

@numba.jit('float64(float64[:], float64[:])')
def sum_2(a, b):
    result = 0.0
    for i, j in zip(a, b):
        result += (i + j)
    return result

# try 100
a = np.linspace(1.0,2.0,5)
b = np.linspace(1.0,2.0,5)
print("pure python: ")
%timeit -o sum_1(a,b)
print("\n\n\n\npython + numba: ")
%timeit -o sum_2(a,b)

UPDATE: what I am looking for is a guideline similar to this one:

"A general guideline is to choose different targets for different data sizes and algorithms. The “cpu” target works well for small data sizes (approx. less than 1KB) and low compute intensity algorithms. It has the least amount of overhead. The “parallel” target works well for medium data sizes (approx. less than 1MB). Threading adds a small delay. The “cuda” target works well for big data sizes (approx. greater than 1MB) and high compute intensity algorithms. Transfering memory to and from the GPU adds significant overhead."

Answer

It's hard to draw the line for when numba becomes effective. However, there are a few indicators that it might not be:

  • If you cannot use jit with nopython=True - whenever you cannot compile it in nopython mode, you are either trying to compile too much or it won't be significantly faster.

  • If you don't use arrays - when you pass lists or other Python types to the numba function (except from other numba functions), numba needs to copy them, which incurs a significant overhead (see the sketch after this list).

  • If there is already a NumPy or SciPy function that does it - even if numba can be significantly faster for short arrays, NumPy/SciPy will almost always be about as fast for longer arrays (and you might easily overlook common edge cases that they handle).
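To make the second and third points concrete, here is a rough, hand-rolled sketch (the function name is mine, and absolute timings will vary with machine and numba version):

import numpy as np
import numba as nb
from timeit import timeit

@nb.njit
def loop_sum(arr):
    total = 0.0
    for x in arr:
        total += x
    return total

arr = np.random.rand(1000)
lst = arr.tolist()

# Warm up both specializations so compilation is not part of the timings.
# (Passing a plain list relies on numba's "reflected list" support, which
# newer versions flag with a deprecation warning.)
loop_sum(arr)
loop_sum(lst)

print("numba + array:", timeit(lambda: loop_sum(arr), number=10_000))  # no conversion needed
print("numba + list: ", timeit(lambda: loop_sum(lst), number=10_000))  # converted on every call
print("plain np.sum: ", timeit(lambda: arr.sum(), number=10_000))      # NumPy already does it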

There's also another reason why you might not want to use numba when it's just "a bit" faster than other solutions: numba functions have to be compiled, either ahead of time or when first called, and in some situations the compilation costs much more than your gain, even if you call the function hundreds of times. The compilation times also add up: numba is slow to import, and compiling the numba functions adds further overhead. It doesn't make sense to shave off a few milliseconds if the import overhead increases by 1-10 seconds.
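If you want to see how much of your runtime goes into compilation, a rough sketch is to time the first call (which triggers the JIT) separately from later calls; cache=True, which stores the compiled result on disk, can help across interpreter sessions, but it does not remove the import cost:

import time
import numpy as np
import numba as nb

@nb.njit(cache=True)  # cache=True keeps the compiled function on disk between runs
def scaled_sum(a):
    total = 0.0
    for x in a:
        total += x * 2.0
    return total

a = np.linspace(1.0, 2.0, 1000)

t0 = time.perf_counter()
scaled_sum(a)                      # first call: includes JIT compilation (unless cached)
t1 = time.perf_counter()
scaled_sum(a)                      # later calls: only the compiled machine code runs
t2 = time.perf_counter()

print(f"first call (with compile): {t1 - t0:.4f} s")
print(f"second call:               {t2 - t1:.6f} s")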

Also, numba is complicated to install (at least without conda), so if you want to share your code you have a really "heavy" dependency.


Your example lacks a comparison with NumPy methods and with a highly optimized pure-Python version. I added some more comparison functions and did a benchmark (using my library simple_benchmark):

import numpy as np
import numba as nb
from itertools import chain

def python_loop(a, b):
    result = 0.0
    for i, j in zip(a, b):
        result += (i + j)
    return result

@nb.njit
def numba_loop(a, b):
    result = 0.0
    for i, j in zip(a, b):
        result += (i + j)
    return result

def numpy_methods(a, b):
    return a.sum() + b.sum()

def python_sum(a, b):
    return sum(chain(a.tolist(), b.tolist()))

from simple_benchmark import benchmark, MultiArgument

arguments = {
    2**i: MultiArgument([np.zeros(2**i), np.zeros(2**i)])
    for i in range(2, 17)
}
b = benchmark([python_loop, numba_loop, numpy_methods, python_sum], arguments, warmups=[numba_loop])

%matplotlib notebook
b.plot()

[Benchmark plot: runtime versus array size for python_loop, numba_loop, numpy_methods and python_sum]

Yes, the numba function is fastest for small arrays; however, the NumPy solution will be slightly faster for longer arrays. The pure-Python solutions are slower, but the "faster" of the two (python_sum) is already significantly faster than your originally proposed solution.

In this case I would simply use the NumPy solution because it's short, readable and fast - except when you're dealing with lots of short arrays and call the function many times; then the numba solution would be significantly better.

