Numpy: fast calculations considering items neighbors and their position inside the array

2024/9/19 9:54:32

I have 4 2D numpy arrays, called a, b, c, d, each of them made of n rows and m columns. What I need to do is giving to each element of b and d a value calculated as follows (pseudo-code):

min_coords = min_of_neighbors_coords(x, y)
b[x,y] = a[x,y] * a[min_coords];
d[x,y] = c[min_coords];

Where min_of_neighbors_coords is a function that, given the coordinates of an element of the array, returns the coordinates of the 'neighbor' element that has the lower value. I.e., considering the array:

1, 2, 5
3, 7, 2
2, 3, 6

min_of_neighbors_coords(1, 1) will refer to the central element with the value of 7, and will return the tuple (0, 0): the coordinates of the number 1.

I managed to do this using for loops (element per element), but the algorithm is VERY slow and I'm searching a way to improve it, avoiding loops and demanding the calculations to numpy.

Is it possible?

Answer

EDIT I have kept my original answer at the bottom. As Paul points out in the comments, the original answer didn't really answer the OP's question, and could be more easily achieved with an ndimage filter. The following much more cumbersome function should do the right thing. It takes two arrays, a and c, and returns the windowed minimum of a and the values in c at the positions of the windowed minimums in a:

def neighbor_min(a, c):ac = np.concatenate((a[None], c[None]))rows, cols = ac.shape[1:]ret = np.empty_like(ac)# Fill in the centerwin_ac = as_strided(ac, shape=(2, rows-2, cols, 3),strides=ac.strides+ac.strides[1:2])win_ac = win_ac[np.ogrid[:2, :rows-2, :cols] +[np.argmin(win_ac[0], axis=2)]]win_ac = as_strided(win_ac, shape=(2, rows-2, cols-2, 3),strides=win_ac.strides+win_ac.strides[2:3])ret[:, 1:-1, 1:-1] =  win_ac[np.ogrid[:2, :rows-2, :cols-2] +[np.argmin(win_ac[0], axis=2)]]# Fill the top, bottom, left and right borderswin_ac = as_strided(ac[:, :2, :], shape=(2, 2, cols-2, 3),strides=ac.strides+ac.strides[2:3])win_ac = win_ac[np.ogrid[:2, :2, :cols-2] +[np.argmin(win_ac[0], axis=2)]]ret[:, 0, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),np.ogrid[:cols-2]]win_ac = as_strided(ac[:, -2:, :], shape=(2, 2, cols-2, 3),strides=ac.strides+ac.strides[2:3])win_ac = win_ac[np.ogrid[:2, :2, :cols-2] +[np.argmin(win_ac[0], axis=2)]]ret[:, -1, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),np.ogrid[:cols-2]]win_ac = as_strided(ac[:, :, :2], shape=(2, rows-2, 2, 3),strides=ac.strides+ac.strides[1:2])win_ac = win_ac[np.ogrid[:2, :rows-2, :2] +[np.argmin(win_ac[0], axis=2)]]ret[:, 1:-1, 0] = win_ac[:, np.ogrid[:rows-2],np.argmin(win_ac[0], axis=1)]win_ac = as_strided(ac[:, :, -2:], shape=(2, rows-2, 2, 3),strides=ac.strides+ac.strides[1:2])win_ac = win_ac[np.ogrid[:2, :rows-2, :2] +[np.argmin(win_ac[0], axis=2)]]ret[:, 1:-1, -1] = win_ac[:, np.ogrid[:rows-2],np.argmin(win_ac[0], axis=1)]# Fill the cornerswin_ac = ac[:, :2, :2]win_ac = win_ac[:, np.ogrid[:2],np.argmin(win_ac[0], axis=-1)]ret[:, 0, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]win_ac = ac[:, :2, -2:]win_ac = win_ac[:, np.ogrid[:2],np.argmin(win_ac[0], axis=-1)]ret[:, 0, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]win_ac = ac[:, -2:, -2:]win_ac = win_ac[:, np.ogrid[:2],np.argmin(win_ac[0], axis=-1)]ret[:, -1, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]win_ac = ac[:, -2:, :2]win_ac = win_ac[:, np.ogrid[:2],np.argmin(win_ac[0], axis=-1)]ret[:, -1, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]return ret

The return is a (2, rows, cols) array that can be unpacked into the two arrays:

>>> a = np.random.randint(100, size=(5,5))
>>> c = np.random.randint(100, size=(5,5))
>>> a
array([[42, 54, 18, 88, 26],[80, 65, 83, 31,  4],[51, 52, 18, 88, 52],[ 1, 70,  5,  0, 89],[47, 34, 27, 67, 68]])
>>> c
array([[94, 94, 29,  6, 76],[81, 47, 67, 21, 26],[44, 92, 20, 32, 90],[81, 25, 32, 68, 25],[49, 43, 71, 79, 77]])
>>> neighbor_min(a, c)
array([[[42, 18, 18,  4,  4],[42, 18, 18,  4,  4],[ 1,  1,  0,  0,  0],[ 1,  1,  0,  0,  0],[ 1,  1,  0,  0,  0]],[[94, 29, 29, 26, 26],[94, 29, 29, 26, 26],[81, 81, 68, 68, 68],[81, 81, 68, 68, 68],[81, 81, 68, 68, 68]]])

The OP's case could then be solved as:

def bd_from_ac(a, c):b,d = neighbor_min(a, c)return a*b, d

And while there is a serious performance hit, it is pretty fast still:

In [3]: a = np.random.rand(1000, 1000)In [4]: c = np.random.rand(1000, 1000)In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 570 ms per loop

You are not really using the coordinates of the minimum neighboring element for anything else than fetching it, so you may as well skip that part and create a min_neighbor function. If you don't want to resort to cython for fast looping, you are going to have to go with rolling window views, such as outlined in Paul's link. This will typically convert your (m, n) array into a (m-2, n-2, 3, 3) view of the same data, and you would then apply np.min over the last two axes.

Unfortunately you have to apply it one axis at a time, so you will have to create a (m-2, n-2, 3) copy of your data. Fortunately, you can compute the minimum in two steps, first windowing and minimizing along one axis, then along the other, and obtain the same result. So at most you are going to have intermediate storage the size of your input. If needed, you could even reuse the output array as intermediate storage and avoid memory allocations, but that is left as exercise...

The following function does that. It is kind of lengthy because it has to deal not only with the central area, but also with the special cases of the four edges and four corners. Other than that it is a pretty compact implementation:

def neighbor_min(a):rows, cols = a.shaperet = np.empty_like(a)# Fill in the centerwin_a = as_strided(a, shape=(m-2, n, 3),strides=a.strides+a.strides[:1])win_a = win_a.min(axis=2)win_a = as_strided(win_a, shape=(m-2, n-2, 3),strides=win_a.strides+win_a.strides[1:])ret[1:-1, 1:-1] = win_a.min(axis=2)# Fill the top, bottom, left and right borderswin_a = as_strided(a[:2, :], shape=(2, cols-2, 3),strides=a.strides+a.strides[1:])ret[0, 1:-1] = win_a.min(axis=2).min(axis=0)win_a = as_strided(a[-2:, :], shape=(2, cols-2, 3),strides=a.strides+a.strides[1:])ret[-1, 1:-1] = win_a.min(axis=2).min(axis=0)win_a = as_strided(a[:, :2], shape=(rows-2, 2, 3),strides=a.strides+a.strides[:1])ret[1:-1, 0] = win_a.min(axis=2).min(axis=1)win_a = as_strided(a[:, -2:], shape=(rows-2, 2, 3),strides=a.strides+a.strides[:1])ret[1:-1, -1] = win_a.min(axis=2).min(axis=1)# Fill the cornersret[0, 0] = a[:2, :2].min()ret[0, -1] = a[:2, -2:].min()ret[-1, -1] = a[-2:, -2:].min()ret[-1, 0] = a[-2:, :2].min()return ret

You can now do things like:

>>> a = np.random.randint(10, size=(5, 5))
>>> a
array([[0, 3, 1, 8, 9],[7, 2, 7, 5, 7],[4, 2, 6, 1, 9],[2, 8, 1, 2, 3],[7, 7, 6, 8, 0]])
>>> neighbor_min(a)
array([[0, 0, 1, 1, 5],[0, 0, 1, 1, 1],[2, 1, 1, 1, 1],[2, 1, 1, 0, 0],[2, 1, 1, 0, 0]])

And your original question can be solved as:

def bd_from_ac(a, c):return a*neighbor_min(a), neighbor_min(c)

As a performance benchmark:

In [2]: m, n = 1000, 1000In [3]: a = np.random.rand(m, n)In [4]: c = np.random.rand(m, n)In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 123 ms per loop
https://en.xdnf.cn/q/72405.html

Related Q&A

How to see all the databases and Tables in Databricks

i want to list all the tables in every database in Azure Databricks. so i want the output to look somewhat like this: Database | Table_name Database1 | Table_1 Database1 | Table_2 Database1 | Table_3 D…

How to get transparent background in window with PyGTK and PyCairo?

Ive been trying really hard to create a window with no decoration and a transparent background using PyGTK. I would then draw the content of the window with Cairo. But I cant get it to work.Ive tried a…

concurrent.futures.ThreadPoolExecutor doesnt print errors

I am trying to use concurrent.futures.ThreadPoolExecutor module to run a class method in parallel, the simplified version of my code is pretty much the following: class TestClass:def __init__(self, sec…

How to write a Dictionary to Excel in Python

I have the following dictionary in python that represents a From - To Distance Matrix.graph = {A:{A:0,B:6,C:INF,D:6,E:7},B:{A:INF,B:0,C:5,D:INF,E:INF},C:{A:INF,B:INF,C:0,D:9,E:3},D:{A:INF,B:INF,C:9,D:0…

How can I check pooled connections in SQLAlchemy before handing them off to my application code?

We have a slightly unreliable database server, for various reasons, and as a consequence sometimes the database connections used by my application vanish out from under it. The connections are SQLAlch…

pandas list of dictionary to separate columns

I have a data set like below:name status number message matt active 12345 [job: , money: none, wife: none] james active 23456 [group: band, wife: yes, money: 10000] adam in…

Where is console input history stored on Python for Windows?

Good afternoon,The QuestionIs there a particular spot that the entries are stored, or is it just a local set of stored variables, for the windows version of Python?The ContextI am curious about where …

Matplotlib Animation: how to dynamically extend x limits?

I have a simple animation plot like so: import numpy as np from matplotlib import pyplot as plt from matplotlib import animation# First set up the figure, the axis, and the plot element we want to anim…

How to get round the HTTP Error 403: Forbidden with urllib.request using Python 3

Hi not every time but sometimes when trying to gain access to the LSE code I am thrown the every annoying HTTP Error 403: Forbidden message.Anyone know how I can overcome this issue only using standard…

Installing lxml in virtualenv for windows

Ive recently started using virtualenv, and would like to install lxml in this isolated environment.Normally I would use the windows binary installer, but I want to use lxml in this virtualenv (not glob…