Numpy Array Set Difference [duplicate]

2024/10/12 15:33:15

I have two numpy arrays that have overlapping rows:

import numpy as np

a = np.array([[1, 2], [1, 5], [3, 4], [3, 5], [4, 1], [4, 6]])
b = np.array([[1, 5], [3, 4], [4, 6]])

You can assume that:

  1. the rows are sorted
  2. the rows within each array are unique
  3. array b is always a subset of array a

I would like to get an array that contains all rows of a that are not in b.

i.e.:

[[1 2]
 [3 5]
 [4 1]]

Considering that a and b can be very, very large, what is the most efficient method for solving this problem?

Answer

Here's a possible solution to your problem:

import numpy as np

a = np.array([[1, 2], [3, 4], [3, 5], [4, 1], [4, 6]])
b = np.array([[3, 4], [4, 6]])

# View each row as one structured element so setdiff1d can compare whole rows
a1_rows = a.view([('', a.dtype)] * a.shape[1])
a2_rows = b.view([('', b.dtype)] * b.shape[1])
c = np.setdiff1d(a1_rows, a2_rows).view(a.dtype).reshape(-1, a.shape[1])
print(c)

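I think using numpy.setdiff1d is the right choice here. It operates on 1-D arrays, so each row is first viewed as a single structured element, and the result is then viewed back as a 2-D array of the original dtype.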
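As a minimal sketch, the same idea can be wrapped in a helper and applied to the arrays from the question. The name row_setdiff is hypothetical, not a NumPy function. The np.ascontiguousarray calls are an added assumption: the structured view needs each row to be contiguous in memory, and b is cast to the dtype of a so both views have the same layout.

```python
import numpy as np

def row_setdiff(a, b):
    """Return the rows of 2-D array `a` that do not appear in 2-D array `b`.

    Hypothetical helper, not part of NumPy.
    """
    # Assumption: copy to contiguous memory if needed, and give b the dtype of a
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b, dtype=a.dtype)
    ncols = a.shape[1]
    # One structured element per row, with one field per column
    row_dtype = [('', a.dtype)] * ncols
    a_rows = a.view(row_dtype).ravel()
    b_rows = b.view(row_dtype).ravel()
    # setdiff1d returns the sorted, unique elements of a_rows not in b_rows
    diff = np.setdiff1d(a_rows, b_rows)
    # Turn the structured elements back into plain rows
    return diff.view(a.dtype).reshape(-1, ncols)

a = np.array([[1, 2], [1, 5], [3, 4], [3, 5], [4, 1], [4, 6]])
b = np.array([[1, 5], [3, 4], [4, 6]])
result = row_setdiff(a, b)
print(result)
```

Because np.setdiff1d returns sorted, unique values, the result comes back sorted row by row and without duplicates, which matches the assumptions stated in the question.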

