How can I make np.save work for an ndarray subclass?

2024/10/7 20:35:58

I want to be able to save my array subclass to a npy file, and recover the result later.

Something like:

>>> import numpy as np
>>> class MyArray(np.ndarray): pass
>>> data = np.arange(10).view(MyArray)
>>> np.save('fname.npy', data)
>>> data2 = np.load('fname.npy')
>>> assert isinstance(data2, MyArray)  # raises AssertionError

The docs say (emphasis mine):

The format explicitly does not need to:

  • [...]
  • Fully handle arbitrary subclasses of numpy.ndarray. Subclasses will be accepted for writing, but only the array data will be written out. A regular numpy.ndarray object will be created upon reading the file. The API can be used to build a format for a particular subclass, but that is out of scope for the general NPY format.

So is it possible to make the above code not raise an AssertionError?

Answer

I don't see evidence that np.save handles array subclasses.

I tried to save an np.matrix with it, and got back a plain ndarray.

I tried to save an np.ma masked array, and got an error:

NotImplementedError: MaskedArray.tofile() not implemented yet.

Saving is done by np.lib.format.write_array, which first calls

_write_array_header()   # save dtype, shape etc

If the dtype is object, it then uses pickle.dump(array, fp ...).

Otherwise, for a real file object, it calls array.tofile(fp). tofile writes only the data buffer, so the subclass type never reaches the file.
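A minimal sketch of what ends up in the file, assuming an in-memory io.BytesIO buffer in place of a real file and a throwaway MyArray subclass like the one in the question. The header records only shape, memory order and dtype, so read_array has nothing to rebuild the subclass from. Viewing the loaded data as the subclass afterwards is one possible workaround, and only when the subclass carries no extra attributes:

```python
import io

import numpy as np
from numpy.lib import format as npy_format


class MyArray(np.ndarray):
    """Throwaway subclass, as in the question."""


data = np.arange(10).view(MyArray)

# Write the array in NPY format (version 1.0) to an in-memory buffer.
buf = io.BytesIO()
npy_format.write_array(buf, data, version=(1, 0))

# Inspect the header: it records shape, memory order and dtype, nothing else.
buf.seek(0)
version = npy_format.read_magic(buf)
shape, fortran_order, dtype = npy_format.read_array_header_1_0(buf)

# Reading the whole buffer back yields a plain ndarray.
buf.seek(0)
loaded = npy_format.read_array(buf)

# Re-attach the subclass by viewing the loaded data (no copy is made).
restored = loaded.view(MyArray)

print(type(loaded).__name__, type(restored).__name__)
```

The view trick restores the type only. Any attributes the subclass stores on the instance would have to be saved separately.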

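pickle.dump of an array does not go through np.save; it uses the array's own __reduce__ method. The link runs the other way: np.load falls back to unpickling when the file does not start with the NPY magic string.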

I can for example pickle an array, and load it:

In [657]: f=open('test','wb')
In [658]: pickle.Pickler(f).dump(x)
In [659]: f.close()
In [660]: np.load('test', allow_pickle=True)   # newer NumPy refuses pickled files unless allow_pickle=True
In [664]: f=open('test','rb')
In [665]: pickle.load(f)

This pickle dump/load sequence works for my np.ma, np.matrix and sparse.coo_matrix test cases, so that is probably the direction to explore for your own subclass.
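As a rough illustration of that round trip, here is a sketch using a masked array, with a temporary directory and the file name test.pkl chosen only for the example. Both pickle.load and np.load (with allow_pickle=True) should give back a MaskedArray with its mask intact:

```python
import os
import pickle
import tempfile

import numpy as np

masked = np.ma.masked_array([1, 2, 3, 4], mask=[False, True, False, False])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test.pkl")  # hypothetical file name

    # Dump the masked array with pickle rather than np.save.
    with open(path, "wb") as f:
        pickle.dump(masked, f)

    # Load it back with pickle directly.
    with open(path, "rb") as f:
        via_pickle = pickle.load(f)

    # np.load can also read a pickle file, but only when explicitly allowed.
    via_np_load = np.load(path, allow_pickle=True)

print(type(via_pickle).__name__, type(via_np_load).__name__)
```

Keep in mind that unpickling runs arbitrary code, so this is only suitable for files you trust.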

Searching on numpy and pickle, I found "Preserve custom attributes when pickling subclass of numpy array". The answer there involves a custom __reduce__ and __setstate__.
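A sketch of that pattern, assuming a hypothetical LabeledArray subclass with a single extra attribute named label (neither name comes from the linked answer). The idea is to append the attribute to the state tuple that ndarray.__reduce__ produces, and to peel it off again in __setstate__ before handing the rest to ndarray:

```python
import pickle

import numpy as np


class LabeledArray(np.ndarray):
    """Hypothetical subclass carrying one extra attribute, `label`."""

    def __new__(cls, input_array, label=None):
        obj = np.asarray(input_array).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; copy the attribute along if present.
        if obj is None:
            return
        self.label = getattr(obj, "label", None)

    def __reduce__(self):
        # Take ndarray's (callable, args, state) and append our attribute.
        reconstruct, args, state = super().__reduce__()
        return (reconstruct, args, state + (self.label,))

    def __setstate__(self, state):
        # The last item is ours; the rest belongs to ndarray.
        self.label = state[-1]
        super().__setstate__(state[:-1])


original = LabeledArray(np.arange(5), label="speed")
restored = pickle.loads(pickle.dumps(original))

print(type(restored).__name__, restored.label)
```

The class has to be importable under the same module path when the file is unpickled, since pickle stores a reference to the class rather than its definition.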

https://en.xdnf.cn/q/70201.html
