I want to be able to save my array subclass to an .npy file and recover the result later.
Something like:
>>> import numpy as np
>>> class MyArray(np.ndarray): pass
>>> data = np.arange(10).view(MyArray)
>>> np.save('fname', data)  # writes fname.npy
>>> data2 = np.load('fname.npy')
>>> assert isinstance(data2, MyArray) # raises AssertionError
The docs say (emphasis mine):
The format explicitly does not need to:
- [...]
- Fully handle arbitrary subclasses of numpy.ndarray. Subclasses will be accepted for writing, but only the array data will be written out. A regular numpy.ndarray object will be created upon reading the file. The API can be used to build a format for a particular subclass, but that is out of scope for the general NPY format.
So is it possible to make the above code not raise an AssertionError?
I don't see evidence that np.save handles array subclasses.
I tried to save a np.matrix with it, and got back an ndarray.
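The same thing can be seen without np.matrix by round-tripping a trivial subclass through an in-memory buffer. This is only a sketch: MyArray and the BytesIO buffer are stand-ins I picked for illustration, not anything from the question's real code.

```python
import io
import numpy as np

class MyArray(np.ndarray):
    """Trivial ndarray subclass, used only for this demonstration."""

# view() reinterprets an existing array as the subclass
data = np.arange(10).view(MyArray)

buf = io.BytesIO()
np.save(buf, data)      # only the dtype, shape and data buffer are written
buf.seek(0)
loaded = np.load(buf)   # comes back as a plain ndarray

print(type(data).__name__)
print(type(loaded).__name__)
```

The values survive the round trip; only the class is lost.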
I tried to save a np.ma array, and got an error:
NotImplementedError: MaskedArray.tofile() not implemented yet.
Saving is done by np.lib.npyio.format.write_array, which does:
- _write_array_header() # save dtype, shape etc
- if dtype is object, it uses pickle.dump(array, fp ...)
- otherwise it does array.tofile(fp); tofile handles writing the data buffer.
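The two branches can be observed from the outside. In this rough sketch (save_to_buffer is a hypothetical helper, and the sample arrays are made up), a numeric array loads back with pickling disabled, while an object array can only be loaded with allow_pickle=True, which suggests its payload went through pickle.

```python
import io
import numpy as np

def save_to_buffer(arr):
    """Hypothetical helper: np.save into memory and rewind the buffer."""
    buf = io.BytesIO()
    np.save(buf, arr)
    buf.seek(0)
    return buf

numeric = np.arange(5)

# Build the object array element by element to avoid shape guessing
objects = np.empty(2, dtype=object)
objects[0] = {"a": 1}
objects[1] = "text"

# Raw-buffer branch: no pickle needed to read it back
numeric_back = np.load(save_to_buffer(numeric), allow_pickle=False)

# Pickle branch: loading is refused unless pickling is allowed
try:
    np.load(save_to_buffer(objects), allow_pickle=False)
    needs_pickle = False
except ValueError:
    needs_pickle = True

objects_back = np.load(save_to_buffer(objects), allow_pickle=True)
print(needs_pickle)
```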
I think pickle.dump of an array ends up using np.save, but I don't recall how that's triggered.
I can for example pickle an array, and load it:
In [657]: f=open('test','wb')
In [658]: pickle.Pickler(f).dump(x)
In [659]: f.close()
In [660]: np.load('test', allow_pickle=True)  # needed on NumPy >= 1.16.3, where allow_pickle defaults to False
In [664]: f=open('test','rb')
In [665]: pickle.load(f)
This pickle dump/load sequence works for test np.ma, np.matrix and sparse.coo_matrix cases. So that's probably the direction to explore for your own subclass.
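Applied to the subclass from the question, the pickle route might look like the sketch below. The temporary directory and the data.pkl file name are my own choices for illustration. As far as I can tell, this works because the array's pickle state records the class, so the class has to be importable under the same name when the file is loaded.

```python
import os
import pickle
import tempfile
import numpy as np

class MyArray(np.ndarray):
    """Same trivial subclass as in the question."""

data = np.arange(10).view(MyArray)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.pkl")  # illustrative file name

    with open(path, "wb") as f:
        pickle.dump(data, f)

    with open(path, "rb") as f:
        data2 = pickle.load(f)

# Unlike the np.save/np.load round trip, the subclass is preserved
assert isinstance(data2, MyArray)
```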
Searching on numpy and pickle, I found "Preserve custom attributes when pickling subclass of numpy array". The answer involves a custom .__reduce__ and .__setstate__.
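A minimal sketch of that pattern, assuming a hypothetical InfoArray subclass with a single extra attribute called info (both names are mine, not from the linked answer): __reduce__ appends the attribute to the state tuple that ndarray already produces, and __setstate__ peels it off again before handing the rest back to ndarray.

```python
import pickle
import numpy as np

class InfoArray(np.ndarray):
    """Hypothetical subclass carrying one custom attribute, `info`."""

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; copy the attribute across if present
        if obj is None:
            return
        self.info = getattr(obj, "info", None)

    def __reduce__(self):
        # ndarray.__reduce__ returns (reconstructor, args, state)
        reconstructor, args, state = super().__reduce__()
        # Append the custom attribute to the end of the state tuple
        return (reconstructor, args, state + (self.info,))

    def __setstate__(self, state):
        # Last item is ours; the rest belongs to ndarray
        self.info = state[-1]
        super().__setstate__(state[:-1])

original = InfoArray(np.arange(4), info="metres")
restored = pickle.loads(pickle.dumps(original))

assert isinstance(restored, InfoArray)
assert restored.info == "metres"
```

Without the two overrides the class would still survive pickling, but info would not.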