Removing columns that contain only NaN values from a NumPy array

2024/9/27 9:26:05

I have a NumPy matrix like the one below:

[[182 93 107 ..., nan nan -1]
 [182 93 107 ..., nan nan -1]
 [182 93 110 ..., nan nan -1]
 ...,
 [188 95 112 ..., nan nan -1]
 [188 97 115 ..., nan nan -1]
 [188 95 112 ..., nan nan -1]]

I want to remove the columns that contain only NaN values from the above matrix.

How can I do this? Thanks.

Answer

Assuming your array has a float dtype, you can build a boolean mask of the columns that are entirely NaN and use boolean (fancy) indexing to keep the others:

>>> d
array([[ 182.,   93.,  107.,   nan,   nan,   -1.],
       [ 182.,   93.,  107.,    4.,   nan,   -1.],
       [ 182.,   93.,  110.,   nan,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   nan,   -1.],
       [ 188.,   97.,  115.,   nan,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   nan,   -1.]])
>>> d[:, ~np.all(np.isnan(d), axis=0)]
array([[ 182.,   93.,  107.,   nan,   -1.],
       [ 182.,   93.,  107.,    4.,   -1.],
       [ 182.,   93.,  110.,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   -1.],
       [ 188.,   97.,  115.,   nan,   -1.],
       [ 188.,   95.,  112.,   nan,   -1.]])
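
The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name drop_all_nan_columns is made up for illustration, and the conversion to float is an assumption to cover arrays that arrive with an integer or numeric object dtype, since np.isnan raises a TypeError on object arrays.

```python
import numpy as np

def drop_all_nan_columns(a):
    """Return a copy of 2-D array `a` without the columns that are entirely NaN.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # Assumption: the values can be converted to float.
    a = np.asarray(a, dtype=float)
    # One boolean per column: True where every entry in that column is NaN.
    all_nan = np.isnan(a).all(axis=0)
    # Boolean indexing on the column axis keeps the columns where the mask is False.
    return a[:, ~all_nan]

d = np.array([
    [182, 93, 107, np.nan, np.nan, -1],
    [182, 93, 107, 4,      np.nan, -1],
    [188, 95, 112, np.nan, np.nan, -1],
])

result = drop_all_nan_columns(d)
print(result.shape)  # (3, 5)
```

Only the column at index 4 is dropped. The column at index 3 is kept because it holds one non-NaN value; replacing .all(axis=0) with .any(axis=0) would instead drop every column that contains at least one NaN.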
