ValueError: cannot reindex from a duplicate axis in groupby Pandas

2024/9/22 11:41:19

My dataframe looks like this:

    SKU #  GRP      CATG  PRD
0   54995  9404000  4040  99999
1   54999  9404000  4040  99999
2   55037  9404000  4040  1556894
3   55148  9404000  4040  1556894
4   55254  9404000  4040  1556894
5   55291  9404000  4040  1556894
6   55294  9404000  4040  1556895
7   55445  9404000  4040  1556895
8   55807  9404001  4040  1556896
9   49021  9404002  4040  1556897
10  49035  9404002  4040  1556897
11  27538  9404000  4040  1556898
12  27539  9404000  4040  1556899
13  27540  9404000  4040  1556894
14  27542  9404000  4040  1556900
15  27543  9404000  4040  1556900
16  27544  9404003  4040  1556901
17  27546  9404004  4040  1556902
18  99111  9404005  4040  1556903
19  99112  9404006  4040  1556904
20  99113  9404007  4040  1556905
21  99116  9404008  4040  1556906
22  99119  9404009  4040  1556907
23  99122  94040010 4040  1556908
24  99125  94040011 4040  1556909
25  86007  94040012 4040  1556910
26  86010  94040013 4040  1556911 

And when I try to perform a group by operation on the above dataframe, I get the "cannot reindex from a duplicate axis" error.

df.groupby(['GRP','CATG'],as_index=False)['PRD'].min()

I tried to find out the duplicate indices using:

df[df.index.duplicated()]

But it didn't return anything. How can I resolve this issue?

Answer

This error is often caused by duplicated column names (the labels themselves, not necessarily the values). Note that df.index.duplicated() only checks the row index, so it will not catch duplicated columns.

First, check whether any column name is duplicated: df.columns.duplicated().any()

If it returns True, remove the duplicated columns:

# keeps the first occurrence of each duplicated column name
df = df.loc[:, ~df.columns.duplicated()]

After you remove the duplicated columns, you should be able to run your groupby operation.
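
Here is a minimal sketch of the whole check-and-fix sequence. It assumes a small made-up frame in which the PRD column appears twice; the data and the variable names (has_dupes, dupe_names, deduped, result) are only for illustration and are not taken from the question. Whether the original groupby actually raises on such a frame, and the exact wording of the message, can differ between pandas versions, so the sketch only shows the detection and the cleanup.

```python
import pandas as pd

# Hypothetical frame: the label "PRD" is used for two columns.
df = pd.DataFrame(
    [
        [54995, 9404000, 4040, 99999, 99999],
        [55037, 9404000, 4040, 1556894, 1556894],
        [55807, 9404001, 4040, 1556896, 1556896],
    ],
    columns=["SKU #", "GRP", "CATG", "PRD", "PRD"],
)

# Detect duplicated column labels and list which ones repeat.
has_dupes = df.columns.duplicated().any()
dupe_names = df.columns[df.columns.duplicated()].tolist()

# Keep only the first occurrence of each column label.
deduped = df.loc[:, ~df.columns.duplicated()]

# The groupby from the question now operates on unique column labels.
result = deduped.groupby(["GRP", "CATG"], as_index=False)["PRD"].min()
print(result)
```

Listing dupe_names before dropping anything is worth doing, because the duplicated columns may hold different values and you may prefer to rename one of them instead of discarding it.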

https://en.xdnf.cn/q/71950.html
