Categorical dtype changes after using melt

2024/4/15 2:20:10

In answering this question, I found that after using melt on a pandas dataframe, a column that previously had an ordered Categorical dtype ends up with object dtype. Is this intended behaviour?

Note: I'm not looking for a solution, just wondering whether there is a reason for this behaviour or whether it is unintended.


Using the following dataframe df:

  Cat  L_1  L_2  L_3
0   A    1    2    3
1   B    4    5    6
2   C    7    8    9

df['Cat'] = pd.Categorical(df['Cat'], categories=['C','A','B'], ordered=True)

# As you can see `Cat` is a category
>>> df.dtypes
Cat    category
L_1       int64
L_2       int64
L_3       int64
dtype: object

melted = df.melt('Cat')

>>> melted
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9

Now, if I look at Cat, it's become an object:

>>> melted.dtypes
Cat         object
variable    object
value        int64
dtype: object
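
For anyone who wants to reproduce this, here is a self-contained sketch of the steps above. The DataFrame construction is an assumption on my part, rebuilt from the printed table, since the original snippet starts from an existing df. The dtype reported for Cat after melt depends on the installed pandas version, so the script prints the version alongside the dtypes rather than assuming a result.

```python
import pandas as pd

# Assumed construction of the frame printed above (not shown in the question).
df = pd.DataFrame({
    "Cat": ["A", "B", "C"],
    "L_1": [1, 4, 7],
    "L_2": [2, 5, 8],
    "L_3": [3, 6, 9],
})

# Make `Cat` an ordered categorical with a custom category order.
df["Cat"] = pd.Categorical(df["Cat"], categories=["C", "A", "B"], ordered=True)

# Melt with `Cat` as the only id column.
melted = df.melt("Cat")

# The dtype of `Cat` in the result is version-dependent, so print both.
print(pd.__version__)
print(df.dtypes)
print(melted.dtypes)
```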

Is this intended?


In the source code of pandas 0.22.0 (my old version), melt does this:

for col in id_vars:
    mdata[col] = np.tile(frame.pop(col).values, K)

mcolumns = id_vars + var_name + [value_name]

np.tile works on plain NumPy arrays, so the categorical values are converted to an array and the column comes back with object dtype.
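
A minimal sketch of that effect in isolation, outside of melt. The names cat, K and tiled are just for illustration, and the explicit np.asarray call is my addition to make the conversion visible; it is not part of the pandas source.

```python
import numpy as np
import pandas as pd

# An ordered categorical like the `Cat` column in the question.
cat = pd.Categorical(["A", "B", "C"], categories=["C", "A", "B"], ordered=True)

K = 3  # number of repetitions, standing in for the number of melted columns

# Converting the categorical to a NumPy array drops the category
# information; np.tile then simply repeats that plain array.
tiled = np.tile(np.asarray(cat), K)

print(tiled.dtype)
print(pd.Series(tiled).dtype)
```

The tiled result is an ordinary NumPy array of Python strings, so a Series built from it has no way to recover the categories or their order.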

It is fixed in 0.23.4 (the version I have after updating pandas):

Out[6]:
  Cat variable  value
0   A      L_1      1
1   B      L_1      4
2   C      L_1      7
3   A      L_2      2
4   B      L_2      5
5   C      L_2      8
6   A      L_3      3
7   B      L_3      6
8   C      L_3      9

>>> melted.dtypes
Cat         category
variable      object
value          int64
dtype: object

More detail on how it was fixed:

for col in id_vars:
    id_data = frame.pop(col)
    if is_extension_type(id_data):
        # returns True for a categorical column, so concat is used instead of np.tile
        id_data = concat([id_data] * K, ignore_index=True)
    else:
        id_data = np.tile(id_data.values, K)
    mdata[col] = id_data
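
To see why the concat branch keeps the dtype, here is a small sketch using only the public pd.concat. The names s, K and repeated are illustrative and not taken from the pandas source.

```python
import pandas as pd

# An ordered categorical Series like the `Cat` column in the question.
s = pd.Series(
    pd.Categorical(["A", "B", "C"], categories=["C", "A", "B"], ordered=True)
)

K = 3  # number of repetitions, standing in for the number of melted columns

# Concatenating K copies of the same categorical Series keeps the
# categorical dtype, because every piece has identical categories
# and the same ordering.
repeated = pd.concat([s] * K, ignore_index=True)

print(repeated.dtype)
print(repeated.cat.categories.tolist())
print(repeated.cat.ordered)
```

The repetition happens at the pandas level, so the categories and their order survive, which is what the np.tile path could not do.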

