  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas as pd

# Four identical integer columns; x, y and z are then cast to category.
values = list(range(100))
df1 = pd.DataFrame({'x': values, 'y': values, 'z': values, 'd': values})

df2 = df1.astype({"x": 'category', "y": 'category', "z": 'category'})

# df3 is indexed by (z, x); df4 is indexed by (z, x, y).
df3 = df2.iloc[:10, :].groupby(['z', 'x'], observed=True).agg({'d': 'sum'})
df4 = df2.iloc[90:, :].groupby(['z', 'x', 'y'], observed=True).agg({'d': 'sum'})

# Outer merge on the indexes, then inspect the dtype of each index level.
pd.merge(df3, df4, left_index=True, right_index=True,
         how='outer').index.to_frame().dtypes

Problem description

I have two data frames, df3 and df4. The index of df3 is (z, x) and the index of df4 is (z, x, y). All of x, y and z are categorical. Because df3 and df4 both come from a groupby on the same data frame df2, index level x has the same category dtype in df3 as in df4, and the same holds for index level z. I would therefore expect index levels x and z to stay categorical after the merge. They do not; this is what I get:

z       int64
x       int64
y    category
dtype: object

Expected Output

z    category
x    category
y    category
dtype: object

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : f2ca0a2665b2d169c97de87b8e778dbed86aea07
python           : 3.8.5.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.18362
machine          : AMD64
processor        : Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United States.1252

pandas           : 1.1.1
numpy            : 1.19.0
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.2.3
setuptools       : 47.1.0
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.1
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.18.1
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.2.2
numexpr          : None
odfpy            : None
openpyxl         : 3.0.3
pandas_gbq       : None
pyarrow          : None
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : 1.5.0
sqlalchemy       : 1.3.18
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None

Comment From: arw2019

This seems like a duplicate of #25412

Would you be interested in submitting a PR? Reading through #25412, there is interest in getting this to work, but no one has gotten to it yet.

Comment From: jbrockmendel

I'm seeing category-dtype results on main. This could use a test (or someone should determine whether one already exists).
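A rough sketch of what such a regression test might look like, built directly from the code sample in the report. The helper names (`build_frames`, `index_level_dtypes`) and the test name are hypothetical, not existing pandas test utilities, and the assertion simply encodes the "Expected Output" from the report; whether it passes depends on the pandas version in use.

```python
import pandas as pd


def build_frames():
    """Recreate df3 and df4 from the report (hypothetical helper)."""
    values = list(range(100))
    df1 = pd.DataFrame({"x": values, "y": values, "z": values, "d": values})
    df2 = df1.astype({"x": "category", "y": "category", "z": "category"})
    # df3 is indexed by (z, x); df4 is indexed by (z, x, y).
    df3 = df2.iloc[:10, :].groupby(["z", "x"], observed=True).agg({"d": "sum"})
    df4 = df2.iloc[90:, :].groupby(["z", "x", "y"], observed=True).agg({"d": "sum"})
    return df3, df4


def index_level_dtypes(df):
    """Map each index level name to its dtype (hypothetical helper)."""
    return df.index.to_frame().dtypes.to_dict()


def test_outer_merge_keeps_categorical_index_levels():
    # Assumption: every index level should remain categorical after the
    # outer merge, as stated under "Expected Output" in the report.
    df3, df4 = build_frames()
    merged = pd.merge(df3, df4, left_index=True, right_index=True, how="outer")
    for name, dtype in index_level_dtypes(merged).items():
        assert isinstance(dtype, pd.CategoricalDtype), (name, dtype)
```

In the pandas test suite this would probably compare against an explicitly constructed expected index rather than loop over dtypes; the loop is only meant to keep the sketch short.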

Comment From: agodinez01

take