Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd

# Both frames share the same index and the same three-level MultiIndex columns.
index = list(range(8))
columns = pd.MultiIndex.from_product([['alpha', 'beta'], ['C', 'D'], [1, 2]], names=['first', 'second', 'third'])
df1 = pd.DataFrame(index=index, columns=columns, data=np.random.rand(8, 8))
df2 = pd.DataFrame(index=index, columns=columns, data=np.random.rand(8, 8))

# Merging on the index with indicator=True raises the AttributeError described below.
pd.merge(df1, df2, left_index=True, right_index=True, indicator=True)
This raises AttributeError: Can only use .cat accessor with a 'category' dtype. If the columns have only two levels, the error is raised in a different place: AttributeError: 'numpy.ndarray' object has no attribute 'start'
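One possible workaround, sketched here under the assumption that losing the column hierarchy is acceptable, is to flatten the MultiIndex columns into plain strings before merging. The helper name `flatten_columns`, the `_` separator, and the `_left`/`_right` suffixes are illustrative choices, not part of pandas or of the original report.

```python
import numpy as np
import pandas as pd

index = list(range(8))
columns = pd.MultiIndex.from_product(
    [["alpha", "beta"], ["C", "D"], [1, 2]],
    names=["first", "second", "third"],
)
df1 = pd.DataFrame(np.random.rand(8, 8), index=index, columns=columns)
df2 = pd.DataFrame(np.random.rand(8, 8), index=index, columns=columns)


def flatten_columns(df, sep="_"):
    """Hypothetical helper: join each column tuple into one flat string label."""
    flat = df.copy()
    flat.columns = [sep.join(str(level) for level in col) for col in df.columns]
    return flat


# With flat string columns, the indicator column is an ordinary "_merge" column.
merged = pd.merge(
    flatten_columns(df1),
    flatten_columns(df2),
    left_index=True,
    right_index=True,
    suffixes=("_left", "_right"),  # both frames share every column name
    indicator=True,
)
```

Because both frames use the same index, every row of `merged["_merge"]` is labelled `both`. The trade-off is that the level names (`first`, `second`, `third`) are lost and would have to be rebuilt by hand afterwards.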
Expected Output
I think this should put "_merge" in the first level and nulls in the other levels, but I am not sure. Alternatively, it may be fine not to support this, as long as the error message is clearer (for example, "merge indicator not supported with MultiIndex columns").
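To make the proposed layout concrete, here is a minimal sketch that builds the indicator by hand from index membership and stores it under a three-level column key. It does not show what pandas does internally. The helper `index_merge_indicator` is hypothetical, the two small frames with partially overlapping indexes are invented for illustration, and empty strings stand in for the "null" lower levels, which is an assumption about how the placeholder could be represented.

```python
import numpy as np
import pandas as pd


def index_merge_indicator(left, right):
    """Hypothetical helper: label each row of the outer-joined index."""
    union = left.index.union(right.index)
    in_left = union.isin(left.index)
    in_right = union.isin(right.index)
    labels = np.where(
        in_left & in_right,
        "both",
        np.where(in_left, "left_only", "right_only"),
    )
    return pd.Series(
        pd.Categorical(labels, categories=["left_only", "right_only", "both"]),
        index=union,
        name="_merge",
    )


columns = pd.MultiIndex.from_product(
    [["alpha"], ["C", "D"], [1, 2]],
    names=["first", "second", "third"],
)
# Illustrative frames: rows 2 and 3 appear in both, the rest in only one.
left = pd.DataFrame(np.random.rand(4, 4), index=[0, 1, 2, 3], columns=columns)
right = pd.DataFrame(np.random.rand(4, 4), index=[2, 3, 4, 5], columns=columns)

indicator = index_merge_indicator(left, right)

# Attach the indicator with "_merge" in the first level and empty strings
# (assumed placeholder) in the remaining levels.
result = left.reindex(indicator.index)
result[("_merge", "", "")] = indicator
```

Here `result` holds only the left frame's values, reindexed to the union of both indexes, so it shows where the label would live rather than a full merge result.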
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.1
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.4.4
matplotlib: 1.5.1
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.4.4
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: 2.8
Comment From: mroeschke
Thanks for the suggestion, but given the lack of community or core team interest, it seems there isn't much appetite for this. Closing.