Display sparse multi-index of DataFrame in pandas


In [1]: 
import pandas as pd
from io import StringIO 

data = """
code\tname\ttyp\tntf\n
A5411\tWD\tAF\t\n
A5411\tWD\tAF\t210194618\n
B5498\tSH\tNC\t\n
B5498\tSH\tNC\t210213014\n
"""
df = pd.read_table(StringIO(data))

In [2]: df.set_index(['name','code'])
Out[2]:
           typ          ntf
name code
WD   A5411  AF          NaN
     A5411  AF  210194618.0
SH   B5498  NC          NaN
     B5498  NC  210213014.0

Expected Output

I expect the output of In [2] to look something like Out[3]:

In [3]: df.set_index(['name', 'code', 'typ'])
Out[3]:
                        ntf
name code  typ
WD   A5411 AF           NaN
           AF   210194618.0
SH   B5498 NC           NaN
           NC   210213014.0

Problem description

Not all index levels are displayed in sparse style (where a repeated value is shown only once for a run of rows). In Out[2], the `code` level is expected to be displayed that way as well.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.14.0
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Comment From: jreback

The innermost level is not sparsified, as it is almost always unique. Generally, having a non-unique MultiIndex is not performant and not recommended (though it does work).

In [11]: df.assign(r=[0,1, 0, 1]).set_index(['name', 'code', 'r'])
Out[11]: 
             typ          ntf
name code  r                 
WD   A5411 0  AF          NaN
           1  AF  210194618.0
SH   B5498 0  NC          NaN
           1  NC  210213014.0
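
As a side note to the example above, the hard-coded `r=[0, 1, 0, 1]` can probably be derived from the data instead. The following is a minimal sketch, assuming the frame from In [1] is rebuilt inline (so the snippet is self-contained) and that a per-group row counter from `groupby(...).cumcount()` is an acceptable innermost level. The name `r` is taken from the example; `indexed` is just an illustrative variable name.

```python
import pandas as pd

# Same rows as the frame in In [1], built inline so the snippet runs on its own.
df = pd.DataFrame({
    "code": ["A5411", "A5411", "B5498", "B5498"],
    "name": ["WD", "WD", "SH", "SH"],
    "typ": ["AF", "AF", "NC", "NC"],
    "ntf": [float("nan"), 210194618.0, float("nan"), 210213014.0],
})

# With only ('name', 'code') as the index, the MultiIndex contains duplicates.
print(df.set_index(["name", "code"]).index.is_unique)  # False

# Number the rows within each (name, code) group: 0, 1, 0, 1.
r = df.groupby(["name", "code"]).cumcount()

# Appending the counter as the innermost level makes the index unique,
# so 'name' and 'code' are both outer levels and get sparsified on display.
indexed = df.assign(r=r).set_index(["name", "code", "r"])
print(indexed.index.is_unique)  # True
print(indexed)
```

This keeps the MultiIndex unique, which matches the recommendation above, without having to know the group sizes in advance.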

Comment From: hastelloy

@jreback thanks, it does make sense...
