I'm trying to trim a very big `DataFrame` with a `MultiIndex` using `head`, and I noticed that the new `DataFrame` was still ridiculously big because it kept all the data from the entire original `MultiIndex`. This doesn't happen if the `DataFrame` has a regular `Index`.
```python
In [1]: from pandas import DataFrame

In [2]: df = DataFrame({'a': [1, 2, 3, 4, 5, 6], 'b': ['q', 'w', 'e', 'r', 't', 'y'], 'c': ['a', 's', 'd', 'f', 'g', 'h']}).set_index(['a', 'b'])

In [3]: df.head(3)
Out[3]:
     c
a b
1 q  a
2 w  s
3 e  d

In [4]: df.head(3).index
Out[4]:
MultiIndex(levels=[[1, 2, 3, 4, 5, 6], [u'e', u'q', u'r', u't', u'w', u'y']],
           labels=[[0, 1, 2], [1, 4, 0]],
           names=[u'a', u'b'])
```
#### Expected Output
```python
MultiIndex(levels=[[1, 2, 3], [u'e', u'q', u'w']],
           labels=[[0, 1, 2], [1, 2, 0]],
           names=[u'a', u'b'])
```
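For reference, a minimal sketch (not part of the original report) of one way to obtain the trimmed levels shown above on pandas 0.18.1: rebuild the `MultiIndex` from the tuples that remain after `head`.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': ['q', 'w', 'e', 'r', 't', 'y'],
                   'c': ['a', 's', 'd', 'f', 'g', 'h']}).set_index(['a', 'b'])

small = df.head(3)
# Rebuilding the index from the tuples that remain drops the unused level values.
small.index = pd.MultiIndex.from_tuples(list(small.index), names=small.index.names)
print(small.index.levels)  # only the values still referenced by the three rows
```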
#### Output of `pd.show_versions()`

```
## INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 25.1.1
Cython: None
numpy: 1.11.1
scipy: 0.18.0
statsmodels: None
xarray: None
IPython: 5.0.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
```
Comment From: jreback
Duplicate of this: https://github.com/pandas-dev/pandas/issues/11724

Indexing a MultiIndex does not delete unused levels on purpose: it is a bit expensive to reconstruct them, and it is not clear when to do that. The levels themselves are an implementation detail (checking them is peering into the MultiIndex internals).
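A minimal workaround sketch for later readers, assuming a pandas version (0.20.0 or newer) that provides `MultiIndex.remove_unused_levels()`; on older versions such as the 0.18.1 reported above, rebuilding the index via `MultiIndex.from_tuples` as sketched earlier applies instead.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': ['q', 'w', 'e', 'r', 't', 'y'],
                   'c': ['a', 's', 'd', 'f', 'g', 'h']}).set_index(['a', 'b'])

small = df.head(3)
# Explicitly drop level values that no remaining row refers to.
small.index = small.index.remove_unused_levels()
print(small.index.levels)
```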