Code Sample, a copy-pastable example if possible
I have not found an easy way to access a column of a multi index by name. Consider the following code:
import pandas as pd
df = pd.DataFrame([[1,1,10,200],
[1,2,11,201],
[1,3,12,202],
[2,1,13,210],
[2,2,14,230]],
columns=list('ABCD')).set_index(['A','B'])
ii = df.index
To access the values of column B, I can do, e.g.:
df.reset_index('B').B.tolist()
or
[v[1] for v in ii.tolist()]
Both of these are somewhat cumbersome. What I'm proposing is to add shortcut access to a MultiIndex, similar to what a DataFrame offers, e.g.:
ii.B
or
ii['B']
that would return the same list as in the two examples above.
output of pd.show_versions()
0.18.1
Comment From: jorisvandenbossche
Related issue / possibly duplicate: #10816
Comment From: jreback
This is the idiom, which is actually not bad.
In [3]: df.index.get_level_values('B')
Out[3]: Int64Index([1, 2, 3, 1, 2], dtype='int64', name=u'B')
Returning a list is non-performant / not-idiomatic.
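For reference, a minimal sketch of that idiom applied to the frame from the original post. It assumes a reasonably recent pandas, so the printed repr of the result may differ from the Int64Index output shown above. The level can be selected by name or by position, and .tolist() is available if a plain list is really needed:

```python
import pandas as pd

# Same frame as in the original post, indexed by columns A and B.
df = pd.DataFrame(
    [[1, 1, 10, 200],
     [1, 2, 11, 201],
     [1, 3, 12, 202],
     [2, 1, 13, 210],
     [2, 2, 14, 230]],
    columns=list("ABCD"),
).set_index(["A", "B"])

# Select the level by name; the result is an Index, not a list.
by_name = df.index.get_level_values("B")

# The same level selected by its position in the MultiIndex.
by_position = df.index.get_level_values(1)

# Convert to a plain Python list only when one is actually required.
as_list = by_name.tolist()
print(as_list)
```

Keeping the result as an Index preserves the name and dtype, which is presumably why returning a list is considered non-idiomatic here.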
Comment From: dov
Nice! I missed that idiom.
But, still, is there any reason that:
df.index.B
is not a shortcut for df.index.get_level_values('B')
? What makes the dataframe special?
Comment From: jreback
So doing df.index.B
looks simple, and generally it is; however, some edge cases make it untenable:
df.index['B']
needs to work the same way for a smooth user experience. However, this conflicts directly with positional access, both on a regular single-level Index
and on a MultiIndex:
In [7]: df.index[0]
Out[7]: (1, 1)
In [8]: df.index[[True]*len(df.index)]
Out[8]:
MultiIndex(levels=[[1, 2], [1, 2, 3]],
labels=[[0, 0, 0, 1, 1], [0, 1, 2, 0, 1]],
names=[u'A', u'B'])
You might think that is ok, until you realize that 0 also could mean level=0. Then what do you do?
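A small sketch of the two meanings an integer already has, using the frame from the original post (rebuilt here so the snippet runs on its own). Square brackets on the index are positional or boolean selection over rows, while a level has to be requested explicitly through get_level_values:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 10, 200],
     [1, 2, 11, 201],
     [1, 3, 12, 202],
     [2, 1, 13, 210],
     [2, 2, 14, 230]],
    columns=list("ABCD"),
).set_index(["A", "B"])

# Positional access: 0 means "the first entry", which is a tuple of labels.
first_entry = df.index[0]

# Boolean-mask access: returns a MultiIndex containing the selected entries.
all_entries = df.index[[True] * len(df.index)]

# Level access: here 0 means "the first level" (named 'A').
first_level = df.index.get_level_values(0)

print(first_entry)
print(first_level.tolist())
```

If df.index[0] were also allowed to mean "level 0", the first and last expressions would compete for the same syntax, which is the conflict described above.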