from collections import OrderedDict
import random
import pandas as pd
pd.__version__
'0.20.3'
df = pd.DataFrame.from_dict(OrderedDict(
[
((lv1, lv2, lv3), {'KG': random.randint(0, 100)})
for lv1 in ['A', 'B']
for lv2 in ['M', 'N', 'O']
for lv3 in ['X', 'Y', 'Z']
]
))
df
A | B | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
M | N | O | M | N | O | |||||||||||||
X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | |
KG | 16 | 71 | 27 | 8 | 83 | 0 | 100 | 72 | 100 | 42 | 26 | 21 | 91 | 5 | 36 | 98 | 96 | 52 |
df.loc[:, idx[:, :, 'X']]
A | B | |||||
---|---|---|---|---|---|---|
M | N | O | M | N | O | |
X | X | X | X | X | X | |
KG | 44 | 83 | -28 | 5 | -31 | 44 |
df.xs('Y', level=2, axis=1) - df.xs('Z', level=2, axis=1)
A | B | |||||
---|---|---|---|---|---|---|
M | N | O | M | N | O | |
KG | 44 | 83 | -28 | 5 | -31 | 44 |
idx = pd.IndexSlice
df.loc[:, idx[:, :, 'X']] = (df.xs('Y', level=2, axis=1) - df.xs('Z', level=2, axis=1))
df
A | B | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
M | N | O | M | N | O | |||||||||||||
X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | |
KG | NaN | 71 | 27 | NaN | 83 | 0 | NaN | 72 | 100 | NaN | 26 | 21 | NaN | 5 | 36 | NaN | 96 | 52 |
This was unexpected. I realize that the indexes do not match, and hence this behavior, but I expect Pandas to throw an Exception, rather than silently do this
idx = pd.IndexSlice
df.loc[:, idx[:, :, 'X']] = (df.xs('Y', level=2, axis=1) - df.xs('Z', level=2, axis=1)).values
df
A | B | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
M | N | O | M | N | O | |||||||||||||
X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | X | Y | Z | |
KG | 44 | 71 | 27 | 83 | 83 | 0 | -28 | 72 | 100 | 5 | 26 | 21 | -31 | 5 | 36 | 44 | 96 | 52 |
This was, of course, the behavior I was expecting
Comment From: gfyoung
@kghosesbg : Thanks for reporting this! Looks quite odd to me indeed. This is reproducible on master
, so feel free to investigate and see what's going on here.
Comment From: jreback
This is expected, you are assigning mismatched levels. It is pretty tricky to actually figure out what you are doing is invalid. Sure you can make an assumption that you are assigning based on sub-levels, or that the levels don't match on the left and right sides, but this is pretty fragile. xref https://github.com/pandas-dev/pandas/issues/12343, https://github.com/pandas-dev/pandas/issues/7475, duplicate of https://github.com/pandas-dev/pandas/issues/10440
In [13]: df.loc[:, idx[:, :, 'X']] = (df.xs('Y', level=2, axis=1) - df.xs('Z', level=2, axis=1))
In [14]: df.loc[:, idx[:, :, 'X']]
Out[14]:
A B
M N O M N O
X X X X X X
KG NaN NaN NaN NaN NaN NaN
This is the correct method, essentially no alignment and a direct assignment.
In [16]: df.loc[:, idx[:, :, 'X']] = (df.xs('Y', level=2, axis=1) - df.xs('Z', level=2, axis=1)).values
In [17]: df
Out[17]:
A B
M N O M N O
X Y Z X Y Z X Y Z X Y Z X Y Z X Y Z
KG 79 98 19 4 34 30 -52 34 86 29 70 41 -66 14 80 -58 42 100
This is very simliar in concept to the warning at the end of this section. If you wanted to make an example for MultiIndexing would be great.
http://pandas.pydata.org/pandas-docs/stable/indexing.html#basics
Comment From: jreback
@kghosesbg we need someone to collect the cases for MultiIndex and write/re-write the logic a bit. I am not opposed to raising in a case like this, but this requires some investigation. Would be great just to have the cases listed (and we can xfail them). Then tackle later. would you be interested in this?
Comment From: ghost
Hi @jreback I'm happy to help as needed, though it might go slowly. Could you please clarify the task. From you previous comment I understand that you'd like to add this example to the multi-indexing docs, which I can do. Is the next to collect reports of these cases and put them in the test suite?
Comment From: ghost
@jreback I'll probably also do this with my mild mannered private individual role which I'm tagging @kghose
Comment From: jreback
yep those are the tasks