When I run this code on pandas 0.17.0 with Python 3.5.0:
import pandas as pd
d = pd.DataFrame(data=[[2, 1]])
print(d.index)
print(d.groupby(level=0).apply(lambda g: g.sort_values(0)).index)
I get
Int64Index([0], dtype='int64')
MultiIndex(levels=[[0], [0]],
labels=[[0], [0]])
The index gets duplicated (and turned into a MultiIndex) as a result of the sort within the groupby.
This appears to be a bug to me.
Comment From: jreback
This is the same issue as #9946 (the same underlying Cython routine is used).
You will notice that using lambda x: x.copy() gives the same result.
pandas tries to infer whether you have mutated things by comparing indexes. This is probably a bug, but it is a bit of a rabbit hole: I think the best thing to do would actually be to ban mutation entirely, so the outputs can be more predictable. You are welcome to dig in!
Comment From: rhshadrach
This can now be controlled by specifying group_keys=False in groupby.
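For illustration, here is a minimal sketch of the original example with group_keys=False passed to groupby. It assumes a pandas version in which apply honors group_keys for functions that return a like-indexed result (to my understanding, 1.5 or later). The lambda argument is renamed to g only to avoid shadowing d.

```python
import pandas as pd

d = pd.DataFrame(data=[[2, 1]])

# group_keys=False asks groupby not to prepend the group keys
# to the index of the result returned by apply.
result = d.groupby(level=0, group_keys=False).apply(lambda g: g.sort_values(0))

print(d.index)       # the original, single-level index
print(result.index)  # expected to stay single-level rather than become a MultiIndex
```

With the default group_keys setting, whether the keys are prepended has varied between pandas versions, so passing the argument explicitly is the more predictable choice.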