- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] (optional) I have confirmed this bug exists on the master branch of pandas.
code
import pandas as pd
print(pd.__version__)
data = {
'groupby_col': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', ],
'agg_col': [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
}
df = pd.DataFrame(data)
print(df.groupby('groupby_col', group_keys=True).apply(lambda x: x.rolling(4).mean()))
print(df.groupby('groupby_col', group_keys=False).apply(lambda x: x.rolling(4).mean()))
output
1.2.3
agg_col
0 NaN
1 NaN
2 NaN
3 0.75
4 0.50
5 NaN
6 NaN
7 NaN
8 0.25
9 0.25
agg_col
0 NaN
1 NaN
2 NaN
3 0.75
4 0.50
5 NaN
6 NaN
7 NaN
8 0.25
9 0.25
The group keys should be added to the index when `group_keys=True` (the default) is set. With pandas==1.1.5, the output is as expected:
1.1.5
               agg_col
groupby_col
A           0      NaN
            1      NaN
            2      NaN
            3     0.75
            4     0.50
B           5      NaN
            6      NaN
            7      NaN
            8     0.25
            9     0.25
agg_col
0 NaN
1 NaN
2 NaN
3 0.75
4 0.50
5 NaN
6 NaN
7 NaN
8 0.25
9 0.25
However, the example in https://github.com/pandas-dev/pandas/issues/38787#issuecomment-759660512 does work as expected, even in 1.2.3:
code
df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2, 3, 3, 3], "value": range(9)})
print(df.groupby("key", group_keys=True).apply(lambda x: x[:2]))
print(df.groupby("key", group_keys=False).apply(lambda x: x[:2]))
output
       key  value
key
1   0    1      0
    1    1      1
2   3    2      3
    4    2      4
3   6    3      6
    7    3      7
key value
0 1 0
1 1 1
3 2 3
4 2 4
6 3 6
7 3 7
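One possible explanation for the difference between the two examples is that `rolling(4).mean()` returns a result with the same index as each group (a "transformer"), while `x[:2]` returns fewer rows. The sketch below only checks that property; `is_like_indexed` is a hypothetical helper written for illustration, not a pandas function, and it does not reproduce pandas internals.

```python
import pandas as pd

df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2], "value": range(6)})


def is_like_indexed(func, frame, by):
    # Hypothetical helper (not part of pandas): True if func returns,
    # for every group, a result whose index equals the group's index.
    return all(
        func(group).index.equals(group.index)
        for _, group in frame.groupby(by)
    )


# Rolling mean keeps one output row per input row.
rolling_like = is_like_indexed(lambda x: x[["value"]].rolling(2).mean(), df, "key")
# Slicing drops rows, so the index differs from the group's index.
slicing_like = is_like_indexed(lambda x: x[:2], df, "key")

print(rolling_like, slicing_like)
```

If this reading is right, only the like-indexed case is the one where `group_keys=True` is ignored in 1.2.3.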
Comment From: kaisengit
Can confirm that this is still a problem with pandas 1.3.0
Comment From: rhshadrach
This was fixed in 1.5.0: https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.5.0.html#using-group-keys-with-transformers-in-dataframegroupby-apply-and-seriesgroupby-apply
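A minimal sketch of the behavior after the fix, assuming pandas 1.5.0 or later. It reuses the data from the report but selects `agg_col` before calling `apply` (a choice made here so that the string grouping column is not passed to `rolling`); `rolling_mean` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "groupby_col": ["A"] * 5 + ["B"] * 5,
    "agg_col": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
})


def rolling_mean(s):
    # Transformer: the result has the same index as the input group.
    return s.rolling(4).mean()


# group_keys=True: on pandas >= 1.5 the group keys are added as an index level.
with_keys = df.groupby("groupby_col", group_keys=True)["agg_col"].apply(rolling_mean)

# group_keys=False: the original index is kept unchanged.
without_keys = df.groupby("groupby_col", group_keys=False)["agg_col"].apply(rolling_mean)

print(with_keys.index.nlevels, without_keys.index.nlevels)
```

On versions before 1.5.0, both results would be expected to have a single index level, which is the behavior described in this issue.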
Comment From: taozuoqiao
Thanks!