Pandas version checks
- [X] I have checked that the issue still exists on the latest versions of the docs on main
Location of the documentation
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.SeriesGroupBy.transform.html
Documentation problem
>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
... 'foo', 'bar'],
... 'B' : ['one', 'one', 'two', 'three',
... 'two', 'two'],
... 'C' : [1, 5, 5, 2, 5, 5],
... 'D' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')[['C', 'D']]
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
          C         D
0 -1.154701 -0.577350
1  0.577350  0.000000
2  0.577350  1.154701
3 -1.154701 -1.000000
4  0.577350 -0.577350
5  0.577350  1.000000
This is the current example in SeriesGroupBy.transform.
Suggested fix for documentation
It should be changed to
grouped: pd.Series = df.groupby('A')[['C', 'D']]
just to make the type explicit, because the current example is hard for first-timers to follow.
Comment From: bang128
@ramvikrams Do you mean we should change >>> grouped = df.groupby('A')[['C', 'D']] to grouped: pd.Series = df.groupby('A')[['C', 'D']]?
But I think it should be something like grouped = pd.Series(df.groupby('A')[['C', 'D']])
Comment From: ramvikrams
Actually, the doc is correct; we just need to make it more explicit.
By the way, I don't think grouped = pd.Series(df.groupby('A')[['C', 'D']]) is the right syntax.
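For readers following the annotation suggestion: a variable annotation is only a hint for readers and type checkers, so it does not change what the object is at runtime. The sketch below uses a shortened, hypothetical version of the example frame and checks the runtime type directly.

```python
import pandas as pd

# Shortened, illustrative version of the frame from the docs example.
df = pd.DataFrame({"A": ["foo", "bar", "foo"],
                   "C": [1, 5, 5],
                   "D": [2.0, 5.0, 8.0]})

# The annotation is not enforced and performs no conversion.
grouped: pd.Series = df.groupby("A")[["C", "D"]]

runtime_type_name = type(grouped).__name__
is_series = isinstance(grouped, pd.Series)
print(runtime_type_name)  # name of the actual class, regardless of the annotation
print(is_series)
```

A static type checker would be expected to flag this assignment, which is another reason the annotation would mislead rather than clarify in the docs.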
Comment From: bang128
@ramvikrams So can I comment out the suggested fix?
Something like: >>> grouped = df.groupby('A')[['C', 'D']]  # grouped: pd.Series = df.groupby('A')[['C', 'D']]
Comment From: ramvikrams
Yes, it seems right, but I also want to confirm with a pandas member first; after that you can continue.
Comment From: bang128
Thank you for your clarification!
Comment From: MarcoGorelli
grouped wouldn't be of type Series in this example though
In [6]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
   ...:                           'foo', 'bar'],
   ...:                    'B' : ['one', 'one', 'two', 'three',
   ...:                           'two', 'two'],
   ...:                    'C' : [1, 5, 5, 2, 5, 5],
   ...:                    'D' : [2.0, 5., 8., 1., 2., 9.]})
   ...: grouped = df.groupby('A')[['C', 'D']]
In [7]: grouped
Out[7]: <pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f334eb0aa60>
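The type depends on how the columns are selected after groupby. As a minimal sketch on the same frame: indexing with a list of labels gives a DataFrameGroupBy (even for a single column), while indexing with a single label gives a SeriesGroupBy. The variable names here are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "bar", "foo", "bar", "foo", "bar"],
                   "B": ["one", "one", "two", "three", "two", "two"],
                   "C": [1, 5, 5, 2, 5, 5],
                   "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})

# List of labels (double brackets): DataFrameGroupBy.
by_list = type(df.groupby("A")[["C", "D"]]).__name__
# A one-element list is still a list selection.
by_single_list = type(df.groupby("A")[["C"]]).__name__
# Single label (single brackets): SeriesGroupBy.
by_label = type(df.groupby("A")["C"]).__name__

print(by_list, by_single_list, by_label)
```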
Comment From: ramvikrams
grouped wouldn't be of type Series in this example though
Yes, I understood that, but I feel this example needs more clarity for a beginner reading it.
Comment From: bang128
I will take this issue
Comment From: ramvikrams
But I have a question: should the return type be Series, or is it fine for it to be DataFrame? In the stubs, the return type is Series for SeriesGroupBy.transform().
Comment From: bang128
I've just looked at the documentation for DataFrame.groupby() and DataFrame.transform(). I feel that grouped is actually unrelated to pd.Series in this case.
Comment From: MarcoGorelli
In that example, grouped is a DataFrameGroupBy. The docs would probably be improved by having an example with SeriesGroupBy instead.
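One possible shape for such an example, offered as a sketch and not as the wording the docs should adopt: select a single column so that grouped is a SeriesGroupBy, then apply the same standardizing lambda. The result is a Series aligned to the original index, and its values should match column C of the current DataFrame example.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "bar", "foo", "bar", "foo", "bar"],
                   "B": ["one", "one", "two", "three", "two", "two"],
                   "C": [1, 5, 5, 2, 5, 5],
                   "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})

# Single-label selection yields a SeriesGroupBy.
grouped = df.groupby("A")["C"]

# Standardize C within each group of A; the result keeps df's index.
result = grouped.transform(lambda x: (x - x.mean()) / x.std())

print(type(grouped).__name__)
print(type(result).__name__)
print(result)
```

Running the two versions side by side makes the return types easy to compare: the list selection returns a DataFrame from transform, while the single-label selection returns a Series.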
Comment From: ramvikrams
yeah