Pandas version checks

  • [X] I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.SeriesGroupBy.transform.html

Documentation problem

>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
...                           'foo', 'bar'],
...                    'B' : ['one', 'one', 'two', 'three',
...                           'two', 'two'],
...                    'C' : [1, 5, 5, 2, 5, 5],
...                    'D' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')[['C', 'D']]
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
          C         D
0 -1.154701 -0.577350
1  0.577350  0.000000
2  0.577350  1.154701
3 -1.154701 -1.000000
4  0.577350 -0.577350
5  0.577350  1.000000

This is the current example in SeriesGroupBy.transform.

Suggested fix for documentation

The assignment should be changed to grouped: pd.Series = df.groupby('A')[['C', 'D']] to make the type explicit; as written, the example is hard for first-timers to follow.

Comment From: bang128

@ramvikrams Do you mean we should change >>> grouped = df.groupby('A')[['C', 'D']] to grouped: pd.Series=df.groupby('A')[['C', 'D']]? But I think it should be something like grouped = pd.Series(df.groupby('A')[['C', 'D']])

Comment From: ramvikrams

Actually, the doc is correct; we just need to make it more explicit. By the way, I don't think grouped = pd.Series(df.groupby('A')[['C', 'D']]) is the right syntax.

Comment From: bang128

@ramvikrams So can I comment out the suggested fix? Something like: >>> grouped = df.groupby('A')[['C', 'D']] // grouped: pd.Series = df.groupby('A')[['C', 'D']]

Comment From: ramvikrams

Yes, that seems right, but I'd like a pandas member to confirm first; after that you can continue.

Comment From: bang128

Thank you for your clarification!

Comment From: MarcoGorelli

grouped wouldn't be of type Series in this example though

In [6]: df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
   ...:                           'foo', 'bar'],
   ...:                    'B' : ['one', 'one', 'two', 'three',
   ...:                           'two', 'two'],
   ...:                    'C' : [1, 5, 5, 2, 5, 5],
   ...:                    'D' : [2.0, 5., 8., 1., 2., 9.]})
   ...: grouped = df.groupby('A')[['C', 'D']]

In [7]: grouped
Out[7]: <pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f334eb0aa60>
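The distinction can be checked directly. The following is a minimal sketch, assuming a trimmed copy of the docs' DataFrame; the names frame_grouped, series_grouped, and annotated are hypothetical and used only for illustration. It also shows that a type annotation has no effect on the object's runtime type.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'C': [1, 5, 5, 2, 5, 5],
                   'D': [2.0, 5., 8., 1., 2., 9.]})

# Selecting a list of columns keeps the grouped object two-dimensional.
frame_grouped = df.groupby('A')[['C', 'D']]

# Selecting a single column label gives the one-dimensional variant.
series_grouped = df.groupby('A')['C']

# An annotation is only a hint; Python does not convert or check the value.
annotated: pd.Series = df.groupby('A')[['C', 'D']]

print(type(frame_grouped).__name__)      # DataFrameGroupBy
print(type(series_grouped).__name__)     # SeriesGroupBy
print(isinstance(annotated, pd.Series))  # False
```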

Comment From: ramvikrams

"grouped wouldn't be of type Series in this example though"

Yes, I understood that, but I feel this example needs to be clearer for a beginner reading it.

Comment From: bang128

I will take this issue

Comment From: ramvikrams

But I have a question: should the return type be Series, or is it fine for it to be DataFrame? In the stubs, the return type of SeriesGroupBy.transform() is Series.

Comment From: bang128

I've just looked at the documentation for DataFrame.groupby() and DataFrame.transform(), and I feel that grouped is actually unrelated to pd.Series in this case.

Comment From: MarcoGorelli

In that example, grouped is a DataFrameGroupBy. The docs would probably be improved by having an example with SeriesGroupBy instead.
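One possible shape for such an example is sketched below. It assumes the same data as the current docs example and selects only column 'C' with a single label. It is an illustration, not the wording that was eventually adopted in the pandas docs.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two'],
                   'C': [1, 5, 5, 2, 5, 5],
                   'D': [2.0, 5., 8., 1., 2., 9.]})

# A single column label (not a list) produces a SeriesGroupBy.
grouped = df.groupby('A')['C']

# Standardize 'C' within each group of 'A'; the result is a Series
# aligned with the original index of df.
result = grouped.transform(lambda x: (x - x.mean()) / x.std())

print(type(grouped).__name__)  # SeriesGroupBy
print(type(result).__name__)   # Series
print(result)
```

The values printed for result match column C of the DataFrame output in the current docs example, since the same per-group standardization is applied to the same data.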

Comment From: ramvikrams

yeah