Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# An index whose labels are not integers
index = pd.date_range(start='2000-01-01', periods=3, freq='YS')

# An integer as the Series name
data = pd.Series([1, 2, 3], index=index, name=2)

data.groupby(data)
# FutureWarning: Series.__getitem__ treating keys as positions is deprecated.
# In a future version, integer keys will always be treated as labels
# (consistent with DataFrame behavior). To access a value by position, use
# `ser.iloc[pos]`

Issue Description

The warning comes from this line: https://github.com/pandas-dev/pandas/blob/0691c5cf90477d3503834d983f69350f250a6ff7/pandas/core/groupby/grouper.py#L1015

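Here, gpr.name is 2, so obj[gpr.name] is evaluated as data[2]. Because the index contains no integers, the key is treated as a position, which results in the warning. If the index did consist of integers, obj[gpr.name] would instead return the element stored under that label (which then fails the comparison).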
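To make the lookup described above easier to see in isolation, here is a minimal sketch that performs the same kind of `ser[ser.name]` access outside of groupby and reports what happened. The helper name `lookup_by_name` is made up for illustration and is not part of pandas. The outcome depends on the installed version: a FutureWarning is expected on 2.2.x, while versions that enforce the deprecation should raise a KeyError instead.

```python
import warnings

import pandas as pd

index = pd.date_range(start="2000-01-01", periods=3, freq="YS")
data = pd.Series([1, 2, 3], index=index, name=2)


def lookup_by_name(ser):
    """Hypothetical helper: mimic `obj[gpr.name]` and report the outcome."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            value = ser[ser.name]  # integer key on a non-integer index
        except KeyError:
            # Integer keys are treated strictly as labels on this version.
            return "KeyError", None
    future = [w for w in caught if issubclass(w.category, FutureWarning)]
    return ("FutureWarning" if future else "no warning"), value


outcome, value = lookup_by_name(data)
print(outcome, value)
```

Whenever the lookup succeeds, it has fallen back to positional access, so `value` is the third element of the series rather than anything related to a column.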

I have not checked on the main branch, but the same line might be present here: https://github.com/pandas-dev/pandas/blob/d72f165eb327898b1597efe75ff8b54032c3ae7b/pandas/core/groupby/grouper.py#L853

Expected Behavior

No FutureWarning should be issued.

Installed Versions

INSTALLED VERSIONS
------------------
commit                : 0691c5cf90477d3503834d983f69350f250a6ff7
python                : 3.13.0
python-bits           : 64
OS                    : Windows
OS-release            : 11
Version               : 10.0.22621
machine               : AMD64
processor             : Intel64 Family 6 Model 186 Stepping 3, GenuineIntel
byteorder             : little
LC_ALL                : None
LANG                  : en_US.UTF-8
LOCALE                : English_United Kingdom.1252

pandas                : 2.2.3
numpy                 : 2.2.2
pytz                  : 2025.1
dateutil              : 2.9.0.post0
pip                   : 25.0
Cython                : None
sphinx                : None
IPython               : 8.32.0
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
blosc                 : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
html5lib              : None
hypothesis            : None
gcsfs                 : None
jinja2                : None
lxml.etree            : None
matplotlib            : 3.10.0
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : None
pandas_gbq            : None
psycopg2              : None
pymysql               : None
pyarrow               : None
pyreadstat            : None
pytest                : 8.3.4
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.15.1
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
xlsxwriter            : None
zstandard             : 0.23.0
tzdata                : 2025.1
qtpy                  : None
pyqt5                 : None

Comment From: rhshadrach

Thanks for the report. It appears to me we should only be doing this check when obj is a DataFrame. PRs to fix are welcome. I think this can go in 2.3.
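As a rough illustration of the suggestion above (only doing the check when obj is a DataFrame), the guard could look something like the following. This is a sketch under assumptions, not the actual code in grouper.py: `is_column_of` is a hypothetical helper, and it compares values with `Series.equals`, whereas the real check in pandas is implemented differently.

```python
import pandas as pd


def is_column_of(gpr, obj) -> bool:
    """Hypothetical helper: is `gpr` the column of `obj` stored under gpr.name?"""
    if not isinstance(obj, pd.DataFrame):
        # A Series has no columns, so skip the obj[gpr.name] lookup entirely.
        return False
    name = getattr(gpr, "name", None)
    if name is None or name not in obj.columns:
        return False
    column = obj[name]
    # With duplicate column names obj[name] is a DataFrame, so check the type.
    return isinstance(column, pd.Series) and column.equals(gpr)


index = pd.date_range(start="2000-01-01", periods=3, freq="YS")
data = pd.Series([1, 2, 3], index=index, name=2)
frame = pd.DataFrame({"a": [1, 2, 3], 2: [4, 5, 6]})

print(is_column_of(data, data))       # Series input: no lookup is attempted
print(is_column_of(frame[2], frame))  # a genuine column of the frame
```

With the early return, a Series never reaches the integer-key lookup, so the positional-fallback warning cannot be triggered from this path.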

Comment From: sanggon6107

take

Comment From: sanggon6107

Hi @rhshadrach, I've found that Series.groupby doesn't show the warning on the main branch (70edaa0), since Series.__getitem__ now always treats an integer key as a label, as discussed in #50617. Do we still need to restrict the check to the case where obj is a DataFrame? Personally, I think it would still make sense, since a Series doesn't have columns.

Comment From: rhshadrach

> I've found that Series.groupby doesn't show the warning on the main branch

This is because that deprecation has been enforced on the main branch for the 3.0 release. However we will be releasing 2.3 prior to this, and the warning still shows on the 2.3.x branch. This is where the fix should go.

Comment From: Anurag-Varma

This can be closed, right? The PR has been merged.

Comment From: TiborVoelcker

I originally thought that the implementation on main might be even more dangerous: when grouping a pd.Series, if gpr.name happens to be in the index of the series, one could get some really weird results. The assignment in https://github.com/pandas-dev/pandas/blob/d72f165eb327898b1597efe75ff8b54032c3ae7b/pandas/core/groupby/grouper.py#L853 would then be valid.

I have experimented a bit to get a bug (by making a series containing series, etc.), but I could never get https://github.com/pandas-dev/pandas/blob/d72f165eb327898b1597efe75ff8b54032c3ae7b/pandas/core/groupby/grouper.py#L857 to return True. Therefore, it seems to be fine. I was intentionally trying to break it anyway, and it would probably never happen naturally.

But the point I was trying to make might still be valid, just from a "cleaner code" standpoint: I believe an assumption is made in line 853 (that the obj is a pd.DataFrame), without checking it. So I just thought that this should be added, maybe even just by using obj_gpr_column = obj.loc[:, gpr.name].
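For what it's worth, here is a small sketch of how the `obj.loc[:, gpr.name]` idea behaves. The helper `column_or_none` is hypothetical and only for illustration. It relies on the assumption that a two-element `.loc` key on a one-dimensional Series raises `IndexingError` ("Too many indexers"), which is the behavior I would expect on recent pandas versions.

```python
import pandas as pd
from pandas.errors import IndexingError


def column_or_none(obj, name):
    """Hypothetical helper: return the column `name` of obj, or None."""
    try:
        # Two indexers only make sense for a two-dimensional object.
        return obj.loc[:, name]
    except (KeyError, IndexingError):
        # KeyError: no such column in a DataFrame.
        # IndexingError: obj is a Series, which has no column axis.
        return None


index = pd.date_range(start="2000-01-01", periods=3, freq="YS")
data = pd.Series([1, 2, 3], index=index, name=2)
frame = pd.DataFrame({"a": [1, 2, 3], 2: [4, 5, 6]})

print(column_or_none(data, data.name))  # Series: no column lookup possible
print(column_or_none(frame, 2))         # DataFrame: the column labelled 2
```

The appeal of this form is that the "obj is a DataFrame" assumption is encoded in the indexing itself, at the cost of relying on an exception for the Series case. An explicit isinstance check would arguably be easier to read.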

I would gladly add this quick check if wanted. But as I could not trigger a bug, I no longer think this is strictly necessary.

So in short: Yes, I believe this issue can be closed. Thanks @sanggon6107 for the fast fix!

Comment From: rhshadrach

Thanks @Anurag-Varma - agreed. Closing.