Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from typing import TypeVar, Generic

T = TypeVar("T")

class MyDataFrame(pd.DataFrame, Generic[T]):
    ...

def func() -> MyDataFrame[int]:
    return MyDataFrame[int]({"foo": [1, 2, 3]})

x = func()
print(x)

Issue Description

With pandas 1.5.1 and Python 3.10, the code above gives no warning at runtime.

With pandas 1.5.1 and Python 3.11, the typing library triggers a warning at runtime:

C:\Anaconda3\envs\pandasstubs311\Lib\typing.py:1253: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access
  result.__orig_class__ = self

We have testing code in pandas-stubs that uses this, because it was reported by the people developing the pandera library.

Expected Behavior

No warning.

Something in pandas triggers this warning when a Generic subclass is instantiated under Python 3.11.

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 91111fd99898d9dcaa6bf6bedb662db4108da6e6
python           : 3.11.0.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
Version          : 10.0.19043
machine          : AMD64
processor        : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : English_United States.1252
pandas           : 1.5.1
numpy            : 1.23.4
pytz             : 2022.6
dateutil         : 2.8.2
setuptools       : 65.5.1
pip              : 22.3.1
Cython           : None
pytest           : 7.2.0
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : 1.1
pymysql          : None
psycopg2         : None
jinja2           : 3.1.2
IPython          : None
pandas_datareader: None
bs4              : None
bottleneck       : None
brotli           : None
fastparquet      : None
fsspec           : None
gcsfs            : None
matplotlib       : 3.6.2
numba            : None
numexpr          : None
odfpy            : None
openpyxl         : 3.0.10
pandas_gbq       : None
pyarrow          : None
pyreadstat       : None
pyxlsb           : 1.0.10
s3fs             : None
scipy            : 1.9.3
snappy           : None
sqlalchemy       : 1.4.43
tables           : None
tabulate         : 0.9.0
xarray           : 2022.11.0
xlrd             : 2.0.1
xlwt             : None
zstandard        : None
tzdata           : None

Comment From: Dr-Irv

Seems like is_list_like() is not handling a generic alias correctly:

import sys
from typing import Generic, TypeVar

import pandas as pd
from pandas._libs import lib

is_list_like = lib.is_list_like

T = TypeVar("T")


class Base:
    def __init__(self, x: int):
        self._x = x


class Gen(Base, Generic[T]):
    ...


fooc = Gen[float]
print(type(fooc))
print(sys.version)
print("pandas version ", pd.__version__)
print(is_list_like(fooc))

With python 3.10, output is

<class 'typing._GenericAlias'>
3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)]
pandas version  1.5.1
False

With python 3.11, output is

<class 'typing._GenericAlias'>
3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:12:32) [MSC v.1929 64 bit (AMD64)]
pandas version  1.5.1
True
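
One possible explanation, stated here as an assumption rather than something confirmed in this thread: Python 3.11 exposes an `__iter__` method on typing generic aliases, so a duck-typed check that looks for `__iter__` would start classifying them as list-like. The sketch below uses only the standard library. `looks_list_like` is a hypothetical, simplified stand-in for such a check and is not the pandas implementation.

```python
import sys
from typing import Generic, TypeVar

T = TypeVar("T")


class Base:
    def __init__(self, x: int):
        self._x = x


class Gen(Base, Generic[T]):
    ...


def looks_list_like(obj) -> bool:
    """Hypothetical duck-typed check: iterable, but not a class, str, or bytes."""
    return (
        getattr(obj, "__iter__", None) is not None
        and not isinstance(obj, (type, str, bytes))
    )


alias = Gen[float]
# Whether the alias exposes __iter__ depends on the Python version in use.
alias_has_iter = hasattr(alias, "__iter__")

print("Python", sys.version_info[:2])
print("alias has __iter__:", alias_has_iter)
print("looks_list_like(alias):", looks_list_like(alias))
```

Running this under both interpreters should show whether the change in `is_list_like()` tracks the presence of `__iter__` on the alias.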

Comment From: sagitta42

Hi everyone, I'm getting the warning "UserWarning: Pandas doesn't allow columns to be created via a new attribute name" when assigning a dictionary as an attribute of a class that inherits from DataFrame. I have not found this issue reported anywhere.

Version checks: python version 3.10.7, pandas 1.5.3

Example:

import pandas as pd

class MyClass(pd.DataFrame):
    def __init__(self, dct_frame, dct_settings):
        pd.DataFrame.__init__(self, dct_frame)
        self.settings = dct_settings


mc = MyClass({'a': [1,2], 'b': [3,4]}, {'plot_style': 'param_per_channel'})

Warning message:

UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access
  self.settings = dct_settings

Is there a way to combat this? Or should I find another way of structuring my class to avoid this? I'm an amateur and don't know how to structure a comment in an issue thread properly, but didn't know where else to ask, as I'm not finding discussions on this, so I want to apologize for bothering.

Comment From: Dr-Irv

Because pandas allows referring to columns as attributes, adding new attributes to a subclass can trigger this warning. You can store your additional attributes in DataFrame.attrs instead.

The following should work:

import pandas as pd

class MyClass(pd.DataFrame):
    def __init__(self, dct_frame, dct_settings):
        pd.DataFrame.__init__(self, dct_frame)
        self.attrs["settings"] = dct_settings

Comment From: sagitta42

Thanks for the lightning-fast response! The warning did not appear when I created string attributes in the subclass, but it did for a dict attribute. I will follow your suggestion.
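
The difference between string and dict attributes is consistent with a check that warns only when a new attribute receives a list-like value, as the is_list_like() discussion above suggests. The sketch below is a toy model of that behavior built on the standard library. `ToyFrame`, `looks_list_like`, and `warns_on_set` are hypothetical names, and the logic is an assumption about how such a check could work, not pandas code.

```python
import warnings


def looks_list_like(value) -> bool:
    """Hypothetical stand-in: iterable, but not a class, str, or bytes."""
    return (
        getattr(value, "__iter__", None) is not None
        and not isinstance(value, (type, str, bytes))
    )


class ToyFrame:
    """Toy stand-in for a DataFrame-like class; not pandas code."""

    def __setattr__(self, name, value):
        # Warn only when a *new* attribute receives a list-like value,
        # since that could be mistaken for an attempt to add a column.
        if not hasattr(self, name) and looks_list_like(value):
            warnings.warn(f"{name!r} looks like a column assignment", UserWarning)
        object.__setattr__(self, name, value)


def warns_on_set(value) -> bool:
    """Return True if assigning `value` to a fresh attribute emits a warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ToyFrame().settings = value
    return len(caught) > 0


print("str attribute warns: ", warns_on_set("param_per_channel"))
print("dict attribute warns:", warns_on_set({"plot_style": "param_per_channel"}))
```

In this model a string is never treated as list-like, so only the dict assignment warns, which matches the behavior reported above.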