Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the [main branch](https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas) of pandas.

Reproducible Example

import pandas as pd
import numpy as np
from pandas.core.dtypes import common

class MyFrame(pd.DataFrame): 
    def __init__(self, *args, **kwargs): 
        super().__init__(*args, **kwargs)
        for col in self.columns: 
            if not common.is_numeric_dtype(self[col]):
                self[col] = pd.to_numeric(self[col], errors='ignore')
    @property
    def _constructor(self): 
        return type(self)

def get_frame(N): 
    return MyFrame(
        data=np.vstack(
            [np.where(np.random.rand(N) > 0.36, np.random.rand(N), np.nan) for _ in range(10)]
        ).T, 
        columns=[f"col{i}" for i in range(10)]
    )

frame = get_frame(int(1e5))
frame.dropna(subset=["col0", "col1"])

Issue Description

Using a subclass that accesses the DataFrame's columns inside __init__, we are finding that pandas 1.4.x and 1.5.x are much slower than pandas 1.3.5 and 2.0.0.dev. This looks to be resolved in the most recent nightlies, so, if not already tested, could some additional testing be done to ensure no further regressions in this area?
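To see how much the per-column work in __init__ matters, one rough diagnostic is to count how often the subclass constructor runs during a single dropna call. The sketch below is not part of the original report: CountingFrame and its init_calls counter are hypothetical instrumentation, and the count will differ between pandas versions. It only shows how often __init__ is invoked; it does not identify the root cause of the slowdown.

```python
import numpy as np
import pandas as pd


class CountingFrame(pd.DataFrame):
    """Hypothetical instrumented subclass: counts __init__ calls."""

    init_calls = 0  # class-level counter (an assumption for illustration)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        type(self).init_calls += 1
        # Same access pattern as in the report: touch every column.
        for col in self.columns:
            self[col]

    @property
    def _constructor(self):
        return type(self)


rng = np.random.default_rng(0)
values = rng.random((1000, 10))
values[rng.random((1000, 10)) <= 0.36] = np.nan  # roughly 36% missing values

frame = CountingFrame(values, columns=[f"col{i}" for i in range(10)])

CountingFrame.init_calls = 0  # ignore the initial construction
result = frame.dropna(subset=["col0", "col1"])
calls_during_dropna = CountingFrame.init_calls

print(f"pandas {pd.__version__}: __init__ ran {calls_during_dropna} time(s) during dropna")
```

If the count is large, or each call is expensive because the intermediate frame is in an unusual internal state, the column loop in __init__ is a likely place to look.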

Timings by pandas version:

1.3.5 - 2ms

Pandas BUG: Performance regression in .dropna on DataFrame subclasses

1.4.4 - 1.84s

1.5.2 - 2.2s

2.0.0 (nightly build) - 1.6ms

Expected Behavior

dropna on a DataFrame subclass runs at least as fast as it does on 1.3.5/2.0.0.dev.
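One way to check this expectation on a given pandas version is to time dropna on the subclass against a plain DataFrame holding the same data. This is a minimal sketch, not the method used to produce the timings in this report: ColumnTouchingFrame, make_data, and best_time are hypothetical names, and the public pandas.api.types.is_numeric_dtype is used in place of the pandas.core import from the example.

```python
import time

import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


class ColumnTouchingFrame(pd.DataFrame):
    """Hypothetical stand-in for MyFrame: reads every column in __init__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        for col in self.columns:
            is_numeric_dtype(self[col])

    @property
    def _constructor(self):
        return type(self)


def make_data(n, seed=0):
    """Return an (n, 10) float array with roughly 36% NaN, plus column names."""
    rng = np.random.default_rng(seed)
    values = rng.random((n, 10))
    values[rng.random((n, 10)) <= 0.36] = np.nan
    return values, [f"col{i}" for i in range(10)]


def best_time(func, repeats=3):
    """Return the fastest wall-clock time of `repeats` calls, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


values, columns = make_data(100_000)
plain = pd.DataFrame(values, columns=columns)
sub = ColumnTouchingFrame(values, columns=columns)

t_plain = best_time(lambda: plain.dropna(subset=["col0", "col1"]))
t_sub = best_time(lambda: sub.dropna(subset=["col0", "col1"]))

print(
    f"pandas {pd.__version__}: "
    f"plain={t_plain * 1e3:.2f} ms, subclass={t_sub * 1e3:.2f} ms"
)
```

Running the same script in environments with different pandas versions gives a like-for-like comparison; a subclass time that is orders of magnitude above the plain DataFrame time would match the regression described here.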

Installed Versions

INSTALLED VERSIONS
------------------
commit : 66e3805b8cabe977f40c05259cc3fcf7ead5687d
python : 3.10.8.final.0
python-bits : 64
OS : Darwin
OS-release : 22.2.0
Version : Darwin Kernel Version 22.2.0: Fri Nov 11 02:03:51 PST 2022; root:xnu-8792.61.2~4/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8
pandas : 1.3.5
numpy : 1.23.5
pytz : 2022.7
dateutil : 2.8.2
pip : 22.3.1
setuptools : 59.8.0
Cython : 0.29.33
pytest : 6.2.5
hypothesis : 6.62.0
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.2
html5lib : None
pymysql : None
psycopg2 : 2.9.3 (dt dec pq3 ext lo64)
jinja2 : 3.1.2
IPython : 8.8.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
fsspec : 2022.11.0
fastparquet : None
gcsfs : None
matplotlib : 3.5.3
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 10.0.1
pyxlsb : None
s3fs : None
scipy : 1.10.0
sqlalchemy : 1.4.46
tables : 3.7.0
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
numba : 0.56.4

Comment From: ryandvmartin

It looks like the triage in https://github.com/pandas-dev/pandas/issues/50708#issuecomment-1491748956 identifies the cause of this behavior, so this looks like a duplicate.