Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from decimal import Decimal
import pandas as pd
df = pd.DataFrame({"a": [1, 2, None, 4], "b": [1.5, None, 2.5, 3.5]})
selected_df = df.select_dtypes(include=["number", Decimal])
selected_df.fillna(0, inplace=True)
df.to_markdown()
Issue Description
On pandas 1.3.4 and earlier, mutating the DataFrame returned by select_dtypes didn't change the original DataFrame. However, on 1.4.3, mutating the selected DataFrame appears to change the original one. I suspect this is the PR that caused the change in behavior: #42611
On 1.4.3, the original DataFrame is mutated:
|   | a | b |
|---|---|---|
| 0 | 1 | 1.5 |
| 1 | 2 | 0 |
| 2 | 0 | 2.5 |
| 3 | 4 | 3.5 |
Expected Behavior
On 1.3.4, the original DataFrame is unchanged:
|   | a | b |
|---|---|---|
| 0 | 1 | 1.5 |
| 1 | 2 | nan |
| 2 | nan | 2.5 |
| 3 | 4 | 3.5 |
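For code that has to behave the same on both versions, one option is to take an explicit copy of the selection before mutating it. The following is a minimal sketch under that assumption, not a statement of how pandas itself resolves the issue; the include list is reduced to "number" for brevity, and the two counter variables are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, None, 4], "b": [1.5, None, 2.5, 3.5]})

# Explicit copy: the selection owns its data, whatever select_dtypes returns.
selected_df = df.select_dtypes(include=["number"]).copy()
selected_df.fillna(0, inplace=True)

# Count missing values in each frame (illustrative variable names).
original_nans = int(df.isna().sum().sum())
selected_nans = int(selected_df.isna().sum().sum())

# The original keeps its two missing values; only the copy was filled.
print(original_nans, selected_nans)  # prints: 2 0
```

The extra copy costs some memory, but it removes any dependence on whether select_dtypes returns a view or a copy.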
Installed Versions
Comment From: jorisvandenbossche
@villebro thanks for the report! That's indeed an unintended change in behaviour in 1.4 (and https://github.com/pandas-dev/pandas/pull/42611 seems indeed the likely cause). I opened a PR to fix this: https://github.com/pandas-dev/pandas/pull/48176
Comment From: villebro
@jorisvandenbossche thanks for looking into this issue and fixing it! FWIW, PR reviewed and approved 👍
Comment From: simonjayhawkins
> That's indeed an unintended change in behaviour in 1.4 (and #42611 seems indeed the likely cause)

Can confirm from bisect. First bad commit: [37ea2cd5e932f8ab1b0ab0eb999f524809e83f38] PERF: select_dtypes (#42611)
Comment From: pickfire
I noticed this broke our code in 1.5: we were relying on select_dtypes returning a reference to the original DataFrame in 1.4 as a feature, so that we could do .select_dtypes(exclude=['int8']).fillna('', inplace=True). I tried to find this change but can't seem to find it in the list of backwards-incompatible API changes:
https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.5.0.html#backwards-incompatible-api-changes
Can we also list it there?
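For the use case above, a pattern that does not depend on select_dtypes returning a reference is to use it only to pick the column labels and then assign the filled values back. This is a rough sketch rather than an officially recommended migration path; the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical frame: one int8 column to exclude, two columns with gaps.
df = pd.DataFrame(
    {
        "id": pd.Series([1, 2, 3], dtype="int8"),
        "name": ["x", None, "z"],
        "city": [None, "Oslo", None],
    }
)

# Use select_dtypes only to find the column labels of interest.
cols = df.select_dtypes(exclude=["int8"]).columns

# Fill those columns and write the result back into the original frame.
df[cols] = df[cols].fillna("")

print(df)
```

Because the result is assigned back explicitly, the original frame is updated whether the selection is a view or a copy.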