- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 -= 4
s
# output
# 0 1
# 1 -2
# 2 -1
# dtype: int64
s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 = s1 - 4
s
# output
# 0 1
# 1 2
# 2 3
# dtype: int64
Problem description
Starting with version 1.1.4, when applying any of the +=, -=, *= or /= operators to a series s1 that was sliced from an original series s, the change propagates to the original series s. When using the normal assignment form s1 = s1 - 4, the problem is not present, which leads to inconsistent behavior.
Output of pd.show_versions()
Comment From: simonjayhawkins
Thanks @nikita-ivanov for the report.
Starting with version 1.1.4, when applying any of the +=, -=, *= or /= operators to a series s1 that was sliced from an original series s, the change propagates to the original series s.
This was changed in [d8c8cbb90544b77e5da1daa7274eba7bef3257b9] REGR: inplace Series op not actually operating inplace (#37508) to fix a regression for DataFrame behaviour.
It looks like it also corrected a long-standing bug/inconsistency in the Series behaviour.
cc @jbrockmendel
Comment From: nikita-ivanov
@simonjayhawkins thanks for the update. Do I see it correctly that the future behavior would be the same as it is now for -= (and similar operations)? More precisely,
s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 -= 4
s
# output
# 0 1
# 1 -2
# 2 -1
# dtype: int64
s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 = s1 - 4
s
# output
# 0 1
# 1 -2
# 2 -1
# dtype: int64
Comment From: simonjayhawkins
For the s1 = s1 - 4 case, the expected output remains as
# 0 1
# 1 2
# 2 3
# dtype: int64
When using normal assignment operators s1 = s1 - 4 the problem is not present, which leads to inconsistent behavior.
In Python, a = a+1 is not always equivalent to a+=1.
Under the hood, Python calls 'special methods' for the various syntactical constructs, in this case __add__ and __iadd__. If the object, in this case a, implements __iadd__, that will be called. In the case of mutable sequences, a will change in place. However, when a does not implement __iadd__, the expression a+=1 has the same effect as a = a+1. The expression a+1 will be evaluated first, producing a new object, which is then bound to a. In other words, the object bound to a may or may not change, depending on whether a implements __iadd__.
In general, for mutable sequences, it is common for __iadd__ to be implemented and for += to happen in place. For immutable sequences, clearly there is no way for that to happen.
A pandas Series is a mutable sequence that implements __iadd__.
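The distinction described above can be seen with built-in types alone. The following is a minimal sketch, not pandas code: list implements __iadd__ and tuple does not, and the variable names (alias_a, alias_b, alias_t) are chosen here purely for illustration.

```python
# list implements __iadd__, so += mutates the existing object in place.
a = [1, 2]
alias_a = a          # second name bound to the same list
a += [3]             # calls list.__iadd__; no new list is created

# Plain assignment builds a new object via __add__ and rebinds the name.
b = [1, 2]
alias_b = b
b = b + [3]          # calls list.__add__; alias_b keeps the old list

# tuple has no __iadd__, so += falls back to t = t + (3,).
t = (1, 2)
alias_t = t
t += (3,)            # a new tuple is created and bound to t

print(alias_a)       # [1, 2, 3]  (changed through the alias)
print(alias_b)       # [1, 2]     (unchanged)
print(alias_t)       # (1, 2)     (unchanged)
```

A slice of a Series that shares data with its parent plays the role of alias_a here, which is why an in-place operator on the slice can show up in the original.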
Comment From: nikita-ivanov
@simonjayhawkins thanks for the clarifications. My concern is that prior to 1.1.4 the behavior was the same for both cases, and I did not find any mention of the change in the release notes. However, this change can cause bugs (as it did for me). I shall switch to s1 = s1 - 4, but it might be worthwhile to mention the new behavior in the release notes.
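Besides switching to s1 = s1 - 4, another way to keep the original untouched is to take an explicit copy of the slice before operating in place. This is a sketch of that option, not something proposed in the thread; it assumes only the standard Series.copy() method and reuses the names from the report.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# An explicit copy gives s1 its own data, so in-place operators
# on s1 cannot propagate back to s.
s1 = s.iloc[1:].copy()
s1 -= 4

print(s.tolist())    # [1, 2, 3]
print(s1.tolist())   # [-2, -1]
```

The copy costs extra memory for the sliced data, which is usually negligible for small slices.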
Comment From: simonjayhawkins
@nikita-ivanov I agree that a long standing behaviour has been changed without adequate notice.
We will await comment from @jbrockmendel for the way forward here.
I am reluctant to add a regression tag here as I think the new behaviour is more correct and we are not currently planning any more releases in the 1.1.x series.
Comment From: jbrockmendel
I am reluctant to add a regression tag here as I think the new behaviour is more correct
Agreed.
Comment From: MarcoGorelli
I am reluctant to add a regression tag here as I think the new behaviour is more correct
Agreed.
So, should the release notes be clarified, or can this be closed? @jbrockmendel
Comment From: jbrockmendel
If someone wants to flesh out a release note I won't object
Comment From: SophiaTangHart
I'd like to take it and clarify the release note. I'd like to let you know that this is my first time contributing.
Comment From: MarcoGorelli
Sure, go ahead - here's the contributing guide https://pandas.pydata.org/docs/dev/development/contributing_docstring.html , please do ask (here or on Slack) if anything's not clear
Comment From: SophiaTangHart
@MarcoGorelli, Thank you. I'm reading the contributing guide. I'm also new to Slack. How do I join Pandas Slack group? Do I refer to version 1.1.4 #38519 if I need clarification? Do I need to address certain people? Thanks!
Comment From: MarcoGorelli
Hey - for Slack: https://pandas.pydata.org/docs/dev/development/community.html?highlight=slack#community-slack
I'd suggest just rewording https://github.com/pandas-dev/pandas/blob/ac648eeaf5c27ab957e8cd284eb7e49a45232f00/doc/source/whatsnew/v1.1.4.rst#L34, e.g. to
Fixed regression in inplace arithmetic operation (e.g. `+=`) on a Series not updating the parent DataFrame/Series
Comment From: SophiaTangHart
@MarcoGorelli, thank you for directing me to join Slack. And thank you for your suggestion.
I'm not able to open
- Fixed regression in inplace arithmetic operation on a Series not updating the parent DataFrame (:issue:`36373`)
to reword it. How can I access the docstring? Do I need to do it through my GitHub account? Thank you.
Comment From: MarcoGorelli
Do I need to do it through my GitHub account?
Yeah you'll need to fork the repo, clone it, checkout a new branch, stage, commit, push, open a pull request - if you read through https://pandas.pydata.org/docs/dev/development/contributing.html it should all be explained, else we're available on Slack to help out
Comment From: SophiaTangHart
take
Comment From: kathleenhang
@SophiaTangHart Hi there, do you mind if I take the issue since it seems to be inactive for some time?
Comment From: SophiaTangHart
Sure. Sorry for taking a while. I've unassigned myself. Thanks.
Comment From: kathleenhang
take