I noticed that `.loc` and `__setitem__` behave very differently when assigning one series to a sub-range of another series:
>>> s = pd.Series(0.0, index=list('abcd'))
>>> s1 = pd.Series(1.0, index=list('ab'))
>>> s2 = pd.Series(2.0, index=list('xy'))
>>> s[['a', 'b']] = s2
>>> s # names of s2 are ignored as expected
a 2.0
b 2.0
c 0.0
d 0.0
dtype: float64
>>> s.loc[['a', 'b']] = s2
>>> s # not expected!!
a NaN
b NaN
c 0.0
d 0.0
dtype: float64
>>> s.loc[['a', 'b']] = s1
>>> s # everything's fine if the indices match
a 1.0
b 1.0
c 0.0
d 0.0
dtype: float64
I'm not sure if this is intended behaviour but it seems odd.
I'm on pandas v. 0.24.1
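If positional assignment is what is wanted, one possible workaround is to strip the index from the right-hand side before assigning. The snippet below is only a sketch of that idea, reusing the names `s` and `s2` from the example above. It is not necessarily the recommended approach.

```python
import pandas as pd

s = pd.Series(0.0, index=list("abcd"))
s2 = pd.Series(2.0, index=list("xy"))

# A plain NumPy array carries no labels, so there is nothing to align on
# and the values are written by position into the selected labels.
s.loc[["a", "b"]] = s2.to_numpy()
print(s)
```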
Comment From: WillAyd
Not sure I agree on the expectation, but this is rather nuanced. I think this should raise a SettingWithCopyWarning for the first sample. @TomAugspurger, any thoughts on your end?
Comment From: TomAugspurger
I'm not sure what the rules are for setitem. It seems like labels are ignored when the lengths are the same?
In [48]: s3 = pd.Series([1, 2], index=['a', 'b'])
In [49]: target = s.copy()
In [50]: target[['a', 'b']] = s3; target
Out[50]:
a 1.0
b 2.0
c 0.0
d 0.0
dtype: float64
In [51]: target = s.copy()
In [52]: target[['a', 'b']] = s3[['b', 'a']]; target
Out[52]:
a 2.0
b 1.0
c 0.0
d 0.0
dtype: float64
But differing lengths trigger an alignment (outputs 2 and 3, though 3 is already aligned)?
I wouldn't expect a SettingWithCopyWarning on the first one. The target isn't a (maybe) copy of another object. This is all in a single call to `__setitem__`, so it's fine (as opposed to `x = s[['a', 'b']]; x = s2`).
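To make the distinction concrete, here is a small sketch (the variable `x` and the scalar `5.0` are only illustrative). Selecting with a list of labels produces a new object, so writing into it leaves `s` untouched. A single indexed assignment on `s` writes into `s` itself.

```python
import pandas as pd

s = pd.Series(0.0, index=list("abcd"))

# Two steps: list-based selection returns a new Series, so this write
# only changes x, not s.
x = s[["a", "b"]]
x.loc[:] = 2.0
print(s)  # s is unchanged here

# One step: a single __setitem__ call operates on s directly.
s[["a", "b"]] = 5.0
print(s)
```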
Comment From: phofl
This seems to be consistent now and returns:
a NaN
b NaN
c 0.0
d 0.0
dtype: float64
a NaN
b NaN
c 0.0
d 0.0
dtype: float64
a 1.0
b 1.0
c 0.0
d 0.0
dtype: float64
Is this the expected output now?
Comment From: phofl
This is expected
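One way to read the expected result is that `.loc` aligns the right-hand Series to the selected labels before writing. The sketch below illustrates that reading with `Series.reindex`. The names `labels`, `aligned`, `explicit` and `via_loc` are only illustrative, and this is a mental model rather than a description of the internal implementation.

```python
import pandas as pd

s = pd.Series(0.0, index=list("abcd"))
s2 = pd.Series(2.0, index=list("xy"))
labels = ["a", "b"]

# s2 has no entries labelled 'a' or 'b', so aligning it to those labels
# yields missing values.
aligned = s2.reindex(labels)

# Writing the aligned values by position ...
explicit = s.copy()
explicit.loc[labels] = aligned.to_numpy()

# ... gives the same result as assigning the Series through .loc.
via_loc = s.copy()
via_loc.loc[labels] = s2

print(via_loc)
```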
Comment From: srkds
> I noticed that `.loc` and `__setitem__` behave very differently when assigning one series to a sub-range of another series:
>
> ```python
> >>> s = pd.Series(0.0, index=list('abcd'))
> >>> s1 = pd.Series(1.0, index=list('ab'))
> >>> s2 = pd.Series(2.0, index=list('xy'))
> >>> s[['a', 'b']] = s2
> >>> s # names of s2 are ignored as expected
> a 2.0
> b 2.0
> c 0.0
> d 0.0
> dtype: float64
> >>> s.loc[['a', 'b']] = s2
> >>> s # not expected!!
> a NaN
> b NaN
> c 0.0
> d 0.0
> dtype: float64
> >>> s.loc[['a', 'b']] = s1
> >>> s # everything's fine if the indices match
> a 1.0
> b 1.0
> c 0.0
> d 0.0
> dtype: float64
> ```
>
> I'm not sure if this is intended behaviour but it seems odd.
> I'm on pandas v. 0.24.1
I tried executing the same example and got the same result with pandas 2.0.0.
Should the output below be the expected one, or is the current output (the one with NaN) working as intended?
>>> s.loc[['a', 'b']] = s2
>>> s
a 2.0
b 2.0
c 0.0
d 0.0
Comment From: DhruvBShetty
take
Comment From: DhruvBShetty
The behaviour of `.loc[[...]]` differs between an index with boolean dtype and an index with any other dtype.
import pandas as pd
series1=pd.Series(['a','b','c','d'],index=[1,2,3,4])
series2=pd.Series(['a','b','c','d'],index=[True,True,True,False])
print(series1.loc[[2]])
Out[1]: 2 b
dtype: object
print(series2.loc[[False,True,False,False]])
Out[2]: True b
dtype: object
This is expected behaviour
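The outputs above suggest that a list of booleans passed to `.loc` is treated as a positional mask rather than a list of labels, even when the index itself holds booleans. A minimal sketch of that comparison follows. The name `mask` and the result variable names are only illustrative.

```python
import pandas as pd

series1 = pd.Series(["a", "b", "c", "d"], index=[1, 2, 3, 4])
series2 = pd.Series(["a", "b", "c", "d"], index=[True, True, True, False])

mask = [False, True, False, False]

# A list of labels selects by label.
by_label = series1.loc[[2]]

# A list of booleans selects by position, whatever the index dtype is.
by_mask_int_index = series1.loc[mask]
by_mask_bool_index = series2.loc[mask]

print(by_label)
print(by_mask_int_index)
print(by_mask_bool_index)
```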
Should we have tests for the case where `series2.loc[[...]]`, on a boolean index, is assigned `series3`, another series with a boolean index? Example below:
series3=pd.Series(['e','f'],index=[True,False])
series2.loc[[True,False,False,True]]=series3
Out[3]:True e
True b
True c
False f
dtype: object
If it is included, should there be one test or two separate tests (one for other dtypes in the index, one for boolean dtypes in the index)? I have already written the test using the series_with_simple_index fixture, and it needed an if condition to handle boolean types.
Comment From: palbha
@rhshadrach Seems like the above PR is merged -> https://github.com/pandas-dev/pandas/pull/60450 Do you think we can mark this closed?
Comment From: rhshadrach
Thanks @palbha - agreed this can be closed.