Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
>>> import pandas as pd
>>>
>>> np_idx = pd.Index([1, 0, 1], dtype="float64") # numpy dtype
>>> pd.Series(range(3), index=np_idx)[1] # ok
1.0 0
1.0 2
dtype: int64
>>> pd_idx = pd.Index([1, 0, 1], dtype="Float64") # pandas dtype
>>> pd.Series(range(3), index=pd_idx)[1] # not ok
1
Likewise with setting using indexing:
>>> ser = pd.Series(range(3), index=np_idx)
>>> ser[1] = 10
>>> ser # ok
1.0 10
0.0 1
1.0 10
dtype: int64
>>> ser = pd.Series(range(3), index=pd_idx)
>>> ser[1] = 10
>>> ser # not ok
1.0 0
0.0 10
1.0 2
dtype: int64
Issue Description
Indexing using Series.__getitem__ & Series.__setitem__ (likewise for DataFrame) with integers on float indexes behaves differently depending on whether the dtype is an extension float dtype or not.
The reason is that NumericIndex._should_fallback_to_positional is always False, while Index._should_fallback_to_positional is only False if the index is inferred to be integer-like (infer_dtype returns "integer" or "mixed-integer").
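To make the effect of that flag concrete, here is a minimal sketch in plain Python. It is a toy model, not pandas code: the `lookup` helper and its `fallback_to_positional` parameter are hypothetical names introduced only for illustration, and the data mirrors the reproducible example above.

```python
def lookup(labels, values, key, fallback_to_positional):
    """Toy model of Series.__getitem__ with an integer key (not pandas code)."""
    if fallback_to_positional:
        # The integer key is treated as a position.
        return values[key]
    # The integer key is treated as a label; duplicate labels return every match.
    return [value for label, value in zip(labels, values) if label == key]


labels = [1.0, 0.0, 1.0]  # same labels as np_idx / pd_idx above
values = [0, 1, 2]        # same values as range(3)

# Label-based lookup, as seen with the numpy float64 index.
print(lookup(labels, values, 1, fallback_to_positional=False))  # [0, 2]

# Positional fallback, as seen with the Float64 extension index.
print(lookup(labels, values, 1, fallback_to_positional=True))   # 1
```

The two calls return the same values as the "ok" and "not ok" cases in the reproducible example, which is consistent with the flag being the only thing that differs between them.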
The better solution would be for Index._should_fallback_to_positional to be False if its dtype is a real dtype (int, uint or float numpy or ExtensionDtype).
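One possible reading of that proposal, sketched in plain Python. The function name `proposed_should_fallback` and the `REAL_KINDS` constant are hypothetical; the single-letter codes follow NumPy's `dtype.kind` convention, and the sketch assumes the extension dtypes can be classified by the same codes.

```python
# NumPy-style dtype.kind codes:
# "i" signed int, "u" unsigned int, "f" float, "c" complex, "O" object.
REAL_KINDS = frozenset("iuf")


def proposed_should_fallback(kind: str) -> bool:
    """Hypothetical rule: never fall back to positional for real dtypes."""
    return kind not in REAL_KINDS


for kind in ("i", "u", "f", "c", "O"):
    print(kind, proposed_should_fallback(kind))
```

Under this rule the decision depends only on the dtype, not on whether it is a numpy or an extension dtype. Complex (`"c"`) still falls back here, because the proposal as worded only names int, uint and float.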
Expected Behavior
Indexing using a pandas float dtype should behave the same as for a numpy float dtype (except nan-related behaviour).
Installed Versions
Comment From: phofl
@jbrockmendel general thoughts here?
Comment From: jbrockmendel
The better solution would be for Index._should_fallback_to_positional to be False if its dtype is a real dtype (int, uint or float numpy or ExtensionDtype).
Agreed
Comment From: topper-123
The same issue actually applies to complex dtypes as well:
>>> 1 == complex(1) # a real number is a complex number with imaginary part = 0
True
>>> complex_idx = pd.Index([1, 0, 1], dtype="complex")
>>> ser = pd.Series(range(3), index=complex_idx)
>>> ser.index == 1
array([ True, False, True])
>>> ser[1]
1 # wrong I think?
Do we have tests for numeric index dtypes outside the numpy int/uint/float ones? It seems to me we don't?
Comment From: phofl
Yeah, we do. Series getitem with positional vs. label-based lookup is a bit tricky, though. There is actually an issue about deprecating it altogether: #50617
Comment From: topper-123
Hm, I can't seem to find them. I would imagine somewhere similar to pandas/tests/indexes/numeric/, but I can't see it. Could you point me to where they are?
Comment From: phofl
Indexing or series/indexing
DataFrame/indexing as well
Comment From: phofl
Indexes does not contain indexing tests (few exceptions)
Comment From: topper-123
Ok, thanks. I was thinking about testing the value of Index._should_fallback_to_positional, but indexing tests are even better in this case.
Comment From: topper-123
I actually don't see tests for indexes with numeric extension dtypes, only numpy numeric dtypes. Do you agree?
Comment From: jbrockmendel
Support for indexing with the nullable numeric dtypes is pretty new, so it isn't surprising that the tests would not be fleshed out (same for complex).
In general, tests for the Index methods go in tests/indexes/whatever, and tests for indexing, i.e. Series/DataFrame loc/iloc/__getitem__/__setitem__, go in tests/indexing/, tests/frame/indexing/, or tests/series/indexing/. There is certainly room for improvement organization-wise, so don't be shy if you see improvements to be made.
Comment From: topper-123
I've looked into those test locations and IMO they're not set up for easy addition of new index types or dtypes. And AFAIKS indexing with numeric extension dtypes is tested very sparingly, probably because it's not easy to add tests for them.
IMO the tests in pandas/tests/indexes/ are much better organized, because they're organized into a class hierarchy, e.g. TestFloatNumericIndex inherits from NumericBase, which inherits from Base, i.e. just by subclassing Base and adding an _index_cls attribute you get a lot of free tests already.
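The inheritance pattern described above can be sketched roughly as follows. This is a toy illustration, not the actual pandas test code: `BaseIndexTests`, `TestListIndex`, `TestTupleIndex` and `run_tests` are hypothetical names, and `list`/`tuple` stand in for real index classes.

```python
class BaseIndexTests:
    """Shared tests; subclasses only need to set ``_index_cls``."""

    _index_cls = None  # set by each concrete subclass

    def make_index(self):
        return self._index_cls([1, 0, 1])

    def test_len(self):
        assert len(self.make_index()) == 3

    def test_roundtrip(self):
        idx = self.make_index()
        assert self._index_cls(idx) == idx


class TestListIndex(BaseIndexTests):
    _index_cls = list  # toy stand-in for an index class


class TestTupleIndex(BaseIndexTests):
    _index_cls = tuple  # adding a new type is one line


def run_tests(cls):
    """Tiny stand-in for a test runner: call every test_* method."""
    instance = cls()
    names = [name for name in dir(instance) if name.startswith("test_")]
    for name in names:
        getattr(instance, name)()
    return names


print(run_tests(TestListIndex))
print(run_tests(TestTupleIndex))
```

Each subclass inherits every `test_*` method from the base class, so covering a new index type or dtype only requires a new subclass with a different `_index_cls`.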
IMO it would be very beneficial to reorganize the tests/indexing, tests/frame/indexing & tests/series/indexing modules to be structured more similarly to tests/indexes where possible, but that's a bigger job. I'll try to add a test somewhere sensible for now.
Comment From: topper-123
I've updated the example to show that the problem also affects __setitem__.