Pandas Q: DataFrame loc and iloc seem to have inconsistent negative indexing behaviors.

With pandas version 0.20.3, DataFrame loc and iloc seem to have inconsistent and buggy indexing behaviors.

import pandas as pd

df = pd.DataFrame([dict(idx=idx) for idx in range(10)])
print(df.loc[list(range(3)) + list(range(-3, 0)), 'idx'])

returns NaN for the negative indices:

 0    0.0
 1    1.0
 2    2.0
-3    NaN
-2    NaN
-1    NaN

(also note that somehow the int became float...)

whereas

print(df.iloc[list(range(3)) + list(range(-3, 0))])

returns the last rows.

Additionally, loc raises if only negative indices are passed, as in df.loc[[-2, -1], 'idx'], but not if positive and negative indices are mixed, as in df.loc[[0, -1], 'idx'].

Comment From: jorisvandenbossche

@kingjr Have a look at the indexing docs (http://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) on the different options to index (see also the "Selection by label" and "Selection by position" sections further down that page).

The behaviours of loc and iloc are different on purpose because they serve different goals:

loc is label based: the negative values are not present in the index labels, and hence you get missing values for them (loc returns a result as long as at least one of the requested labels exists; in the case of df.loc[[-2, -1], 'idx'] none of the labels exists, and therefore it raises).
iloc is position based: negative indices here mean 'start counting from the end', and therefore the shown result is exactly as expected.
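
To make the label/position distinction concrete, here is a minimal sketch. The frame below is hypothetical (not the one from the question): its integer labels are deliberately shifted so that -1 is a real label, which makes the two lookups return different rows.

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the row positions.
df = pd.DataFrame({"idx": range(5)}, index=[-2, -1, 0, 1, 2])

# .loc looks up the *label* -1, which is the second row here.
by_label = df["idx"].loc[-1]

# .iloc looks up the *position* -1, which is the last row.
by_position = df["idx"].iloc[-1]

print(by_label, by_position)
```

With the default 0..n-1 index used in the question, the label -1 simply does not exist, which is why loc cannot find it.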

> (also note that somehow the int became float...)

This is currently a limitation of pandas: missing values can only be represented in float columns, not integer ones, so the column is upcast to float once a NaN is introduced. See http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
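
A small sketch of that upcast, using reindex on a hypothetical Series to request a label that does not exist. The variable names are illustrative only.

```python
import pandas as pd

# An integer Series with the default 0..2 index.
s = pd.Series([0, 1, 2])

# Asking for the missing label -1 introduces a NaN,
# so the values are upcast from integer to float.
r = s.reindex([0, 1, -1])

print(s.dtype, r.dtype)
print(r)
```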

Comment From: jreback

pls read the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label

.loc is for label-based indexing. The negative indices are not found among the labels, so they are reindexed to NaN. An integer index is, by definition, indexed by label with .loc.

.iloc always indexes by position.

Comment From: kingjr

Thanks. Is there a reason for not allowing label based indexing to interpret negative integers?

Comment From: jorisvandenbossche

> Is there a reason for not allowing label based indexing to interpret negative integers?

Because then it would not be label based anymore, but position based

Comment From: kingjr

I understand. So to modify the last two lines of a particular column, what would you recommend? Something like this?

df.loc[df.index[-2:], 'column'] = range(2)

Comment From: jorisvandenbossche

yes, that is fine
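
For reference, a minimal sketch of that pattern on a hypothetical frame with string labels, next to a purely positional variant that is an alternative of my own and not something suggested in the thread. The frame, the column name 'column', and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frame with non-integer labels.
df = pd.DataFrame({"column": [10, 20, 30, 40]}, index=list("abcd"))

# Translate the last two positions into labels, then assign by label.
df.loc[df.index[-2:], "column"] = list(range(2))

# Alternative: stay positional throughout by looking up the column position.
df2 = pd.DataFrame({"column": [10, 20, 30, 40]}, index=list("abcd"))
col_pos = df2.columns.get_loc("column")
df2.iloc[-2:, col_pos] = [0, 1]

print(df)
print(df2)
```

One caveat: the label-based form assumes the index labels are unique. If the last two labels also appear on earlier rows, loc selects those rows as well, whereas the iloc form always touches exactly the last two rows.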