Pandas Wrong slicing behavior for DataFrame loc

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 21, 51, 61], [2, 22, 52, 62], [3, 23, 53, 63]])
>>> df
   0   1   2   3
0  1  21  51  61
1  2  22  52  62
2  3  23  53  63
>>> df[0:0]
Empty DataFrame
Columns: [0, 1, 2, 3]
Index: []
>>> df.loc[0:0]
   0   1   2   3
0  1  21  51  61

For loc, slicing is incorrect

The slicing behavior of loc is inconsistent with Python list slicing. df.loc[0:0] should have returned an empty DataFrame, but it returns a row.

Expected Output

As with the Python list shown below, pd.DataFrame.loc slicing should produce an empty result.

E.g. the following code yields an empty list for [0:0]:

>>> l = [0,1,2]
>>> l[0:0]
[]

Output of `pd.show_versions()`

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.1.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 25.2.0
Cython: None
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.2
pytz: 2016.3
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: 2.2.0-b1
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

Comment From: jreback

http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label

label slicing always includes the endpoint

positional (iloc) slicing matches python semantics
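
A minimal sketch of that difference, rebuilding the frame from the report (the construction call and the variable names are assumptions for illustration, not taken from the thread):

```python
import pandas as pd

# Rebuild the 3x4 frame shown in the report; default integer labels 0..2.
df = pd.DataFrame([[1, 21, 51, 61], [2, 22, 52, 62], [3, 23, 53, 63]])

# Label-based slice: both endpoints are labels and both are included,
# so 0:0 selects the single row labelled 0.
label_slice = df.loc[0:0]

# Position-based slice: half-open like a Python list, so 0:0 is empty.
position_slice = df.iloc[0:0]

# Plain [] with an integer slice is positional here, as in the report.
plain_slice = df[0:0]

print(len(label_slice))     # 1
print(len(position_slice))  # 0
print(len(plain_slice))     # 0
```

The integer labels happen to coincide with the positions here, which is why the two results are easy to confuse.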

Comment From: Ajeet-Ganga

Isn't the example below producing an empty slice? Shouldn't loc be consistent with this behavior?

>>> l = [0,1,2]
>>> l[0:0]
[]

Comment From: jreback

read the docs

Comment From: Ajeet-Ganga

What would we lose by being consistent? Just because we put something in the documentation doesn't make it right.

Performance over consistent behavior is not even a contest.

Comment From: jreback

you are missing the point

.iloc is positional indexing; .loc is label indexing

these are two separate and distinct ways of selecting data. numpy and python only have positional concepts and thus have only 1 way of indexing

pandas can also index by labels; since these can be for example strings (or datetimes or integers) you now have different semantics

again the docs are very complete on these concepts

and if you choose not to use label indexing, then just use .iloc and it will feel like python/numpy
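
To make the point about non-integer labels concrete, here is a small sketch with string labels (the frame, its column name, and the variable names are hypothetical, chosen only for illustration). With labels such as "a" and "b" there is no natural "one past the end" value, so a label slice names the last row it wants directly:

```python
import pandas as pd

# Hypothetical frame with string labels instead of integers.
frame = pd.DataFrame({"value": [10, 20, 30]}, index=["a", "b", "c"])

# Label slice: "b" is included in the result.
by_label = frame.loc["a":"b"]

# Equivalent positional slice: stop is exclusive, as in Python lists.
by_position = frame.iloc[0:2]

print(by_label.index.tolist())       # ['a', 'b']
print(by_label.equals(by_position))  # True
```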
