Pandas BUG: .iloc does not accept boolean series

Code Sample, a copy-pastable example if possible

>>> s = pd.Series([3,4,5])
>>> b = s > 3
>>> b  # a boolean series
0    False
1     True
2     True
dtype: bool

Now, various indexing methods with b:

>>> s.loc[b]  # ok
>>> s[b]  # ok
>>> s.iloc[b.values]  # ok
>>> s.iloc[b]  # not ok
NotImplementedError: iLocation based boolean indexing on an integer type is not available

Problem description

As the indexes of s and b are the same and b.values are booleans, .iloc should work here (unless I'm missing something non-obvious). That is, it should align the indexes as per the usual pandas rules, and then do a boolean filter.

I did some poking around in the traceback frames, and the entry point is pandas.core.indexing._iLocIndexer._getitem_axis. The problem appears to be that the pandas.core.indexing._iLocIndexer._has_valid_type(key, axis) check is overzealous. If we call pandas.core.indexing._iLocIndexer._getbool_axis(key, axis=axis) without doing the check in ._has_valid_type, everything comes out fine.

It seems to me that pandas.core.indexing._iLocIndexer._has_valid_type is too strict with boolean series. Is this correct?

Expected Output

s filtered by b.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: f20721940e7be5d741d158fa3e9272b6cbbaa3b5
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.3.0
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.5
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: jreback

This is by definition not allowed. .iloc is purely positional, so it doesn't make sense to align with a passed Series (which is what all indexing operations do).
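
To make the alignment point concrete, here is a minimal sketch. The string labels and the `mask` variable are made up for illustration and are not from the report above. The boolean Series carries the same labels as `s` but in a different order, so label-aligned and purely positional filtering disagree:

```python
import pandas as pd

# Illustrative data: same labels in s and mask, but in a different order.
s = pd.Series([3, 4, 5], index=["a", "b", "c"])
mask = pd.Series([True, False, False], index=["c", "b", "a"])

# .loc aligns the mask to s.index first, so the True attached to "c" wins.
by_label = s.loc[mask]

# .iloc with the raw values ignores the mask's index: position 0 is kept.
by_position = s.iloc[mask.values]

print(by_label)     # the row labelled "c"
print(by_position)  # the row labelled "a"
```

When the two indexes are identical, as in the original example, both approaches select the same rows, which is why the difference is easy to overlook.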

Comment From: jreback

http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-position indicates only a boolean array is allowed.

Comment From: nick-ulle

If the model is that .iloc is purely positional, then why can't it do a purely positional selection of elements with a Boolean Series of the same length?

Is there a technical limitation or a deeper design decision here? Is it to prevent users who don't understand the model for .iloc from using it with a Boolean Series and thinking it matched the indexes?
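
For reference, the purely positional behaviour asked about here can be sketched in user code. `iloc_by_mask` is a hypothetical helper name, not part of pandas; it assumes the mask should be treated as a plain boolean array of the same length, with its index discarded:

```python
import numpy as np
import pandas as pd


def iloc_by_mask(obj, mask):
    """Hypothetical helper: positional boolean selection.

    The mask's index (if any) is discarded; only its values and length matter.
    """
    values = np.asarray(mask, dtype=bool)
    if len(values) != len(obj):
        raise IndexError("mask length must match the length of the object")
    return obj.iloc[values]


s = pd.Series([3, 4, 5])
b = s > 3  # a boolean Series, as in the original report
selected = iloc_by_mask(s, b)
print(selected)
```

This is equivalent to writing `s.iloc[b.values]` with an explicit length check added.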
