Code Sample, a copy-pastable example if possible
>>> pr = pd.period_range(start='2017-01-01 00:00:00', periods=999999, freq='T')
>>> df = pd.DataFrame({"value": range(999999)}, index=pr)
>>> df.loc[:"2018-01-01 00:00"]
value
2017-01-01 00:00 0
2017-01-01 00:01 1
2017-01-01 00:02 2
...
2017-12-31 23:59 525599
2018-01-01 00:00 525600
Everything will be fine if the value of periods
parameter is < 1000000.
So when I hit it with 1000000:
>>> pr = pd.period_range(start='2017-01-01 00:00:00', periods=1000000, freq='T')
>>> df = pd.DataFrame({"value": range(1000000)}, index=pr)
>>> df.loc[:"2018-01-01 00:00"]
Error occurs:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexing.py", line 1373, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexing.py", line 1581, in _getitem_axis
return self._get_slice_axis(key, axis=axis)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexing.py", line 1406, in _get_slice_axis
slice_obj.step, kind=self.name)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3454, in slice_indexer
kind=kind)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3661, in slice_locs
end_slice = self.get_slice_bound(end, 'right', kind)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3585, in get_slice_bound
slc = self._get_loc_only_exact_matches(label)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3554, in _get_loc_only_exact_matches
return self.get_loc(key)
File "/Users/violetvivirand/.local/share/virtualenvs/rewind-imputation-UYiSILLm/lib/python3.6/site-packages/pandas/core/indexes/period.py", line 812, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 125, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 365, in pandas._libs.index._bin_search
File "pandas/_libs/period.pyx", line 717, in pandas._libs.period._Period.__richcmp__
TypeError: Cannot compare type 'Period' with type 'int64'
Problem description
Don't know what causes the error when the amount of PeriodIndex is more than 1000000. It should work without any problem, isn't it 😕❓
Expected Output
It should returns the data from 2017-01-01 00:00 to 2018-01-01 00:00:
value
2017-01-01 00:00 0
2017-01-01 00:01 1
2017-01-01 00:02 2
...
2017-12-31 23:59 525599
2018-01-01 00:00 525600
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: zh_TW.UTF-8
LOCALE: zh_TW.UTF-8
pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Comment From: jreback
xref #17755, cc @Licht-T
You are doing partial string indexing on a Period. This has a number of open issues (meaning its not fully supported).
duplicated by https://github.com/pandas-dev/pandas/issues/13429 / #15920