Problem description

The following code raises an exception when run:

import numpy
import pandas

array = numpy.array(['2263'], dtype='<M8[Y]')
pandas.Series(array)

This raises:

Traceback (most recent call last):
  File "pandas_bug.py", line 6, in <module>
    pandas.Series(array)
  File "/home/david/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/series.py", line 250, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "/home/david/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/internals.py", line 4117, in __init__
    fastpath=True)
  File "/home/david/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/internals.py", line 2719, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "/home/david/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/internals.py", line 2221, in __init__
    values = tslib.cast_to_nanoseconds(values)
  File "pandas/_libs/tslib.pyx", line 4053, in pandas._libs.tslib.cast_to_nanoseconds (pandas/_libs/tslib.c:69042)
  File "pandas/_libs/tslib.pyx", line 1774, in pandas._libs.tslib._check_dts_bounds (pandas/_libs/tslib.c:32752)
pandas._libs.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2263-01-01 00:00:00

This is presumably because pandas changes the dtype of the original array, discarding the unit (here, years) of the datetime64 dtype. The following code demonstrates this:

import numpy
import pandas

array = numpy.array([], dtype='<M8[Y]')
print(array.dtype)
print(pandas.Series(array).dtype)

This prints:

datetime64[Y]
datetime64[ns]

The year unit of the datetime64 is lost and replaced with nanoseconds, but 2263-01-01 is more nanoseconds past the epoch than can fit in a signed 64-bit integer, so the conversion overflows.
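The arithmetic behind that overflow can be checked with a small sketch. It uses only numpy and plain Python integers; the variable names are illustrative and not part of pandas:

```python
import numpy as np

# Largest value a signed 64-bit integer can hold
INT64_MAX = np.iinfo(np.int64).max

# Seconds from the epoch to 2263-01-01; second resolution fits comfortably
seconds = int(np.datetime64('2263-01-01', 's').astype(np.int64))

# Scale to nanoseconds with a Python int (arbitrary precision, so no overflow here)
nanoseconds_needed = seconds * 10**9
print(nanoseconds_needed > INT64_MAX)

# The latest instant a nanosecond-resolution int64 timestamp can represent
last = np.datetime64(INT64_MAX, 'ns')
print(last)
```

The first line printed is True, and the latest representable instant falls in April 2262, which is why a year of 2263 cannot be cast to nanoseconds.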

Expected Output

I would expect the resulting series to retain the dtype of the original array, which would avoid this problem entirely.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-43-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.1
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: 6.0.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: TomAugspurger

Pandas only supports nanosecond-precision with Timestamps, see here.

I think the best we can do is to raise a better error message here.
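For values outside the nanosecond bounds, one possible workaround is to represent them as periods rather than timestamps. This is a minimal sketch, not a fix for the issue itself. It assumes a pandas version that accepts 'Y' as a period frequency alias (older releases may need 'A' instead):

```python
import numpy as np
import pandas as pd

array = np.array(['2263'], dtype='<M8[Y]')

# Convert each year to its string label, e.g. '2263'
labels = array.astype(str).tolist()

# Build yearly periods; these are not limited by the nanosecond timestamp range
periods = pd.PeriodIndex(labels, freq='Y')
series = pd.Series(periods)

print(series.dtype)
print(periods[0].year)
```

The resulting series holds yearly periods, so the year 2263 is stored without any conversion to nanoseconds.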

Comment From: DRMacIver

Ah, I see. Apologies, I clearly should have looked for documentation of this behaviour.

Comment From: jreback

closing. this is clearly documented.