Code Sample, a copy-pastable example if possible
In [37]: d
Out[37]: array(['2017-01-01', 'NaT', '-5877611-06-21'], dtype='datetime64[D]')
In [38]: pd.Series(d)
---------------------------------------------------------------------------
OutOfBoundsDatetime Traceback (most recent call last)
<ipython-input-38-10a0fd2c2cf9> in <module>()
----> 1 pd.Series(d)
/usr/anaconda/envs/pandas021/lib/python2.7/site-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
264 raise_cast_failure=True)
265
--> 266 data = SingleBlockManager(data, index, fastpath=True)
267
268 generic.NDFrame.__init__(self, data, fastpath=True)
/usr/anaconda/envs/pandas021/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
4395 if not isinstance(block, Block):
4396 block = make_block(block, placement=slice(0, len(axis)), ndim=1,
-> 4397 fastpath=True)
4398
4399 self.blocks = [block]
/usr/anaconda/envs/pandas021/lib/python2.7/site-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
2950 placement=placement, dtype=dtype)
2951
-> 2952 return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
2953
2954 # TODO: flexible with index=None and/or items=None
/usr/anaconda/envs/pandas021/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, placement, fastpath, **kwargs)
2463 def __init__(self, values, placement, fastpath=False, **kwargs):
2464 if values.dtype != _NS_DTYPE:
-> 2465 values = tslib.cast_to_nanoseconds(values)
2466
2467 super(DatetimeBlock, self).__init__(values, fastpath=True,
pandas/_libs/tslib.pyx in pandas._libs.tslib.cast_to_nanoseconds()
pandas/_libs/tslib.pyx in pandas._libs.tslib._check_dts_bounds()
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: -5877611-06-21 00:00:00
Problem description
Creating a Series from a datetime64[D] array raises OutOfBoundsDatetime when the array contains a date outside the range pandas can represent.
I'm not sure whether this is a bug or desired behavior, and I understand that pandas Timestamp has bounds. But I'm not sure what the best approach is when I can't control the input. I could intercept the data in-flight and squash the values down between pandas.Timestamp.max and .min, but I'm hoping there is a better way.
Thanks.
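A minimal sketch of the "squash the values" idea, assuming the input arrives as a day-resolution NumPy array. The dates "1000-01-01" and "3000-01-01" are stand-ins for out-of-range values, and the names lo, hi and clipped are illustrative only. The bounds are the first and last whole days that fit inside pandas.Timestamp.min and pandas.Timestamp.max.

```python
import numpy as np
import pandas as pd

# Stand-in data: two in-range-or-missing values and two out-of-range dates.
d = np.array(
    ["2017-01-01", "NaT", "1000-01-01", "3000-01-01"],
    dtype="datetime64[D]",
)

# Whole-day bounds that lie just inside pd.Timestamp.min and pd.Timestamp.max.
lo = np.datetime64("1677-09-22")
hi = np.datetime64("2262-04-11")

clipped = d.copy()
# Comparisons against NaT evaluate to False, so NaT entries are left alone.
clipped[d < lo] = lo
clipped[d > hi] = hi

# Every remaining value now fits in a nanosecond-resolution timestamp.
s = pd.Series(clipped)
print(s)
```

Note that clipping silently rewrites the offending dates, so it only makes sense when the out-of-range values are sentinels rather than real data.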
Comment From: TomAugspurger
I'd say "expected behavior" rather than "desired" :) Pandas uses nanosecond resolution for all datetimes.
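As a rough illustration of where the bounds come from: a nanosecond timestamp is a signed 64-bit count of nanoseconds since 1970-01-01, so the representable range ends a little under 300 years either side of the epoch. The variable name upper is illustrative only.

```python
import numpy as np
import pandas as pd

# Largest int64 value, interpreted as nanoseconds since 1970-01-01.
upper = np.datetime64(np.iinfo(np.int64).max, "ns")
print(upper)  # 2262-04-11T23:47:16.854775807

# pandas exposes the same limits on the Timestamp class.
print(pd.Timestamp.min)
print(pd.Timestamp.max)
```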
"I can possibly intercept the data in-flight and squash the values down between pandas.Timestamp.max and .min"
That sounds like one possible option. Alternatively, pd.to_datetime provides the errors='coerce' option, which will replace invalid values with NaT.
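A short sketch of the errors='coerce' route, using "3000-01-01" as a stand-in for the out-of-range date. On pandas versions that store every datetime at nanosecond resolution, the out-of-range entry should come back as NaT. Newer versions that support coarser resolutions may instead keep the value at a coarser unit, so it is worth checking the result's dtype.

```python
import numpy as np
import pandas as pd

# Stand-in data: a valid date, a missing value, and an out-of-range date.
d = np.array(["2017-01-01", "NaT", "3000-01-01"], dtype="datetime64[D]")

# errors="coerce" asks pandas to substitute NaT instead of raising.
converted = pd.to_datetime(d, errors="coerce")

print(converted)
print(converted.dtype)  # the resolution depends on the pandas version
```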