Code Sample
import pandas as pd

df = pd.DataFrame({"date": ['10000101', '20180220']})
# Timestamp limitations correctly raise exception
pd.to_datetime(df.date, errors='raise')
...
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1000-01-01 00:00:00
# Timestamp limitations correctly coerce to NaT
pd.to_datetime(df.date, errors='coerce')
...
0 NaT
1 2018-02-20
Name: date, dtype: datetime64[ns]
# errors='ignore' incorrectly results in a datetime-like object
pd.to_datetime(df.date, errors='ignore', format="%Y%m%d")
0 1000-01-01 00:00:00
1 2018-02-20 00:00:00
Name: date, dtype: object
Problem description
I believe that when errors='ignore' and a datum violates the timestamp limitations, the result should be the original input, not a datetime-like object.
Expected Output
pd.to_datetime(df.date, errors='ignore', format="%Y%m%d")
0 '10000101'
1 2018-02-20 00:00:00
Name: date, dtype: object
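The expected behavior can be sketched in plain Python, independent of pandas internals. This is only an illustration of the semantics requested above, not how pandas implements `to_datetime`. The helper name `parse_or_keep` and the constants `NS_MIN` and `NS_MAX` are hypothetical, and the bounds assume that `datetime64[ns]` stores signed 64-bit nanoseconds since the Unix epoch, truncated here to microsecond precision.

```python
from datetime import datetime, timedelta

# Assumption: datetime64[ns] holds signed 64-bit nanoseconds since the Unix
# epoch, so the representable range is roughly 1677-09-21 to 2262-04-11.
# The bounds are truncated to microseconds, so they are slightly conservative.
_EPOCH = datetime(1970, 1, 1)
_SPAN = timedelta(microseconds=(2**63 - 1) // 1000)
NS_MIN = _EPOCH - _SPAN
NS_MAX = _EPOCH + _SPAN


def parse_or_keep(values, fmt):
    """Hypothetical helper: parse each value, keeping the original on failure.

    A value is left untouched if it does not match `fmt` or if the parsed
    datetime falls outside the nanosecond timestamp range.
    """
    out = []
    for value in values:
        try:
            parsed = datetime.strptime(value, fmt)
        except (TypeError, ValueError):
            out.append(value)  # unparseable: return the input unchanged
            continue
        # Out-of-bounds dates are also returned as the original input.
        out.append(parsed if NS_MIN <= parsed <= NS_MAX else value)
    return out


result = parse_or_keep(["10000101", "20180220"], "%Y%m%d")
print(result)
```

With the two values from the code sample, the first element stays the string '10000101' and the second becomes a datetime for 2018-02-20, which matches the expected output above.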
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: 2.9.2
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
feather: None
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Comment From: jreback
This is a duplicate of #14487. @connercowling, would you like to do a PR for this?