- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
import pandas as pd
import datetime, pytz
# t is a date after 2262 that should be coerced into NaT
t = datetime.datetime(3000, 1, 1, 0, 0, 0, 0, pytz.UTC)
# lists with many such dates
l50 = [t] * 50
l51 = [t] * 51
# pd.to_datetime crashes if the list is longer than 50
# this is fine
print(pd.to_datetime(l50, utc=True, errors='coerce'))
# this crashes
print(pd.to_datetime(l51, utc=True, errors='coerce'))
Issue Description
Background (irrelevant for reproducing the bug, but good to know):
A Google BigQuery query gives me a DataFrame with a column of datetime values with year=9999,
and I store the result as a Parquet file. When I read it back, read_parquet
crashes.
I traced it down to the following problem in pd.to_datetime.
The issue: if pd.to_datetime needs to convert more than 50 datetime values with tzinfo set, and those values are after year 2262 (the maximum date for pd.Timestamp) and should be coerced into NaT, then pd.to_datetime raises an exception instead. It works fine if the list has 50 items or fewer.
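For context on the year 2262 limit mentioned above, here is a minimal stdlib-only sketch. It assumes timestamps are stored as signed 64-bit counts of nanoseconds since the Unix epoch, and uses datetime.timezone.utc in place of pytz.UTC so it runs without extra packages. It is an illustration of the arithmetic, not pandas code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a nanosecond-resolution timestamp is a signed 64-bit integer
# counting nanoseconds since 1970-01-01 UTC.
INT64_MAX = 2**63 - 1

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# datetime only has microsecond resolution, so the last three
# nanosecond digits are dropped by the integer division.
max_ns_datetime = epoch + timedelta(microseconds=INT64_MAX // 1000)

# The value from the reproducible example.
t = datetime(3000, 1, 1, tzinfo=timezone.utc)

print(max_ns_datetime)      # 2262-04-11 23:47:16.854775+00:00
print(t > max_ns_datetime)  # True: t cannot be represented in nanoseconds
```

Any value past that bound, such as year 3000 or year 9999, is out of range for a nanosecond timestamp, which is why errors='coerce' is expected to turn it into NaT.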
Expected Behavior
The expected behavior is that pd.to_datetime returns a list of NaT values for sequences longer than 50 items as well.
Installed Versions
Comment From: jreback
hmm, try this on master; not sure if we backported this patch
Comment From: debnathshoham
this is still failing on master
Comment From: larsr
The exception is raised on line 467 in array_to_datetime in tslib.pyx. It raises the exception because (four lines above) the value of utc_convert is False.
The value is False because the call to objects_to_datetime64ns on line 2066 does not fill in a value for the utc parameter (which has default value False).
So when objects_to_datetime later calls array_to_datetime it uses utc=False.
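The failure mode described above, a keyword argument that is accepted by an outer function but not forwarded to the inner one, can be sketched with a toy example. All function names and the error message below are hypothetical stand-ins and are not the pandas source. The only point is that the inner call silently falls back to its default utc=False.

```python
def inner_convert(values, utc=False):
    # Hypothetical stand-in for the innermost conversion routine:
    # it refuses tz-aware, out-of-range input unless utc is True.
    if not utc:
        raise ValueError("cannot coerce tz-aware value unless utc=True")
    return ["NaT"] * len(values)


def outer_convert_buggy(values, utc=False):
    # utc is accepted here but never passed on, so inner_convert
    # always runs with its default utc=False.
    return inner_convert(values)


def outer_convert_fixed(values, utc=False):
    # Forwarding the argument keeps the caller's choice intact.
    return inner_convert(values, utc=utc)


values = ["3000-01-01T00:00:00+00:00"] * 3

try:
    outer_convert_buggy(values, utc=True)
    buggy_raised = False
except ValueError:
    buggy_raised = True

fixed_result = outer_convert_fixed(values, utc=True)
```

In this sketch the buggy wrapper raises even though the caller passed utc=True, while the fixed wrapper returns the coerced values.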
Comment From: thomasqueirozb
take
Comment From: srotondo
I believe that this is a duplicated issue that was fixed during the closing of #45319, so this issue can be closed, right?
Comment From: debnathshoham
this works fine on master now
Comment From: phofl
agreed, this was fixed and tested in the other issue