pd.to_datetime(['9000-01-01'])
shows
<ipython-input-1-1af851f16643>:1: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
and then also raises
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9000-01-01, at position 0
However, the format can be inferred (%Y-%m-%d); the value just can't be represented as a nanosecond timestamp.
The warning should probably not be shown in this case.
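To illustrate the distinction, here is a minimal sketch using only the standard library rather than pandas internals. It assumes the nanosecond timestamp is a signed 64-bit count of nanoseconds since the Unix epoch; the names INT64_MAX and ns_upper_bound are mine, for illustration only.

```python
from datetime import datetime, timedelta

# The format itself is fine: the standard library parses the string with %Y-%m-%d.
parsed = datetime.strptime("9000-01-01", "%Y-%m-%d")

# Assumption: timestamps are signed 64-bit nanoseconds since 1970-01-01.
# datetime only has microsecond precision, so the last three digits are dropped.
INT64_MAX = 2**63 - 1
ns_upper_bound = datetime(1970, 1, 1) + timedelta(microseconds=INT64_MAX // 1000)

print(parsed)                   # the parsed date
print(ns_upper_bound)           # latest instant that fits under the assumption
print(parsed > ns_upper_bound)  # whether the parsed date is out of bounds
```

Under that assumption the upper bound falls in April 2262, so 9000-01-01 is well outside it even though the string matches %Y-%m-%d.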
I'll make a PR; it's an easy change.
EDIT: it's not quite as trivial as I initially thought. Still, it's not a blocker for 2.0 by any stretch of the imagination.
Comment From: MarcoGorelli
This isn't really worth it.
Removing the check of whether the guessed format parses barely improves performance:
upstream/main:
In [1]: %%timeit
...: to_datetime(['2020-01-01'])
...:
...:
175 µs ± 3.33 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
With the check removed:
In [1]: %%timeit
...: to_datetime(['2020-01-01'])
...:
...:
161 µs ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
And note that the check only needs doing once per to_datetime call, so with even a handful of elements the time difference becomes negligible.
Closing then; this probably isn't worth pursuing for now. May revisit later.