Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
# ISO:
In [1]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%m-%d %H:%M:%S%z')
Out[1]: DatetimeIndex(['2020-01-01 01:00:00-01:00', '2020-01-01 02:00:00-01:00'], dtype='datetime64[ns, UTC-01:00]', freq=None)
# non-ISO:
In [2]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%d-%m %H:%M:%S%z')
Out[2]: Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
Issue Description
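This is inconsistent: the ISO format converts the naive datetime and returns a timezone-aware DatetimeIndex, while the non-ISO format leaves it as-is and returns an Index of dtype object. I think I prefer the latter. datetime(2020, 1, 1, 3, 0) is timezone-naive, so it seems right to do the same thing we'd do if there were mixed offsets, i.e. return an Index of dtype object.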
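For reference, the `02:00:00-01:00` in `Out[1]` is what you get if the naive datetime is treated as UTC and then shifted to the offset of the string element. The sketch below reproduces that arithmetic with the standard library only; the "treated as UTC" step is an assumption inferred from the output, not something confirmed from the pandas source.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the naive datetime is interpreted as UTC before conversion.
naive = datetime(2020, 1, 1, 3, 0)

# The fixed UTC-01:00 offset carried by the string element.
target = timezone(timedelta(hours=-1))

as_utc = naive.replace(tzinfo=timezone.utc)
converted = as_utc.astimezone(target)

print(converted.isoformat(sep=" "))  # 2020-01-01 02:00:00-01:00
```

So the wall-clock time of the naive element silently changes from 03:00 to 02:00 in the ISO case, whereas the non-ISO case keeps it at 03:00.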
As far as I can tell, the only place this is tested is
https://github.com/pandas-dev/pandas/blob/1d5f05c33c613508727ee7b971ad56723d474446/pandas/core/tools/datetimes.py#L993-L1000
Timezone-naive and timezone-aware values are not the same format, so if we now expect all elements to be parsed with the same format, I don't think a naive datetime should be silently accepted alongside a `%z` format.
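A rough illustration of the "not the same format" point, using only the standard library rather than pandas: a naive datetime rendered as text has no offset, so it cannot match a format that ends in `%z`. The helper `is_aware` and the constant `FMT` are names made up for this sketch.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S%z"  # same ISO format as in the example above


def is_aware(dt: datetime) -> bool:
    """Return True if dt carries a usable UTC offset."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


aware = datetime.strptime("2020-01-01 01:00:00-01:00", FMT)
naive = datetime(2020, 1, 1, 3, 0)

# Render the naive datetime without an offset and try to parse it with FMT.
try:
    datetime.strptime(naive.strftime("%Y-%m-%d %H:%M:%S"), FMT)
    naive_matches_format = True
except ValueError:
    naive_matches_format = False

print(is_aware(aware), is_aware(naive), naive_matches_format)  # True False False
```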
Expected Behavior
Same output in both cases. I think this should be:
In [1]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%m-%d %H:%M:%S%z') # ISO: dt.datetime is left as-is, object Index returned
Out[1]: Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
In [2]: pd.to_datetime(["2020-01-01 01:00:00-01:00", datetime(2020, 1, 1, 3, 0)], format='%Y-%d-%m %H:%M:%S%z') # non-ISO: dt.datetime is left as-is, object Index returned
Out[2]: Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
Installed Versions
Comment From: jorisvandenbossche
Resolved by https://github.com/pandas-dev/pandas/pull/50242