Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from datetime import datetime
import pandas as pd
# Create a datetime object (midnight)
dt = datetime(2015, 1, 1, 0)
# Create a pd.Timedelta object (1 hour)
delta_1 = pd.Timedelta("1H")
# Create a pd.Timedelta object from a DateTimeIndex frequency (1 hour)
delta_2 = pd.to_timedelta(pd.date_range('2015-01-01 00', '2015-01-01 01', freq="1H").freq)
# First check that the pd.Timedelta objects are equal
assert(delta_1 == delta_2) # True
# Then check whether adding each pd.Timedelta to the datetime object moves the datetime by one hour: this does not behave correctly since Pandas 2.0
assert(dt + delta_1 == dt + delta_2) # False -> AssertionError
# Left-hand side looks right: the hour was added
dt + delta_1 # datetime.datetime(2015, 1, 1, 1, 0)
# Right-hand side appears to be wrong: the hour was not added
dt + delta_2 # datetime.datetime(2015, 1, 1, 0, 0)
Issue Description
It seems that the pd.Timedelta object produced by calling pd.to_timedelta on a pd.DatetimeIndex frequency can no longer be added correctly to a datetime.datetime object since pandas>1.5.3: the addition runs without error but leaves the datetime unchanged.
It's not evident to me why the two pd.Timedelta objects, which compare equal, behave differently.
Expected Behavior
Running the above example in pandas==1.5.3 (and also 1.4.4) works as expected.
Installed Versions
Comment From: jbrockmendel
I have an explanation but not a solution.
delta_2 here has unit="s". Timedeltas with unit="s" can go outside the range supported by datetime.timedelta, so they are initialized by passing seconds=0 to the parent class (datetime.timedelta) constructor. As a result, the pytimedelta struct (which pandas ignores) looks like timedelta(0). When you do dt + delta_2, that calls the pydatetime's __add__ method, which treats delta_2 as a regular pytimedelta, not a pd.Timedelta. So it uses the 0, giving the incorrect result.
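The two views described above can be compared directly. This is only a sketch: pd.offsets.Hour(1) stands in for the .freq of the DatetimeIndex in the original example, the getattr fallback is there because (as far as I know) the unit attribute only exists on pandas 2.0 and later, and calling datetime.timedelta.total_seconds on the object is just one way to read the parent-class fields. What the stdlib view prints will depend on the pandas version installed.

```python
import datetime

import pandas as pd

delta_1 = pd.Timedelta(hours=1)
# Built from an offset object, standing in for the index's `.freq`.
delta_2 = pd.to_timedelta(pd.offsets.Hour(1))

# Resolution of each Timedelta ("n/a" on pandas versions without `unit`).
print(getattr(delta_1, "unit", "n/a"), getattr(delta_2, "unit", "n/a"))

# pandas-level view: the overridden total_seconds() reports one hour for both.
pandas_seconds = (delta_1.total_seconds(), delta_2.total_seconds())
print(pandas_seconds)

# stdlib-level view: call the base-class method directly, so only the fields
# stored by the datetime.timedelta constructor are consulted.
stdlib_seconds = (
    datetime.timedelta.total_seconds(delta_1),
    datetime.timedelta.total_seconds(delta_2),
)
print(stdlib_seconds)
```

If the two tuples differ for delta_2, the stdlib is seeing a different duration than pandas, which is the mismatch that dt + delta_2 runs into.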
Reversing the operation and doing delta_2 + dt gives the correct result, as does pd.Timestamp(dt) + delta_2.
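A minimal sketch of those two workarounds, assuming the same dt as in the report and using pd.offsets.Hour(1) as a stand-in for the index frequency:

```python
from datetime import datetime

import pandas as pd

dt = datetime(2015, 1, 1, 0)
expected = datetime(2015, 1, 1, 1)
delta_2 = pd.to_timedelta(pd.offsets.Hour(1))

# Put the pandas object on the left so that pd.Timedelta.__add__ handles it.
reversed_sum = delta_2 + dt

# Or promote the datetime to a pd.Timestamp before adding.
timestamp_sum = pd.Timestamp(dt) + delta_2

print(reversed_sum, timestamp_sum)
```

Note that both results are pd.Timestamp objects rather than plain datetime.datetime; call .to_pydatetime() on them if downstream code needs the stdlib type.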
A solution here would need to be in the stdlib datetime.__add__ returning NotImplemented.
Comment From: miccoli
A solution here would need to be in the stdlib datetime.__add__ returning NotImplemented
No way! pd.Timedelta claims to be a datetime.timedelta subclass:
>>> import datetime
>>> import pandas as pd
>>> issubclass(pd.Timedelta, datetime.timedelta)
True
The Liskov substitution principle requires that datetime.__add__(self, other) behaves the same way whenever other is an instance of a datetime.timedelta subclass.
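One way to probe this substitutability is sketched below. The function name adds_like_stdlib and the probe date are made up for illustration; the check compares adding the object as-is against adding its converted datetime.timedelta counterpart.

```python
import datetime

import pandas as pd


def adds_like_stdlib(delta, probe=datetime.datetime(2015, 1, 1)):
    """Return True if `probe + delta` matches the sum using a plain timedelta.

    Hypothetical helper: a pd.Timedelta is converted with to_pytimedelta()
    to get the reference value; anything else is used as its own reference.
    """
    if isinstance(delta, pd.Timedelta):
        reference = delta.to_pytimedelta()
    else:
        reference = delta
    return probe + delta == probe + reference


print(adds_like_stdlib(datetime.timedelta(hours=1)))
print(adds_like_stdlib(pd.to_timedelta(pd.offsets.Hour(1))))
```

A False on the second line would mean the pd.Timedelta cannot stand in for a datetime.timedelta in this operation on the installed pandas version.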
By the way, why is pd.Timedelta a datetime.timedelta subclass? They are mostly incompatible. The only way forward seems to me to completely drop the claim that pd.Timedelta should be a datetime.timedelta subclass.
Comment From: GiorgioBalestrieri
Just caught a bug caused by this while trying to upgrade to pandas 2.0. This is both quite concerning (as mentioned above, pd.Timedelta should behave like dt.timedelta, since it claims to be a subclass) and potentially a blocker, because we'll have to go and ensure that every single pd.Timedelta is converted to dt.timedelta before doing any arithmetic on it.
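For reference, that conversion could look roughly like the sketch below. as_pytimedelta is a hypothetical helper name, not a pandas function; it relies on pd.Timedelta.to_pytimedelta(), which drops any nanosecond component.

```python
import datetime

import pandas as pd


def as_pytimedelta(delta):
    """Return a plain datetime.timedelta (hypothetical helper)."""
    if isinstance(delta, pd.Timedelta):
        # Nanosecond precision is lost here; the stdlib type stops at microseconds.
        return delta.to_pytimedelta()
    return delta


dt = datetime.datetime(2015, 1, 1, 0)
delta_2 = pd.to_timedelta(pd.offsets.Hour(1))

# Arithmetic now happens entirely between stdlib objects.
shifted = dt + as_pytimedelta(delta_2)
print(shifted)
```

Every call site that adds a pd.Timedelta to a datetime.datetime would have to go through something like this, which is what makes it a blocker.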
I really don't want to sound like I'm demanding a fix for a bug in an open source project, but I think solving this should be a priority.