Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd
In [2]: pd.Timestamp("2016-01-01", tz="Europe/Berlin") - pd.Timestamp("now", tz="UTC")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-691a8df26ecd> in <module>()
----> 1 pd.Timestamp("2016-01-01", tz="Europe/Berlin") - pd.Timestamp("now", tz="UTC")

pandas\tslib.pyx in pandas.tslib._Timestamp.__sub__ (pandas\tslib.c:23697)()

TypeError: Timestamp subtraction must have the same timezones or no timezones

Problem description

If both timestamps have a timezone specified, the result of this operation is perfectly well-defined. It's quite surprising that I have to riddle my code with lhs.tz_convert("UTC") - rhs.tz_convert("UTC") lines to get the difference of timestamps.

Expected Output

Timedelta('-393 days +06:29:07.057926')

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.5.2.final.0 python-bits: 64 OS: Windows OS-release: 7 machine: AMD64 processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: None.None pandas: 0.19.2 nose: 1.3.7 pip: 9.0.1 setuptools: 27.2.0 Cython: 0.25.1 numpy: 1.11.2 scipy: None statsmodels: None xarray: None IPython: 5.1.0 sphinx: 1.5.1 patsy: None dateutil: 2.6.0 pytz: 2016.10 blosc: None bottleneck: None tables: None numexpr: None matplotlib: None openpyxl: 2.4.0 xlrd: None xlwt: None xlsxwriter: 0.9.6 lxml: None bs4: None html5lib: None httplib2: None apiclient: None sqlalchemy: None pymysql: None psycopg2: None jinja2: 2.8.1 boto: None pandas_datareader: None

Comment From: jreback

we could certainly make this work

generally though if someone is trying to subtract different time zones it's an error (they meant to convert)

but maybe that's not typical

Comment From: Vutsuak16

@jreback should it be implemented for 'addition operation' too? Because it will raise the same error while adding two timestamps of different timezones AND which side of the operator should be given preference here, i.e. to subtract we ultimately need them to be in same timezones. So should tz=Europe/Berlin should be converted to UTC or UTC should be converted to Europe/Berlin?

Comment From: jreback

addition of timestamps is not a meaningful operation

you will have to handle both sub and rsub both are converted to UTC before subtraction

Comment From: jreback

I am actually now a bit -0 on this as this is a touch too magical. This should be an explicit operation. In reality it doesn't come up that much, so making this a user-explicit operation should not be much of a burden.

But will leave the issue open for discussion / implementation.

Comment From: jreback

closing this, we are moving towards things being very explicit.

Comment From: filmor

I really don't see any magic here. Every non-naive Timestamp specifies a unique point on a common time-axis, completely independent of the timezone being used, so their difference is unambiguously defined as the distance on that time-axis.

Contrary to what you said before, subtracting timestamps with different timezones /is/ a valid action that can happen in particular in library code, e.g. if I provide a function time_to_some_event(ts), where event is given by some backend system's value (i.e. machine to machine communication, so usually UTC) when ts comes from user code, so it will probably have a non-UTC timezone attached.

If you want to be explicit, what you should not allow is subtraction of naive timestamps, as this gives you essentially random values if the naive timestamps don't happen to be UTC.

Comment From: jreback

we raise for all comparisons between differing tz's (whether tz is UTC or naive or another zone).

this is complicated by the fact that we generally turn strings into naive timestamps, xref

xref #18435

subtracting in different timezones would be valid, but is just plain confusing, forcing folks to put things in the same timezone to subtract is not very burdensome and is much much more explicit.

Comment From: jorisvandenbossche

we raise for all comparisons between differing tz's (whether tz is UTC or naive or another zone).

That is not really true, for comparisons we allow this:

In [13]: pd.Timestamp('2016-01-01', tz='Europe/Brussels') > pd.Timestamp('2017-01-01', tz='UTC')
Out[13]: False

In [14]: pd.DatetimeIndex([pd.Timestamp('2016-01-01', tz='Europe/Brussels')]) > pd.Timestamp('2017-01-01', tz='UTC')
Out[14]: array([False], dtype=bool)

In [15]: pd.Series([pd.Timestamp('2016-01-01', tz='Europe/Brussels')]) > pd.Timestamp('2017-01-01', tz='UTC')
Out[15]: 
0    False
dtype: bool

In [16]: pd.Series([pd.Timestamp('2016-01-01', tz='Europe/Brussels')]) > pd.Series([pd.Timestamp('2017-01-01', tz='UTC')])
Out[16]: 
0    False
dtype: bool

Comment From: jreback

see my comment in the linked issue these should all raise (on the list)

Comment From: jorisvandenbossche

these should all raise

I would say that is up for debate. We currently allow it, and I don't think there is anything ambiguous about what the result should be. Why breaking backwards compatibility to start erroring on this?

Comment From: filmor

Also, the actual issue in the linked comment is about tz-naive vs tz-aware timestamps. Of course neither comparison nor difference make sense if tz-naive timestamps are involved (even naive vs naive dubious, cf. pd.Timestamp("DST-Day 02:30") < pd.Timestamp("DST-Day 02:31")).

Comment From: foolcage

They are just time..and should always contain tz info natively,and so they could compare.

Comment From: guyer

I know this is long closed, but if all of your Timestamps are in the same timezone, why are you squandering cycles on time zones at all? They're only relevant if you have different events in different time zones.