Pandas DateOffset + (Timestamp Series with NaT) = wrong results

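# Reproduces the issue on Python 2.7 with pandas 0.19.2
import pandas as pd
# Five tz-aware timestamps, one minute apart
t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k*pd.Timedelta("00:01:00") for k in xrange(5)])
# Replace the middle value with NaT
t.iloc[2] = pd.NaT
print t
print
# Adding one day should leave the local time of day unchanged
print t + pd.tseries.offsets.DateOffset(days=1)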

# Observed output:
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 04:00:00-04:00
1   2016-09-02 04:01:00-04:00
2                         NaT
3   2016-09-02 04:03:00-04:00
4   2016-09-02 04:04:00-04:00
dtype: datetime64[ns, US/Eastern]

Problem description

Adding a DateOffset object to a Series that contains NaT values produces incorrect output for the entire Series. In the example above, I added one day to the Series, but every timestamp moved forward by one day and four hours. The same thing happens with other kinds of DateOffset objects (1 minute, 3 months, business day, etc.). The extra shift seems to be related to the time zone.
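
As a small arithmetic check of the time zone suspicion, the sketch below takes the first row of the observed and expected output and compares the extra shift with the UTC offset of US/Eastern on that date. The variable names are illustrative only, and a match is consistent with the time zone explanation without proving it.

```python
import pandas as pd

tz = "US/Eastern"

# First row of the input, the observed output, and the expected output
original = pd.Timestamp("2016-09-01 00:00:00", tz=tz)
observed = pd.Timestamp("2016-09-02 04:00:00", tz=tz)
expected = pd.Timestamp("2016-09-02 00:00:00", tz=tz)

# Extra shift beyond the requested one day
error = observed - expected

# UTC offset of US/Eastern on the original date (negative, west of UTC)
utc_offset = original.utcoffset()

print(error)
print(utc_offset)
# Does the extra shift equal the UTC offset with its sign flipped?
print(error == -utc_offset)
```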

Expected Output

0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

0   2016-09-02 00:00:00-04:00
1   2016-09-02 00:01:00-04:00
2                         NaT
3   2016-09-02 00:03:00-04:00
4   2016-09-02 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

Output of `pd.show_versions()`

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: 0.2.1

Comment From: jreback

this is correct in master; IIRC fixed by https://github.com/pandas-dev/pandas/pull/14928

In [8]: import pandas as pd
   ...: t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k*pd.Timedelta("00:01:00") for k in range(5)])
   ...: t.iloc[2] = pd.NaT
   ...: 

In [9]: t
Out[9]: 
0   2016-09-01 00:00:00-04:00
1   2016-09-01 00:01:00-04:00
2                         NaT
3   2016-09-01 00:03:00-04:00
4   2016-09-01 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

In [10]: t + pd.tseries.offsets.DateOffset(days=1)
Out[10]: 
0   2016-09-02 00:00:00-04:00
1   2016-09-02 00:01:00-04:00
2                         NaT
3   2016-09-02 00:03:00-04:00
4   2016-09-02 00:04:00-04:00
dtype: datetime64[ns, US/Eastern]

In [11]: (t + pd.tseries.offsets.DateOffset(days=1)) - t
Out[11]: 
0   1 days
1   1 days
2      NaT
3   1 days
4   1 days
dtype: timedelta64[ns]
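
To check a given installation, including the other offset kinds mentioned in the report, one option is to compare the vectorized Series addition with an element-by-element addition. This is only a sketch: `matches_elementwise` is a hypothetical helper, and it assumes that adding a DateOffset to a single Timestamp gives the right answer, which this thread does not confirm for affected versions.

```python
import pandas as pd


def matches_elementwise(series, offset):
    """Hypothetical helper: True if `series + offset` agrees with adding
    `offset` to each non-NaT Timestamp individually."""
    vectorized = series + offset
    # Reference result: add the offset one Timestamp at a time, keep NaT as is
    reference = series.apply(lambda ts: ts if pd.isnull(ts) else ts + offset)
    same_values = vectorized.dropna().tolist() == reference.dropna().tolist()
    same_missing = vectorized.isnull().tolist() == reference.isnull().tolist()
    return same_values and same_missing


t = pd.Series([pd.Timestamp("2016-09-01", tz="US/Eastern") + k * pd.Timedelta("00:01:00")
               for k in range(5)])
t.iloc[2] = pd.NaT

# Offsets chosen to mirror some of the kinds named in the report
offsets = {
    "1 day": pd.tseries.offsets.DateOffset(days=1),
    "1 minute": pd.tseries.offsets.DateOffset(minutes=1),
    "3 months": pd.tseries.offsets.DateOffset(months=3),
}

results = {name: matches_elementwise(t, off) for name, off in offsets.items()}
for name in sorted(results):
    print("%s: %s" % (name, results[name]))
```

A `False` for any entry would suggest that the vectorized path disagrees with the scalar one on that installation.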
