Hello Pandas team,

This morning I ran into a nasty bug with pandas 0.18.1. I could not find out whether you are already aware of it. Here it is:

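import pandas as pd

# Naive timestamps; the first and third entries are identical
idx = [pd.to_datetime('2012-02-27 14:23:00'), pd.to_datetime('2012-08-27 14:33:00'), pd.to_datetime('2012-02-27 14:23:00')]
test = pd.DataFrame({'A': ['one', 'two', 'three']}, index=idx)
# Treat the index as UTC, convert to US/Eastern, then drop the timezone again
test['alt'] = test.index.tz_localize('UTC').tz_convert('US/Eastern').tz_localize(None)
test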
Out[9]: 
                         A                 alt
2012-02-27 14:23:00    one 2012-02-27 09:23:00
2012-08-27 14:33:00    two 2012-08-27 10:33:00
2012-02-27 14:23:00  three 2012-02-27 10:23:00 

where the third row of the alt column is obviously wrong: its index value is identical to the first row's, so it should also be 09:23:00. What have I done wrong here? Are indices not meant to work with duplicate values?

Thanks!

Comment From: jreback

There are a couple of bug fixes w.r.t. DST in 0.19.0; the RC has been out for a week or so: https://github.com/pydata/pandas/releases/tag/v0.19.0rc1

With the RC, your example gives the correct result:

In [23]: test
Out[23]:
                         A
2012-02-27 14:23:00    one
2012-08-27 14:33:00    two
2012-02-27 14:23:00  three

In [24]: test['alt']=test.index.tz_localize('UTC').tz_convert('US/Eastern').tz_localize(None)

In [25]: test
Out[25]:
                         A                 alt
2012-02-27 14:23:00    one 2012-02-27 09:23:00
2012-08-27 14:33:00    two 2012-08-27 10:33:00
2012-02-27 14:23:00  three 2012-02-27 09:23:00
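
A quick way to check whether a given install is affected is to compare the UTC offset applied to the two identical timestamps. This is only a sketch: the names `alt`, `offsets` and `consistent` are made up for illustration, and it assumes the index values are meant to be read as UTC, as in the example above.

```python
import pandas as pd

print(pd.__version__)  # installed pandas version

idx = pd.DatetimeIndex(['2012-02-27 14:23:00',
                        '2012-08-27 14:33:00',
                        '2012-02-27 14:23:00'])

# Same conversion as in the report: UTC -> US/Eastern -> naive
alt = idx.tz_localize('UTC').tz_convert('US/Eastern').tz_localize(None)

# Difference between the original (UTC) and converted wall-clock times
offsets = idx - alt

# Entries 0 and 2 are the same instant, so they must get the same offset
consistent = offsets[0] == offsets[2]
print(consistent)
```

If `consistent` is False, the install shows the behaviour from the original report.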

Comment From: randomgambit

Got it. My workaround is to use apply with tz_localize / tz_convert on a column that contains the timestamps (instead of working with the index). That seems to be working. Does that make sense?
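
For reference, the workaround could look roughly like this. It is a sketch, not a recommended fix: the column name `ts` and the helper `utc_to_eastern_naive` are made up for illustration, and the timestamps are assumed to be UTC as in the original example. The `.dt` variant at the end does the same conversion on the whole column at once.

```python
import pandas as pd

idx = pd.to_datetime(['2012-02-27 14:23:00',
                      '2012-08-27 14:33:00',
                      '2012-02-27 14:23:00'])
test = pd.DataFrame({'A': ['one', 'two', 'three']}, index=idx)

# Copy the index into a regular column (assigned by position, not by label)
test['ts'] = test.index


def utc_to_eastern_naive(ts):
    """Read a naive Timestamp as UTC, convert to US/Eastern, drop the tz."""
    return ts.tz_localize('UTC').tz_convert('US/Eastern').tz_localize(None)


# Element-wise conversion, one Timestamp at a time
test['alt'] = test['ts'].apply(utc_to_eastern_naive)

# Column-wise equivalent using the .dt accessor
test['alt_dt'] = (test['ts'].dt.tz_localize('UTC')
                            .dt.tz_convert('US/Eastern')
                            .dt.tz_localize(None))
print(test)
```

Both columns should agree, and the two rows with the duplicate index value should get the same converted time.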