Code Sample, a copy-pastable example if possible
import pandas as pd
ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])
def func(elem):
    print(type(elem))
    return elem
print(type(ts))
print(type(ts[0]))
ts.apply(func);
# Prints out:
# <class 'pandas.core.series.Series'>
# <class 'pandas._libs.tslib.Timestamp'>
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Problem description
I have a Series whose values, rather than its index, are Timestamps. I expect apply to call the function on each element, but instead the function gets called on a DatetimeIndex.
Expected Output
Output of pd.show_versions()
Comment From: nathanielatom
Also, some info on my use case:
I want to apply the tz_localize method to each Timestamp in the series. I originally tried tz_localize on the series itself, but that raised
TypeError: index is not a valid DatetimeIndex or PeriodIndex
I realize it is possible to achieve this by using reindex, but I was wondering if it was possible to do this with Timestamps as Series values as well.
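For context, a minimal sketch of why the call on the Series itself fails: Series.tz_localize works on the index, not on the values. The sample data and the variable names (values, as_values, as_index) are made up for illustration and are not from the report above.

```python
import pandas as pd

# Hypothetical naive timestamps, used only for this illustration.
values = pd.to_datetime(['2017-07-31 20:08:46', '2017-08-01 20:08:46'])

# Timestamps as values with a default integer index:
# Series.tz_localize inspects the index and raises TypeError.
as_values = pd.Series(values)
try:
    as_values.tz_localize('UTC')
    raised = False
    message = ''
except TypeError as exc:
    raised = True
    message = str(exc)

# Timestamps as the index: the same call localizes the index labels.
as_index = pd.Series([1, 2], index=values)
localized = as_index.tz_localize('UTC')

print(raised, message)
print(localized.index.tz)
```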
Comment From: jorisvandenbossche
@nathanielatom you can use tz_localize/tz_convert on the Series through the dt accessor:
In [19]: ts.dt.tz_convert('UTC')
Out[19]:
0   2017-08-01 00:08:46.110998+00:00
1   2017-08-02 00:08:46.110998+00:00
2   2017-08-03 00:08:46.110998+00:00
dtype: datetime64[ns, UTC]
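A self-contained sketch of both dt accessor methods, in case it helps: tz_convert for values that already carry an offset (as in the report), and tz_localize for naive values. The naive Series is a made-up example, and the variable names are only illustrative.

```python
import pandas as pd

# Same tz-aware values as in the report.
aware = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                   pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                   pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

# Values that already have an offset are converted with tz_convert.
utc = aware.dt.tz_convert('UTC')

# Hypothetical naive values: these get a timezone attached with tz_localize.
naive = pd.Series(pd.to_datetime(['2017-07-31 20:08:46',
                                  '2017-08-01 20:08:46']))
localized = naive.dt.tz_localize('UTC')

print(utc)
print(localized)
```

Both calls operate on the values of the Series and leave the index untouched.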
Comment From: jorisvandenbossche
Further, the reason you see this output with apply is that apply first tries to invoke the function on all the values at once (under the hood they are held as a DatetimeIndex, even though they are the values of the Series), and only if that fails does it call the function on each element.
If you adapt the function a little bit to raise when it doesn't get a scalar value, you see the expected output:
In [21]: def func(elem):
    ...:     assert not hasattr(elem, 'ndim')
    ...:     print(type(elem))
    ...:     return elem
    ...:
In [22]: ts.apply(func)
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
Out[22]:
0   2017-07-31 20:08:46.110998-04:00
1   2017-08-01 20:08:46.110998-04:00
2   2017-08-02 20:08:46.110998-04:00
dtype: datetime64[ns, pytz.FixedOffset(-240)]
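The same idea as a standalone script, sketched with an explicit isinstance check instead of the ndim assertion. Whether apply tries the whole array first may depend on the pandas version, so the guard is written to give the same result either way. The names scalar_only and seen are illustrative, not part of pandas.

```python
import pandas as pd

ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

seen = []  # type names of the arguments that passed the guard

def scalar_only(elem):
    # Reject anything that is not a single Timestamp. If apply hands over
    # the whole array first, this raises and the elementwise path is used.
    if not isinstance(elem, pd.Timestamp):
        raise TypeError('expected a scalar Timestamp')
    seen.append(type(elem).__name__)
    return elem

result = ts.apply(scalar_only)

print(seen)
print(result)
```

Recording the type only after the guard means seen reflects just the elementwise calls, regardless of what apply tried beforehand.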