Code Sample, a copy-pastable example if possible
import pandas as pd
ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])
def func(elem):
    print(type(elem))
    return elem
print(type(ts))
print(type(ts[0]))
ts.apply(func);
# Prints out:
# <class 'pandas.core.series.Series'>
# <class 'pandas._libs.tslib.Timestamp'>
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Problem description
I have a Series whose values (not its index) are Timestamps. I expect apply to call the function once per element, receiving a Timestamp each time. Instead, the function is called with a DatetimeIndex.
Expected Output
Output of pd.show_versions()
Comment From: nathanielatom
Also, some info on my use case:
I want to apply the tz_localize method to each Timestamp in the series. I originally tried tz_localize on the series itself, but that raised
TypeError: index is not a valid DatetimeIndex or PeriodIndex
I realize it is possible to achieve this by using reindex, but I was wondering if it is possible to do this with Timestamps as Series values as well.
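For readers following along, here is a minimal sketch of the distinction behind that error, assuming a reasonably recent pandas: Series.tz_localize operates on the index, so it fails when the Timestamps are the values and the index is a plain integer index. The names stamps, as_values and as_index are illustrative and do not come from the original report.

```python
import pandas as pd

# Two naive (time-zone-free) timestamps, used only for illustration.
stamps = pd.to_datetime(['2017-07-31 20:08:46', '2017-08-01 20:08:46'])

# Timestamps as values: the index is a default integer index,
# so Series.tz_localize has no datetime index to localize.
as_values = pd.Series(stamps)
try:
    as_values.tz_localize('UTC')
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Timestamps as the index: Series.tz_localize localizes them.
as_index = pd.Series([1, 2], index=stamps)
localized = as_index.tz_localize('UTC')
print(localized.index.tz)
```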
Comment From: jorisvandenbossche
@nathanielatom you can use tz_localize/tz_convert on the Series through the dt accessor:
In [19]: ts.dt.tz_convert('UTC')
Out[19]:
0 2017-08-01 00:08:46.110998+00:00
1 2017-08-02 00:08:46.110998+00:00
2 2017-08-03 00:08:46.110998+00:00
dtype: datetime64[ns, UTC]
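As a self-contained sketch of the same idea, the snippet below rebuilds the Series from the report and uses the dt accessor for both tz_convert and tz_localize. The variable names utc, naive and relocalized are illustrative, and the round trip through a naive Series is only there to show tz_localize acting on values; it is not something the thread itself proposes.

```python
import pandas as pd

ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

# tz_convert: same instants, expressed in another time zone.
utc = ts.dt.tz_convert('UTC')

# tz_localize(None): drop the time zone, keeping the local wall-clock time.
naive = ts.dt.tz_localize(None)

# tz_localize('UTC'): attach a time zone to naive values without shifting them.
relocalized = naive.dt.tz_localize('UTC')

print(utc)
print(naive)
print(relocalized)
```

The difference between the two methods is that tz_convert requires tz-aware values and changes the displayed wall-clock time, while tz_localize attaches (or removes) a time zone and leaves the wall-clock time as it is.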
Comment From: jorisvandenbossche
Further, the reason you get the output you see with apply is that apply first tries to invoke the function on all the values at once (which are held under the hood as a DatetimeIndex, even though they are the values of the Series), and only if that fails does it call the function on each element.
If you adapt the function a little bit to raise when it doesn't get a scalar value, you see the expected output:
In [21]: def func(elem):
    ...:     assert not hasattr(elem, 'ndim')
    ...:     print(type(elem))
    ...:     return elem
    ...:
In [22]: ts.apply(func)
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'pandas._libs.tslib.Timestamp'>
Out[22]:
0 2017-07-31 20:08:46.110998-04:00
1 2017-08-01 20:08:46.110998-04:00
2 2017-08-02 20:08:46.110998-04:00
dtype: datetime64[ns, pytz.FixedOffset(-240)]
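A script-style variant of this guard is sketched below. It is an assumption-laden illustration rather than the recommended pattern: it swaps the ndim check for an explicit isinstance test against pd.Timestamp, and it records the type names in a list (seen, a name introduced here) instead of printing them. Whether apply makes a first attempt on the whole array depends on the pandas version; the guard is meant to give per-element calls either way.

```python
import pandas as pd

ts = pd.Series([pd.Timestamp('2017-07-31 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-01 20:08:46.110998-04:00'),
                pd.Timestamp('2017-08-02 20:08:46.110998-04:00')])

seen = []  # type names of the arguments that passed the guard

def func(elem):
    # Fail for anything that is not a single Timestamp (e.g. a DatetimeIndex),
    # so that only element-wise calls are processed.
    assert isinstance(elem, pd.Timestamp), 'expected a scalar Timestamp'
    seen.append(type(elem).__name__)
    return elem

result = ts.apply(func)
print(seen)
print(result)
```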