Description
I have a DataFrame with a datetime index and two float-dtype columns.
I wanted to use groupby with an EWMA so that the EWMA is computed for each hour of the day separately. I used the code below and everything seemed to work as expected. The issue is that a warning message appeared: FutureWarning: pd.ewm_mean is deprecated for DataFrame and will be removed in a future version, replace with DataFrame.ewm(adjust=True, min_periods=0, ignore_na=False, alpha=0.84).mean()
My currently working code
data.groupby([data.index.hour]).apply(lambda x: pd.ewma(x, alpha=0.84))
The new code, as more or less suggested by the warning
data.groupby([data.index.hour]).apply(lambda x: pd.ewm(alpha=0.84).mean(x))
AttributeError: module 'pandas' has no attribute 'ewm'
What I also tried first of all (before applying pd.ewma)
data.groupby([data.index.hour]).ewm(alpha=0.84).mean(x)
AttributeError: Cannot access callable attribute 'ewm' of 'DataFrameGroupBy' objects, try using the 'apply' method
Usually, with this kind of thing, it is me not using the intuitive pandas functionality correctly. This time it seems that the change to df.ewm().mean() (which admittedly does feel much neater) is actually losing functionality or, at the least, an intuitive solution to a problem.
I think the perfect solution would be to have df.groupby().ewm().mean() working, as this seems the most intuitive step onward from df.groupby().rolling().mean().
Output of pd.show_versions()
Comment From: max-sixty
I haven't gone into detail, but have you tried data.groupby([data.index.hour]).apply(lambda x: x.ewm(alpha=0.84).mean())? That's the translation from old to new.
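For reference, the translation can be tried on synthetic data with a minimal sketch like the one below. The frame, the column names "A" and "B", and the random seed are illustrative assumptions rather than the original poster's data; group_keys=False and the final sort_index() are only there to keep the original datetime index and row order across pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the poster's frame (assumed shape: hourly
# datetime index, two float columns).
rng = np.random.default_rng(0)
index = pd.date_range("2013-01-01", periods=100, freq=pd.Timedelta(hours=1))
data = pd.DataFrame(
    {"A": rng.standard_normal(100), "B": rng.standard_normal(100)},
    index=index,
)

# Call .ewm() on each group's sub-frame instead of calling pd.ewma(x, ...).
smoothed = data.groupby(data.index.hour, group_keys=False).apply(
    lambda x: x.ewm(alpha=0.84).mean()
)
smoothed = smoothed.sort_index()  # back to chronological order

# Cross-check one hour against an EWM computed on that hour alone.
hour_0 = data[data.index.hour == 0].ewm(alpha=0.84).mean()
print(np.allclose(smoothed[smoothed.index.hour == 0], hour_0))
```

The first observation of each hour is returned unchanged, since an EWM over a single value is that value; later rows for the same hour are smoothed only against earlier rows of that hour.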
Comment From: jreback
well the warning is pretty clear
In [11]: pd.options.display.max_rows=12
In [12]: df = pd.DataFrame({'A' : np.random.randn(100)},index=pd.date_range('20130101',periods=100,freq='H'))
In [13]: df
Out[13]:
A
2013-01-01 00:00:00 0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00 0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00 1.225789
2013-01-01 05:00:00 0.152171
... ...
2013-01-04 22:00:00 -1.390932
2013-01-04 23:00:00 0.127784
2013-01-05 00:00:00 0.276486
2013-01-05 01:00:00 2.265782
2013-01-05 02:00:00 0.590485
2013-01-05 03:00:00 -0.781056
[100 rows x 1 columns]
In [14]: df.groupby([df.index.hour]).apply(lambda x: pd.ewma(x, alpha=0.84))
/Users/jreback/miniconda3/envs/main/bin/ipython:1: FutureWarning: pd.ewm_mean is deprecated for DataFrame and will be removed in a future version, replace with
DataFrame.ewm(alpha=0.84,min_periods=0,ignore_na=False,adjust=True).mean()
#!/bin/bash /Users/jreback/miniconda3/envs/main/bin/python.app
Out[14]:
A
2013-01-01 00:00:00 0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00 0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00 1.225789
2013-01-01 05:00:00 0.152171
... ...
2013-01-04 22:00:00 -1.061139
2013-01-04 23:00:00 -0.058194
2013-01-05 00:00:00 0.226075
2013-01-05 01:00:00 1.824656
2013-01-05 02:00:00 0.233793
2013-01-05 03:00:00 -0.422649
[100 rows x 1 columns]
In [15]: df.groupby([df.index.hour]).apply(lambda x: x.ewm(alpha=0.84).mean())
Out[15]:
A
2013-01-01 00:00:00 0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00 0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00 1.225789
2013-01-01 05:00:00 0.152171
... ...
2013-01-04 22:00:00 -1.061139
2013-01-04 23:00:00 -0.058194
2013-01-05 00:00:00 0.226075
2013-01-05 01:00:00 1.824656
2013-01-05 02:00:00 0.233793
2013-01-05 03:00:00 -0.422649
[100 rows x 1 columns]
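To see what the per-group call computes, here is a rough sketch that compares x.ewm(alpha=0.84).mean() for one hour of the day with a hand-written adjusted EWMA (the adjust=True form named in the warning). The helper ewm_mean_adjusted is a hypothetical name introduced for illustration, not a pandas function, and the sketch assumes the data contains no NaN values.

```python
import numpy as np
import pandas as pd


def ewm_mean_adjusted(values, alpha):
    """Adjusted EWMA: weighted mean with weights (1 - alpha) ** i,
    where i counts back from the current observation.
    Hypothetical helper; assumes no NaN values."""
    out = np.empty(len(values))
    num = 0.0  # running weighted sum of observations
    den = 0.0  # running sum of weights
    for i, x in enumerate(values):
        num = x + (1.0 - alpha) * num
        den = 1.0 + (1.0 - alpha) * den
        out[i] = num / den
    return out


rng = np.random.default_rng(1)
index = pd.date_range("2013-01-01", periods=100, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"A": rng.standard_normal(100)}, index=index)

# All observations taken at 03:00, in chronological order.
hour_3 = df.loc[df.index.hour == 3, "A"]

via_pandas = hour_3.ewm(alpha=0.84).mean().to_numpy()
via_formula = ewm_mean_adjusted(hour_3.to_numpy(), alpha=0.84)
print(np.allclose(via_pandas, via_formula))
```

With alpha=0.84 the weight on older observations decays quickly, which is why the smoothed values in the later rows above stay fairly close to the raw values.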