Description

I have a DataFrame with a datetime index and two float-dtype columns.

I wanted to combine groupby with an EWMA so that the EWMA is computed separately for each hour of the day. The code below worked as expected, but it raised a warning: FutureWarning: pd.ewm_mean is deprecated for DataFrame and will be removed in a future version, replace with DataFrame.ewm(adjust=True, min_periods=0, ignore_na=False, alpha=0.84).mean()

My currently working code

data.groupby([data.index.hour]).apply(lambda x: pd.ewma(x, alpha=0.84))

The new code, as more or less suggested by the warning

data.groupby([data.index.hour]).apply(lambda x: pd.ewm(alpha=0.84).mean(x))

AttributeError: module 'pandas' has no attribute 'ewm'

What I tried first (before falling back to apply with pd.ewma)

data.groupby([data.index.hour]).ewm(alpha=0.84).mean(x)

AttributeError: Cannot access callable attribute 'ewm' of 'DataFrameGroupBy' objects, try using the 'apply' method

Usually, with this kind of thing, the problem is me not using pandas the intended way. This time, though, it seems that the change to df.ewm().mean() (which admittedly feels much neater) actually loses functionality, or at least an intuitive solution to this problem.

I think the ideal solution would be for df.groupby().ewm().mean() to work, as this seems the most intuitive step onward from df.groupby().rolling().mean().

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.19.0
nose: 1.3.7
pip: 9.0.1
setuptools: 27.3.0
Cython: 0.24.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.0.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: None
xlwt: None
xlsxwriter: 0.9.3
lxml: 3.5.0
bs4: 4.4.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

Comment From: max-sixty

I haven't gone into detail, but have you tried data.groupby([data.index.hour]).apply(lambda x: x.ewm(alpha=0.84).mean())? That's the direct translation from the old API to the new one.
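
For anyone wanting to try that translation in isolation, here is a minimal self-contained sketch. The data is synthetic (100 hourly rows, two float columns named a and b, both made up for illustration), and group_keys=False is an assumption added here so that the result keeps the original datetime index on recent pandas versions; it is not part of the suggestion above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the reporter's data: hourly index, two float columns.
rng = np.random.default_rng(0)
index = pd.date_range("2013-01-01", periods=100, freq="h")
data = pd.DataFrame(rng.standard_normal((100, 2)), index=index, columns=["a", "b"])

# Apply the EWMA within each hour-of-day group.
# group_keys=False avoids adding the hour as an extra index level.
smoothed = data.groupby(data.index.hour, group_keys=False).apply(
    lambda x: x.ewm(alpha=0.84).mean()
)
```

Each hour of the day is smoothed only against earlier observations from that same hour, so the first observation for any given hour comes back unchanged.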

Comment From: jreback

Well, the warning is pretty clear:

In [11]: pd.options.display.max_rows=12

In [12]: df = pd.DataFrame({'A' : np.random.randn(100)},index=pd.date_range('20130101',periods=100,freq='H'))

In [13]: df
Out[13]: 
                            A
2013-01-01 00:00:00  0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00  0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00  1.225789
2013-01-01 05:00:00  0.152171
...                       ...
2013-01-04 22:00:00 -1.390932
2013-01-04 23:00:00  0.127784
2013-01-05 00:00:00  0.276486
2013-01-05 01:00:00  2.265782
2013-01-05 02:00:00  0.590485
2013-01-05 03:00:00 -0.781056

[100 rows x 1 columns]

In [14]: df.groupby([df.index.hour]).apply(lambda x: pd.ewma(x, alpha=0.84))
/Users/jreback/miniconda3/envs/main/bin/ipython:1: FutureWarning: pd.ewm_mean is deprecated for DataFrame and will be removed in a future version, replace with 
        DataFrame.ewm(alpha=0.84,min_periods=0,ignore_na=False,adjust=True).mean()
  #!/bin/bash /Users/jreback/miniconda3/envs/main/bin/python.app
Out[14]: 
                            A
2013-01-01 00:00:00  0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00  0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00  1.225789
2013-01-01 05:00:00  0.152171
...                       ...
2013-01-04 22:00:00 -1.061139
2013-01-04 23:00:00 -0.058194
2013-01-05 00:00:00  0.226075
2013-01-05 01:00:00  1.824656
2013-01-05 02:00:00  0.233793
2013-01-05 03:00:00 -0.422649

[100 rows x 1 columns]

In [15]: df.groupby([df.index.hour]).apply(lambda x: x.ewm(alpha=0.84).mean())
Out[15]: 
                            A
2013-01-01 00:00:00  0.531136
2013-01-01 01:00:00 -1.586308
2013-01-01 02:00:00  0.956570
2013-01-01 03:00:00 -1.314899
2013-01-01 04:00:00  1.225789
2013-01-01 05:00:00  0.152171
...                       ...
2013-01-04 22:00:00 -1.061139
2013-01-04 23:00:00 -0.058194
2013-01-05 00:00:00  0.226075
2013-01-05 01:00:00  1.824656
2013-01-05 02:00:00  0.233793
2013-01-05 03:00:00 -0.422649

[100 rows x 1 columns]
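
The first 24 rows in the outputs above match the input because, with adjust=True (the default named in the warning), the first observation in each hour group is its own weighted average. As a rough illustration, the sketch below recomputes the adjusted EWMA by hand for a single group and compares it with ewm(alpha=0.84).mean(). The helper ewma_adjusted and the sample values are hypothetical and assume no missing data.

```python
import numpy as np
import pandas as pd


def ewma_adjusted(values, alpha):
    """Hand-rolled adjusted EWMA (hypothetical helper, for illustration only)."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for t in range(len(values)):
        # Weights (1 - alpha)**i for i = 0 (newest) through t (oldest).
        weights = (1.0 - alpha) ** np.arange(t + 1)
        window = values[t::-1]  # observations up to t, newest first
        out[t] = np.dot(weights, window) / weights.sum()
    return out


# Made-up observations standing in for one hour-of-day group.
x = pd.Series([1.0, 2.0, 4.0, 8.0])
manual = ewma_adjusted(x, alpha=0.84)
builtin = x.ewm(alpha=0.84).mean().to_numpy()
```

With alpha=0.84 the weights decay quickly, so each smoothed value stays close to the most recent observation for that hour.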