Code Sample, a copy-pastable example if possible
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3]})
df.rolling(3)['a'].sum()
df.rolling(3)['a'].mean()
The result gives all NaNs (even for the 3rd row):
Out[2]:
0 NaN
1 NaN
2 NaN
Name: a, dtype: float64
Problem description
Unlike DataFrameGroupBy aggregation functions, which skip NaNs by default (skipna=True), Rolling aggregation functions do not skip them. If the rolling window contains a NaN, the aggregation over that window returns NaN. Rolling aggregation functions do not even offer a skipna option.
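A minimal sketch of the contrast described above. The grouping column g is hypothetical, added only so that the same data can go through a GroupBy aggregation; it is not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3]})

# Hypothetical single-group key 'g', used only for this illustration.
# The GroupBy sum ignores the NaN and adds the two valid values.
grouped_sum = df.assign(g='x').groupby('g')['a'].sum()

# The rolling sum with default settings returns NaN for every row,
# because each window of size 3 contains the NaN from row 0.
rolling_sum = df.rolling(3)['a'].sum()

print(grouped_sum)
print(rolling_sum)
```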
Expected Output
The expected output is the one that skips NaNs:
In[2]: df.rolling(3)['a'].sum()
Out[2]:
0 NaN
1 NaN
2 5.0
Name: a, dtype: float64
Comment From: jreback
This is min_periods, part of the .rolling doc-string:
min_periods : int, default None
Minimum number of observations in window required to have a value
(otherwise result is NA). For a window that is specified by an offset,
this will default to 1.
In [12]: df.rolling(3)['a'].sum()
Out[12]:
0 NaN
1 NaN
2 NaN
Name: a, dtype: float64
In [13]: df.rolling(3, min_periods=1)['a'].sum()
Out[13]:
0 NaN
1 2.0
2 5.0
Name: a, dtype: float64
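As a further sketch on the same data (not part of the original comment): min_periods counts the non-NaN observations in each window, so min_periods=2 should reproduce the output requested in the issue, and mean() divides by the number of valid observations only. The variable names below are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3]})

# Require at least 2 valid observations per window.
# Row 1 sees only one valid value (2.0), so it stays NaN;
# row 2 sees two valid values (2.0 and 3.0), so the sum is returned.
sum_min2 = df.rolling(3, min_periods=2)['a'].sum()

# Require at least 1 valid observation per window.
# The mean is taken over the valid values only, so row 2 is (2 + 3) / 2.
mean_min1 = df.rolling(3, min_periods=1)['a'].mean()

print(sum_min2)   # NaN, NaN, 5.0
print(mean_min1)  # NaN, 2.0, 2.5
```

Row 0 is NaN in both cases because its window holds no valid observation at all.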