Pandas rolling().apply() skips window if it contains NaN

Code Sample

    import pandas as pd
    import numpy as np

    iinput = np.arange(10.0)
    iinput[5] = np.nan
    x = pd.Series(iinput).rolling(3).apply(lambda x: 1.0).tolist()
    print(x)

Expected Output and Problem Description

One would expect that the above code would print the list:

[nan, nan, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

Instead, we get the following:

[nan, nan, 1.0, 1.0, 1.0, nan, nan, nan, 1.0, 1.0]

It seems that any time the input to lambda contains nan, then nan is returned automatically. This is problematic, because it is not possible to apply a custom rolling function to a series containing nans. Doing so will return a result riddled with more nans.

Additionally, this behavior exists exclusively for rolling(). Running the above code with expanding() works just fine.
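The difference between rolling() and expanding() noted above can be reproduced side by side. This is a minimal sketch, assuming a pandas version that accepts the `raw` keyword on `apply` (0.23 or later, so newer than the 0.20.3 in this report); the variable names are illustrative only. The likely explanation is that expanding() defaults to `min_periods=1`, whereas a fixed-size rolling() window defaults to `min_periods` equal to the window size.

```python
import numpy as np
import pandas as pd

values = np.arange(10.0)
values[5] = np.nan
s = pd.Series(values)

# Default rolling window: min_periods equals the window size (3), so any
# window with fewer than 3 non-NaN values is skipped and yields NaN.
rolling_default = s.rolling(3).apply(lambda w: 1.0, raw=True)

# expanding() defaults to min_periods=1, so one valid value is enough.
expanding_default = s.expanding().apply(lambda w: 1.0, raw=True)

# Passing min_periods=1 to rolling() gives the same tolerance for NaN.
rolling_min1 = s.rolling(3, min_periods=1).apply(lambda w: 1.0, raw=True)

print(rolling_default.tolist())
print(expanding_default.tolist())
print(rolling_min1.tolist())
```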

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: jreback

`min_periods` controls whether a window is skipped: a window with fewer than `min_periods` non-NaN observations yields NaN without the function being called. It defaults to the window size.

In [7]: iinput = np.arange(10.0)
   ...: iinput[5] = np.nan
   ...: iinput = pd.Series(iinput)
   ...: x = iinput.rolling(3, min_periods=1).apply(lambda x: 1.0)
   ...: x
   ...: 
Out[7]: 
0    1.0
1    1.0
2    1.0
3    1.0
4    1.0
5    1.0
6    1.0
7    1.0
8    1.0
9    1.0
dtype: float64

In [8]: iinput = np.arange(10.0)
   ...: iinput[5] = np.nan
   ...: iinput = pd.Series(iinput)
   ...: x = iinput.rolling(3, min_periods=3).apply(lambda x: 1.0)
   ...: x
   ...: 
Out[8]: 
0    NaN
1    NaN
2    1.0
3    1.0
4    1.0
5    NaN
6    NaN
7    NaN
8    1.0
9    1.0
dtype: float64
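The rule behind the two outputs above can be checked by counting the non-NaN values in each window and comparing that count with `min_periods`. The following is a rough sketch, not pandas' internal implementation; `skipped_mask` is a hypothetical helper introduced here for illustration, and the `raw` keyword assumes pandas 0.23 or later.

```python
import numpy as np
import pandas as pd

values = np.arange(10.0)
values[5] = np.nan
s = pd.Series(values)


def skipped_mask(series, window, min_periods):
    """Hypothetical helper: True where a window has too few non-NaN values."""
    # notna() contains only 0/1 values, so this rolling sum counts the valid
    # observations in each (possibly partial) window.
    valid = series.notna().astype(float).rolling(window, min_periods=1).sum()
    return valid.round().astype(int) < min_periods


mask_min3 = skipped_mask(s, window=3, min_periods=3)
mask_min1 = skipped_mask(s, window=3, min_periods=1)

# The predicted mask should line up with the NaN positions of the real result.
actual_min3 = s.rolling(3, min_periods=3).apply(lambda w: 1.0, raw=True)

print(mask_min3.tolist())
print(actual_min3.isna().tolist())
```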

Comment From: Anisalexvl

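However, if we take the array array([ 0., 1., 2., 3., 4., nan, nan, nan, 8., 9.]) as input and run the following code:

import numpy as np
import pandas as pd

iinput = np.array([0., 1., 2., 3., 4., np.nan, np.nan, np.nan, 8., 9.])

def f(x):
    print(x)  # show each window that is actually passed to the function
    return 1.

x = pd.Series(iinput).rolling(3, min_periods=1).apply(f).tolist()
print('Rolling result: ', x)

we get the following output: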

[ 0.]
[ 0.  1.]
[ 0.  1.  2.]
[ 1.  2.  3.]
[ 2.  3.  4.]
[  3.   4.  nan]
[  4.  nan  nan]
[ nan  nan   8.]
[ nan   8.   9.]
Rolling result:  [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, nan, 1.0, 1.0]

Only nine windows are printed for ten values, which means that the function was not applied to the window that is entirely filled with NaNs, even with min_periods=1. I expected the function to be applied to every window, so I agree with @scoliann:

It seems that any time the input to lambda contains nan, then nan is returned automatically. This is problematic, because it is not possible to apply a custom rolling function to a series containing nans.
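If the function really has to see every full window, including one that is entirely NaN, one possible workaround is to build the windows manually and bypass the `min_periods` check. This is a sketch under stated assumptions, not a pandas feature: `apply_to_every_window` is a hypothetical helper, and it relies on `numpy.lib.stride_tricks.sliding_window_view`, which requires NumPy 1.20 or later.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def apply_to_every_window(series, window, func):
    """Hypothetical helper: call func on every full window, NaN or not."""
    values = series.to_numpy(dtype=float)
    # Positions without a full window stay NaN, like rolling(window) would.
    out = np.full(len(values), np.nan)
    if len(values) >= window:
        windows = sliding_window_view(values, window)
        out[window - 1:] = [func(w) for w in windows]
    return pd.Series(out, index=series.index)


iinput = np.array([0., 1., 2., 3., 4., np.nan, np.nan, np.nan, 8., 9.])
s = pd.Series(iinput)

# Constant function: every full window now produces 1.0.
ones = apply_to_every_window(s, 3, lambda w: 1.0)

# Count the NaNs per window to confirm the all-NaN window is visited.
nan_counts = apply_to_every_window(s, 3, lambda w: float(np.isnan(w).sum()))

print(ones.tolist())
print(nan_counts.tolist())
```

The custom function then decides for itself how to treat NaN, at the cost of a Python-level loop over the windows.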
