Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

s = pd.Series([1, 2, 3, np.nan, 5], index=pd.date_range('2017-01-01', '2017-01-05'))
print(s.rolling('3d', min_periods=1).apply(lambda x: 42))

Problem description

Output of the above code sample:

2017-01-01    42.0
2017-01-02    42.0
2017-01-03    42.0
2017-01-04     NaN
2017-01-05    42.0

It seems that the user function did not get applied to the window corresponding to the original NaN row, resulting in NaN as the result for that row. Why is that? The more reasonable output would be all 42 because by the min_periods constraint, all of the windows are valid.

Compare this to the equivalent fixed window version:

s = pd.Series([1, 2, 3, np.nan, 5])
print(s.rolling(3, min_periods=1).apply(lambda x: 42))

which gives the following output:

0    42.0
1    42.0
2    42.0
3    42.0
4    42.0

Is this inconsistency between fixed and variable size window a desired behavior?

Expected Output

2017-01-01    42.0
2017-01-02    42.0
2017-01-03    42.0
2017-01-04    42.0
2017-01-05    42.0

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.13.final.0 python-bits: 64 OS: Linux OS-release: 4.11.9-1-ARCH machine: x86_64 processor: byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None pandas: 0.21.0.dev+316.gf2b0bdc9b pytest: None pip: 9.0.1 setuptools: 36.2.5 Cython: 0.26 numpy: 1.13.1 scipy: None pyarrow: None xarray: None IPython: 5.4.1 sphinx: None patsy: None dateutil: 2.6.1 pytz: 2017.2 blosc: None bottleneck: None tables: None numexpr: None feather: None matplotlib: None openpyxl: None xlrd: None xlwt: None xlsxwriter: None lxml: None bs4: None html5lib: None sqlalchemy: None pymysql: None psycopg2: None jinja2: None s3fs: None pandas_gbq: None pandas_datareader: None

Comment From: gfyoung

@YS-L : Thanks for reporting this! That does seem a little odd to me. I don't see why we shouldn't have consistency here. This behavior seems counter-intuitive.

Comment From: YS-L

Looks like this is a duplicate of https://github.com/pandas-dev/pandas/issues/15305, which had been fixed recently on master!

Comment From: gfyoung

Just to make sure, did you try pulling master to confirm?

Comment From: YS-L

Yes, pulled master to confirm.

Thought I found a fix after some digging into window.pyx, just to discover that it had been applied there on master. Note to self: should always test on latest master.

Comment From: gfyoung

@YS-L : No worries! We don't expect everyone to pull master and build pandas from source, though since you're an Arch user, I guess I would expect that from you :wink:

But actually, thanks for taking the time to do this. We're happy to see that we anticipated your problem :smile: