Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd

dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                   index=pd.date_range('20130101 09:00:00', periods=5, freq='s'))
dft['another'] = ['a', 'a', 'c', 'd', 'e']
dft.set_index([dft.index, 'another'])
dft.rolling('2s')
Expected Output
I expected to be able to use a time-series-aware rolling window; instead I got ValueError: window must be an integer
output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.0rc1
nose: None
pip: 6.0.8
setuptools: 27.2.0
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None
Comment From: jreback
pls make sure that the code you are showing was run on the version you are reporting. you might have output from another session. IOW, you may be picking up a different version from the one shown in pd.show_versions()
In [1]: pd.__version__
Out[1]: '0.19.0rc1+0.g497a3bc.dirty'
In [2]: dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
...: index=pd.date_range('20130101 09:00:00', periods=5, freq='s'))
...: dft['another'] = ['a', 'a', 'c', 'd', 'e']
...: dft.set_index([dft.index, 'another'])
...: dft.rolling('2s')
...:
Out[2]: Rolling [window=2000000000,min_periods=1,center=False,win_type=freq,axis=0]
In [3]: dft.rolling('2s').B.sum()
Out[3]:
2013-01-01 09:00:00 0.0
2013-01-01 09:00:01 1.0
2013-01-01 09:00:02 3.0
2013-01-01 09:00:03 2.0
2013-01-01 09:00:04 4.0
Freq: S, Name: B, dtype: float64
Comment From: nfcampos
@jreback I'm indeed running this in the version I quoted above. This was run in a fresh virtual environment created just for this.
However, I did forget to write inplace=True in the call to set_index in the snippet above, which is why you were unable to reproduce it.
Please try again with this snippet:
dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
index=pd.date_range('20130101 09:00:00', periods=5, freq='s'))
dft['another'] = ['a', 'a', 'c', 'd', 'e']
dft.set_index([dft.index, 'another'], inplace=True)
dft.rolling('2s')
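For readers following along, here is a minimal sketch of why the first snippet could not reproduce the error while this one does. It assumes a pandas version where a time-offset window on a MultiIndex is rejected with a ValueError, as reported in this thread; the variable names (multi, index_unchanged, error_message) are only illustrative, and the exact message text may differ between versions.

```python
import numpy as np
import pandas as pd

dft = pd.DataFrame(
    {"B": [0, 1, 2, np.nan, 4]},
    index=pd.date_range("2013-01-01 09:00:00", periods=5, freq="s"),
)
dft["another"] = ["a", "a", "c", "d", "e"]

# set_index returns a new frame. Without inplace=True (or reassignment)
# the result is discarded and dft keeps its plain DatetimeIndex.
dft.set_index([dft.index, "another"])
index_unchanged = isinstance(dft.index, pd.DatetimeIndex)

# Keeping the result gives the MultiIndex frame the report is about.
multi = dft.set_index([dft.index, "another"])
has_multiindex = isinstance(multi.index, pd.MultiIndex)

# Assumption: the offset window is rejected when the rolling object is
# built, as in the report. The message wording is not relied upon here.
try:
    multi.rolling("2s")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(index_unchanged, has_multiindex, error_message)
```

Reassigning the result of set_index has the same effect on the index as passing inplace=True.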
Comment From: jreback
of course that doesn't work, you have a multi-index. .rolling does not support a level arg, we recently added it to .resample here. you can create an issue to add it to .rolling if you'd like
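Until something like a level argument exists for .rolling, one possible workaround is to move the non-time level back into a column so that the index is a plain DatetimeIndex again, or to turn the time level into a column and pass it through the on keyword of .rolling. The sketch below is not from this thread; the level name "time" and the variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

dft = pd.DataFrame(
    {"B": [0, 1, 2, np.nan, 4]},
    index=pd.date_range("2013-01-01 09:00:00", periods=5, freq="s"),
)
dft["another"] = ["a", "a", "c", "d", "e"]
multi = dft.set_index([dft.index, "another"])

# Option 1: move the non-time level back to a column, leaving a
# DatetimeIndex that the offset-based window can work with.
flat = multi.reset_index(level="another")
rolled = flat.rolling("2s")["B"].sum()

# Option 2: name the levels ("time" is a hypothetical name), turn both
# into columns, and point rolling at the datetime column with on=.
as_columns = multi.rename_axis(["time", "another"]).reset_index()
rolled_on = as_columns.rolling("2s", on="time")["B"].sum()

print(rolled)
print(rolled_on)
```

Both options should give the same values as the single-index example earlier in the thread; the second one returns them on a default integer index rather than on the timestamps.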
Comment From: nfcampos
@jreback thanks for your reply. The docs aren't at all clear that time-series support for rolling applies only if you have a single index. ~~By the way, the PR you link to actually mentions an on keyword for resample that can be used to select the level of a multi-index. So if this were supported for rolling, the on argument could be used for that as well, instead of a separate level argument.~~ (edit: never mind, I didn't read that right)
Comment From: jreback
well the doc-string is pretty clear. as I said you can put forth an issue for an enhancement.