Code Sample

import pandas as pd

result = pd.Series(1., pd.date_range('20170101','20181231',freq='D')).resample('B').last().to_period()

print(result.index[0])
# Period('2016-12-30', 'B')

pd.Period('20170101', freq='B')
# Period('2017-01-02', 'B')

Problem description

When I have daily data that spans weekends (e.g., 1/1/2017, which is a Sunday) and try to get the last available value for each BusinessDay, the weekend data falls back to the Friday period. This is inconsistent with how a Sunday timestamp is assigned to the Monday BusinessDay period (e.g., pd.Period('20170101', 'B') resolves to 1/2/2017).

Expected Output

I would expect result.index[0] above to return Period('2017-01-02', 'B').

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-36-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 34.3.0
Cython: 0.24
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.6.1
xarray: 0.9.1
IPython: 4.1.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.5
matplotlib: 1.5.1
openpyxl: 2.3.3
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.5.0
bs4: 4.5.1
html5lib: None
httplib2: 0.9.2
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.45.0
pandas_datareader: None

Comment From: erbian

possibly related to https://github.com/pandas-dev/pandas/issues/10575

Comment From: jreback

duplicate of this issue: https://github.com/pandas-dev/pandas/issues/11123

Comment From: jreback

This is basically a convention, though I guess it could be regarded as a bug as well. See the discussion on #11123 (and the xref issue you pointed to). If you have some thoughts, please share.

Comment From: erbian

@jreback I understand that including Fri-Sat-Sun in the Friday period is a convention (though, as others have pointed out, probably not ideal for time-series analysis given look-forward issues). What I think is inconsistent is that creating a period from a timestamp returns a period that does not include that timestamp. For example:

pd.Period(pd.datetime(2017,1,1), 'B').start_time
# Timestamp('2017-01-02 00:00:00')

Does the issue I am raising make sense?

Comment From: jreback

In [3]: pd.datetime(2017,1,1) + pd.offsets.BusinessDay()
Out[3]: Timestamp('2017-01-02 00:00:00')

This is just convention (and it is documented).

In the other issue I suggested a new frequency: B is essentially look-ahead, so maybe a look-behind one could be added.
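
The look-ahead versus look-behind distinction can be illustrated with the rollforward and rollback methods of pd.offsets.BusinessDay. This is only a sketch of the two conventions, not the new frequency suggested in the other issue; the variable names are illustrative.

```python
import pandas as pd

bday = pd.offsets.BusinessDay()
sunday = pd.Timestamp("2017-01-01")  # a Sunday

# Look-ahead: a non-business day moves to the next business day.
look_ahead = bday.rollforward(sunday)
print(look_ahead)   # 2017-01-02 00:00:00 (Monday)

# Look-behind: a non-business day moves to the previous business day.
look_behind = bday.rollback(sunday)
print(look_behind)  # 2016-12-30 00:00:00 (Friday)

# Dates that are already business days are left unchanged by both methods.
monday = pd.Timestamp("2017-01-02")
print(bday.rollforward(monday) == bday.rollback(monday))  # True
```

Unlike `timestamp + BusinessDay()`, which always advances by one business day, rollforward and rollback only move dates that fall outside business days.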

This looks reasonable, actually. I think using .to_period() itself is the issue.

In [27]: r = pd.Series(1., pd.date_range('20170101','20170131',freq='D')).resample('B').asfreq()

In [28]: pd.concat([r, r.index.to_series().dt.weekday_name], axis=1)
Out[28]: 
              0          1
2016-12-30  NaN     Friday
2017-01-02  1.0     Monday
2017-01-03  1.0    Tuesday
2017-01-04  1.0  Wednesday
2017-01-05  1.0   Thursday
2017-01-06  1.0     Friday
2017-01-09  1.0     Monday
2017-01-10  1.0    Tuesday
2017-01-11  1.0  Wednesday
2017-01-12  1.0   Thursday
2017-01-13  1.0     Friday
2017-01-16  1.0     Monday
2017-01-17  1.0    Tuesday
2017-01-18  1.0  Wednesday
2017-01-19  1.0   Thursday
2017-01-20  1.0     Friday
2017-01-23  1.0     Monday
2017-01-24  1.0    Tuesday
2017-01-25  1.0  Wednesday
2017-01-26  1.0   Thursday
2017-01-27  1.0     Friday
2017-01-30  1.0     Monday
2017-01-31  1.0    Tuesday
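
One possible workaround, assuming the goal is to label weekend observations with the following business day, is to skip resample('B') and .to_period() and instead roll each timestamp forward explicitly before grouping. This is a hedged sketch, not a fix proposed in this thread or built-in pandas behaviour; the names daily, labels and by_bday are hypothetical.

```python
import pandas as pd

# Daily data that starts on a Sunday (2017-01-01).
daily = pd.Series(1.0, index=pd.date_range("2017-01-01", "2017-01-31", freq="D"))

bday = pd.offsets.BusinessDay()

# Assign each observation to the business day it rolls forward to,
# so Saturday and Sunday values are grouped with the following Monday.
labels = pd.DatetimeIndex([bday.rollforward(ts) for ts in daily.index])

# Take the last available value within each business-day group.
by_bday = daily.groupby(labels).last()

print(by_bday.index[0])  # 2017-01-02 00:00:00
print(len(by_bday))      # 22
```

With this labelling the first group is Monday 2017-01-02 and no 2016-12-30 row appears, at the cost of giving up the Period index.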

Comment From: jreback

cc @chris-b1