### Pandas version checks

- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
### Reproducible Example

```python
import pandas as pd

t = pd.Timestamp('2014-01-02')
off = pd.tseries.offsets.QuarterBegin(n=2, startingMonth=10)  # 2QS-OCT
off.rollback(t)
# -> Timestamp('2014-01-01 00:00:00')
```
### Issue Description

I want to use an offset of half-years starting in October/April. In my xarray calls, I thus use `"2QS-OCT"`, which translates to `QuarterBegin(n=2, startingMonth=10)`.

However, `rollback` and `rollforward` do not seem to take the multiple into account, as my example shows. The 1st of January is 3 months after the 1st of October, so I didn't expect it to be a valid date for that offset.
### Expected Behavior

```python
off.rollback(t)
# -> Timestamp('2013-10-01 00:00:00')
```

I expected the `rollback` and `rollforward` functions to only return dates that fall on this offset, so either the 1st of October or the 1st of April.

### Installed Versions
<details>
INSTALLED VERSIONS
------------------
commit : e8093ba372f9adfe79439d90fe74b0b5b6dea9d6
python : 3.10.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.19.4-200.fc36.x86_64
Version : #1 SMP PREEMPT_DYNAMIC Thu Aug 25 17:42:04 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : fr_CA.UTF-8
LOCALE : fr_CA.UTF-8
pandas : 1.4.3
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 63.1.0
pip : 22.1.2
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : 1.3.5
brotli :
fastparquet : None
fsspec : 2022.5.0
gcsfs : None
markupsafe : 2.1.1
matplotlib : 3.5.2
numba : 0.55.2
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.1
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : 2022.6.0
xlrd : None
xlwt : None
zstandard : None
</details>
**Comment From: dicristina**
Use the arithmetic operators along with the correct multiple instead of `rollback`:

```python
import pandas as pd

t = pd.Timestamp('2014-01-02')
off = pd.offsets.QuarterBegin(n=2, startingMonth=10)
off2 = pd.offsets.QuarterBegin(n=1, startingMonth=10)
t - off, t - 2 * off2
```

`rollback` and `rollforward` always use a multiple of 1. This is the intended behavior.
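
A quick way to see this, as a minimal sketch (the names `off_n2` and `off_n1` are only for illustration): rolling with `n=2` and with `n=1` should land on the same quarter anchor, because only the anchor points matter for rolling.

```python
import pandas as pd

t = pd.Timestamp("2014-01-02")
off_n2 = pd.offsets.QuarterBegin(n=2, startingMonth=10)
off_n1 = pd.offsets.QuarterBegin(n=1, startingMonth=10)

# rollback/rollforward snap to the nearest quarter anchor; n plays no role.
back_n2, back_n1 = off_n2.rollback(t), off_n1.rollback(t)
fwd_n2, fwd_n1 = off_n2.rollforward(t), off_n1.rollforward(t)

print(back_n2 == back_n1)  # same anchor regardless of n
print(fwd_n2 == fwd_n1)    # same anchor regardless of n
```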
**Comment From: aulemahal**
Thanks @dicristina. My original problem was thus not related to this, so I have opened another issue.
**Comment From: aulemahal**

I just realized this doesn't work when `t` falls on the non-multiple offset.

```python
import pandas as pd

t = pd.Timestamp('2014-01-01')
off = pd.offsets.QuarterBegin(n=2, startingMonth=10)
off2 = pd.offsets.QuarterBegin(n=1, startingMonth=10)
t - off, t - 2 * off2
```

Gives:

```
(Timestamp('2013-07-01 00:00:00'), Timestamp('2013-07-01 00:00:00'))
```

But the 1st of July is not a date that corresponds to `QuarterBegin(n=2, startingMonth=10)`. I think the second one with `off2` is expected, but I don't think the first one should be.

Related: `off.is_on_offset(t)` is `True`, but I thought it shouldn't be?
**Comment From: dicristina**

Setting `n=2` affects the result of arithmetic operations but does not change the anchor points, as you found by doing `off.is_on_offset(t)`. The user guide says the following about arithmetic between a timestamp and an anchored offset:

> When n is not 0, if the given date is not on an anchor point, it snapped to the next(previous) anchor point, and moved |n|-1 additional steps forwards or backwards. ... If the given date is on an anchor point, it is moved |n| points forwards or backwards.

In my example (with 2014-01-02) we first snap back to 2014-01-01 (a quarter anchor), then go one quarter into the past and end up on 2013-10-01. In your last example we are already on a quarter anchor, so we go two quarters into the past to 2013-07-01.
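
The two cases can be put side by side in a short sketch (the variable names `between` and `on_anchor` are only for illustration):

```python
import pandas as pd

off = pd.offsets.QuarterBegin(n=2, startingMonth=10)

between = pd.Timestamp("2014-01-02")    # not on a quarter anchor
on_anchor = pd.Timestamp("2014-01-01")  # on a quarter anchor

# n does not change the anchors: every quarter start still counts.
print(off.is_on_offset(between), off.is_on_offset(on_anchor))

# Off-anchor: snap back one anchor, then move |n| - 1 = 1 more step.
from_between = between - off
# On-anchor: move the full |n| = 2 steps.
from_anchor = on_anchor - off

print(from_between)  # 2013-10-01 00:00:00
print(from_anchor)   # 2013-07-01 00:00:00
```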
What you really need is a semester offset, which pandas does not have. Using a quarter offset with `n=2` doesn't quite do what you are looking for. Two and four month offsets are also missing.
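
In the meantime, a semester-style rollback can be approximated by hand. The helper below is a hypothetical sketch, not a pandas API: `semester_rollback` and its `start_month` parameter are made-up names, and it assumes semesters start on the 1st of `start_month` and six months later. Unlike `rollback`, it also drops the time of day.

```python
import pandas as pd


def semester_rollback(ts, start_month=10):
    """Hypothetical helper: latest semester start at or before ``ts``.

    Semesters are assumed to begin on the 1st of ``start_month`` and on
    the 1st of the month six months later.
    """
    ts = pd.Timestamp(ts).normalize()  # drop the time of day
    # Whole months elapsed since the most recent semester start month.
    months_since = (ts.month - start_month) % 6
    first_of_month = ts.replace(day=1)
    return first_of_month - pd.DateOffset(months=months_since)


print(semester_rollback("2014-01-02"))  # 2013-10-01 00:00:00
print(semester_rollback("2014-04-15"))  # 2014-04-01 00:00:00
```

With this definition, 2014-01-01 is not treated as a valid semester start and rolls back to 2013-10-01, which is the behavior the original report expected from `2QS-OCT`.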
**Comment From: aulemahal**

Thanks for the clarification, @dicristina. I'll close this, since it has been confirmed that pandas is behaving as intended.

I'd argue that it seems logical for a `2Q` offset to be identical to a semester offset, but I understand this is not the vision of pandas. Surely there are use cases I don't see where the difference makes sense.

I'll look into proposing a half-year offset implementation.