Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
ts = pd.Timestamp(2017, 1, 31)
ts + pd.offsets.MonthEnd()

Issue Description

I expect that the answer is: Timestamp('2017-01-31 00:00:00') but I get: Timestamp('2017-02-28 00:00:00')

Expected Behavior

Timestamp('2017-01-31 00:00:00')

Installed Versions

INSTALLED VERSIONS ------------------ commit : 4bfe3d07b4858144c219b9346329027024102ab6 python : 3.9.12.final.0 python-bits : 64 OS : Windows OS-release : 10 Version : 10.0.19044 machine : AMD64 processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel byteorder : little LC_ALL : None LANG : None LOCALE : Dutch_Netherlands.1252 pandas : 1.4.2 numpy : 1.21.5 pytz : 2021.3 dateutil : 2.8.2 pip : 21.2.4 setuptools : 61.2.0 Cython : 0.29.28 pytest : 7.1.1 hypothesis : None sphinx : 4.4.0 blosc : None feather : None xlsxwriter : 3.0.3 lxml.etree : 4.8.0 html5lib : None pymysql : None psycopg2 : None jinja2 : 2.11.3 IPython : 8.2.0 pandas_datareader: None bs4 : 4.11.1 bottleneck : 1.3.4 brotli : fastparquet : None fsspec : 2022.02.0 gcsfs : None markupsafe : 2.0.1 matplotlib : 3.5.1 numba : 0.55.1 numexpr : 2.8.1 odfpy : None openpyxl : 3.0.9 pandas_gbq : None pyarrow : None pyreadstat : None pyxlsb : None s3fs : None scipy : 1.7.3 snappy : sqlalchemy : 1.4.32 tables : 3.6.1 tabulate : 0.8.9 xarray : 0.20.1 xlrd : 2.0.1 xlwt : None zstandard : None

Comment From: MarcoGorelli

Thanks for the report

MonthEnd goes to the next date which is an end of the month, so this seems correct

You probably want MonthEnd(0) to get the end of the current month

Might be good to add this as an example to the docs as it seems to be a common source of confusion https://stackoverflow.com/questions/37354105/find-the-end-of-the-month-of-a-pandas-dataframe-series

Comment From: ZDenizYStenhaug

Hi! :) I'd love to help with the documentation if this issue is still relevant. If so, could you please assign it to me? Thanks, have a good day.

Comment From: MarcoGorelli

thanks, go ahead! if you comment "take" the issue will be assigned to you, do reach out if you have questions

Comment From: ZDenizYStenhaug

take

Comment From: ChhaSahil

take

Comment From: jtimko16

Hello, I am looking for my very first contribution. Is there still something I can help will (I would probably need a little bit of guidance). Thanks

Comment From: MarcoGorelli

yup, pull requests welcome https://pandas.pydata.org/pandas-docs/stable/development/contributing.html

Comment From: jtimko16

take

Comment From: pjousma

Thanks,

I found a equivalent solution: ts.replace(day=1) + pd.offsets.MonthEnd(0) For me it is not logical to go to the next month when no parameters are given. For me it is logical that the default parameter should be '0' when no parameters are given by the user.

But thanks for the quick response

Op ma 26 sep. 2022 om 10:51 schreef Marco Edward Gorelli ***@***.***:

Thanks for the report

MonthEnd goes to the next date which is an end of the month, so this seems correct

You probably want MonthEnd(0) to get the end of the current month

Might be good to add this as an example to the docs as it seems to be a common source of confusion https://stackoverflow.com/questions/37354105/find-the-end-of-the-month-of-a-pandas-dataframe-series

— Reply to this email directly, view it on GitHub https://github.com/pandas-dev/pandas/issues/48779#issuecomment-1257707096, or unsubscribe https://github.com/notifications/unsubscribe-auth/A3IXQLDZWOMAEFHTFZOVUQDWAFPZDANCNFSM6AAAAAAQVRXV34 . You are receiving this because you authored the thread.Message ID: @.***>

Comment From: jtimko16

Thanks, I found a equivalent solution: ts.replace(day=1) + pd.offsets.MonthEnd(0) For me it is not logical to go to the next month when no parameters are given. For me it is logical that the default parameter should be '0' when no parameters are given by the user. But thanks for the quick response Op ma 26 sep. 2022 om 10:51 schreef Marco Edward Gorelli < @.>: Thanks for the report MonthEnd goes to the next date which is an end of the month, so this seems correct You probably want MonthEnd(0) to get the end of the current month Might be good to add this as an example to the docs as it seems to be a common source of confusion https://stackoverflow.com/questions/37354105/find-the-end-of-the-month-of-a-pandas-dataframe-series — Reply to this email directly, view it on GitHub <#48779 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/A3IXQLDZWOMAEFHTFZOVUQDWAFPZDANCNFSM6AAAAAAQVRXV34 . You are receiving this because you authored the thread.Message ID: @.>

Hello, pjousma

Thanks for your comment. I've been thinking about it, and also agree, that the default n parameter should be zero. And also the same for pandas.tseries.offsets.YearEnd() function (it has the same issue).

Who can decide whether we should change parameter to zero or keep it as is?

Thanks, J.

Comment From: MarcoGorelli

this'd be a breaking change and would have to go through a deprecation cycle, not sure it'd be worth it. Would be more in favour of just clearly documenting the behaviour

Comment From: jtimko16

this'd be a breaking change and would have to go through a deprecation cycle, not sure it'd be worth it. Would be more in favour of just clearly documenting the behaviour

Ok, ok. I am going to update the documentation in the next few days. I'll let you know in case of any questions/issues. Thanks.

Comment From: jtimko16

Hi @MarcoGorelli , please may I ask you for a clarification.

Do I understand correctly, I should is update this page of documentation?: https://pandas.pydata.org/docs/reference/api/pandas.tseries.offsets.MonthEnd.html

Therefore I'll be modifying this part of code https://github.com/pandas-dev/pandas/blob/main/pandas/_libs/tslibs/offsets.pyx (class MonthEnd; starting from row 2746 - I am going to write more examples into the comment block below). Is it correct? Thanks

Pandas BUG: pd.offsets.MonthEnd() delevers wrong month

Comment From: MarcoGorelli

yeah looks fine

Comment From: jtimko16

I created a Pull Request https://github.com/pandas-dev/pandas/pull/49133

However as it is my very Pull Request, therefore for now, I marked it as a draft. Please can you look at it and give me some feedback? Thanks

Comment From: pjousma

"When parameter n=0, always offset to the end of month.

When parameter n=1, depend on the given date: a) Given date is on an anchor point (last day of the month) -> offset to the end of the following month.

b) Given date is not on an anchor point -> offset to the end of the same month."

For me the functionality of the parameter is not logical. My intepretation is that n the offset of the data used by this function. n=0: end of month (of used date) n=1: end of next month n=2: end of 2e next month n=-1: end of previous month

An intepration as mentioned "n=1 a if ... else b" is for me very unlogical.

Op zo 16 okt. 2022 om 22:54 schreef Jan Timko @.***>:

I created a Pull Request #49133 https://github.com/pandas-dev/pandas/pull/49133

However as it is my very Pull Request, therefore for now, I marked it as a draft. Please can you look at it and give me some feedback? Thanks

— Reply to this email directly, view it on GitHub https://github.com/pandas-dev/pandas/issues/48779#issuecomment-1280055330, or unsubscribe https://github.com/notifications/unsubscribe-auth/A3IXQLBOJSCRDFHHADNJHNDWDRTOPANCNFSM6AAAAAAQVRXV34 . You are receiving this because you authored the thread.Message ID: @.***>

Comment From: jtimko16

@pjousma Yes, I agree that the current behavior is little bit confusing, and what you suggested is definitely more logical.

However, I believe that as quick immediate solution we can document current logic (if everything goes well we can close this PR in next several days) and in longer term we can open discussion about changing functionality of the 'n' parameter. Not sure about processes here in Panda, but it make take some time to approve an implement the change.

Comment From: MarcoGorelli

Yeah changing behaviour shouldn't be taken lightly - for a start, would there be a way to communicate the change to users in such a way as to not flood them with warnings every time they use MonthEnd?

But certainly it should be clearly documented with examples so thanks @honza26 for taking this on!

Comment From: jtimko16

Hello, unfortunately I am facing technical issues with setting up environment and will not be able to finish (my laptop is quite old and slow, therefore cannot install a lot of tools that will slow it done further)

PR in progress for anyone who want to take it from there: https://github.com/pandas-dev/pandas/pull/49133

If any question, feel free to reach out. And @MarcoGorelli thanks a lot for help and guidance with it (although I didn't manage to cross the finishing line).

Cheers, Jan

Comment From: MarcoGorelli

hey @honza26 - if you want to come to the next new contributors meeting, you could share you screen and we could resolve your technical issues with you?

Comment From: justin4480

take

Comment From: justin4480

Hi. I'm a long time user of pandas, but this is my first contribution. Can I update both the docstring and the docs?

I plan to: 1. make clear that normalize is an ignored parameter 2. clarify the unintuitive logic when MonthEnd n > 0 and added to a Timestamp not at monthend

normalize is an ignored parameter

docstring states (class) Docstring shows MonthEnd(n: int = ..., normalize: bool = ...), however normalize doesn't appear to do anything:

for date in pd.date_range(start='2000-01-01', end='2001-01-01'):
    for n in [-1, 0, 1, 2]:
        a = date + pd.offsets.MonthEnd(n=n, normalize=True)
        b = date + pd.offsets.MonthEnd(n=n, normalize=False)
        assert a == b

unintuitive logic

for n in range(-2, 3):
    print(f"\n{n=}")
    print(f"2000-05-01 -> {(pd.Timestamp('2000-05-01') + pd.offsets.MonthEnd(n=n)).strftime('%Y-%m-%d')}")
    print(f"2000-05-15 -> {(pd.Timestamp('2000-05-15') + pd.offsets.MonthEnd(n=n)).strftime('%Y-%m-%d')}")
    print(f"2000-05-31 -> {(pd.Timestamp('2000-05-31') + pd.offsets.MonthEnd(n=n)).strftime('%Y-%m-%d')}")
# n=-2
+ 2000-05-01 -> 2000-03-31
+ 2000-05-15 -> 2000-03-31
+ 2000-05-31 -> 2000-03-31

# n=-1
+ 2000-05-01 -> 2000-04-30
+ 2000-05-15 -> 2000-04-30
+ 2000-05-31 -> 2000-04-30

# n=0
+ 2000-05-01 -> 2000-05-31
+ 2000-05-15 -> 2000-05-31
+ 2000-05-31 -> 2000-05-31

# n=1
! 2000-05-01 -> 2000-05-31
! 2000-05-15 -> 2000-05-31
+ 2000-05-31 -> 2000-06-30

# n=2
! 2000-05-01 -> 2000-06-30
! 2000-05-15 -> 2000-06-30
+ 2000-05-31 -> 2000-07-31

I appreciate the breaking changes impact, but this really does feel like a bug to me as I would expect MonthEnd() should behave like excels =EOMONTH function.

Regardless, I will at least document as clearly as possible with examples, but it's going to be hard to explain away given results are not even consistent, i.e. n = -1 Vs n = +1 (as shown above).

Comment From: jtimko16

hey @honza26 - if you want to come to the next new contributors meeting, you could share you screen and we could resolve your technical issues with you?

Thanks for your offer. I will try to build the environment again, and potentially join the next contributors meeting.

Comment From: jtimko16

@justin4480 Thanks for taking this further. A good initiative!

A good explanation of general offset logic is here: (in case it helps) https://pandas.pydata.org/docs/user_guide/timeseries.html

Pandas BUG: pd.offsets.MonthEnd() delevers wrong month

Comment From: MarcoGorelli

That's a good reference, it's just not very easy to discover from the docstrings.

I'd be all for just adding a simple example to the docstrings of each of those offsets (MonthEnd, MonthBegin, WeekEnd)

Comment From: justin4480

@justin4480 Thanks for taking this further. A good initiative!

A good explanation of general offset logic is here: (in case it helps) https://pandas.pydata.org/docs/user_guide/timeseries.html

Pandas BUG: pd.offsets.MonthEnd() delevers wrong month

@honza26 Thanks for sending this over. This was very helpful in understanding the logic & design intent!

However, one doubt for me remains regarding the normalize parameter. The docs claim: Pandas BUG: pd.offsets.MonthEnd() delevers wrong month source: https://pandas.pydata.org/docs/user_guide/timeseries.html#parametric-offsets

However the below runs perfectly fine.

for dt in pd.date_range(start='2000-01-01', end='2001-01-01'):
    for n in range(-2, 3):
        a = dt + pd.offsets.MonthEnd(n=n, normalize=True)
        b = dt + pd.offsets.MonthEnd(n=n, normalize=False)
        assert a == b
        c = dt - pd.offsets.MonthEnd(n=n, normalize=True)
        d = dt - pd.offsets.MonthEnd(n=n, normalize=False)
        assert c == d

Comment From: jtimko16

@justin4480 If normalize is True, then datetime is normalized to midnight. If you dont have any 'time' part of in your date/datetime variables, it makes no difference.

My examples to show functionality:

`dt_now = datetime.datetime.now() print(f'Datetime now: {dt_now}\n')

a = dt_now + pd.offsets.MonthEnd(n=1, normalize=True) b = dt_now + pd.offsets.MonthEnd(n=1, normalize=False)

print(f'Offset to MonthEnd; normalize = True: {a}') print(f'Offset to MonthEnd; normalize = False: {b}') `

And output is:

`Datetime now: 2022-10-30 20:42:39.526462

Offset to MonthEnd; normalize = True: 2022-10-31 00:00:00 Offset to MonthEnd; normalize = False: 2022-10-31 20:42:39.526462`

Comment From: justin4480

Sorry I was determined to do this, but have spent far too long trying to get my environment setup. I tried both the recommended docker & mamba dev environment setup, but just couldn't get the pytest suite to run without tones of errors (before making any code edits). I will revisit when I have more time as I do want to contribute, but have unassigned this particular issue for now.

Comment From: zemnly

Hi, I'd like to take a shot at this. Gonna be my first contribution here. I've been using Pandas on and off for my job since the past 3 years (:

Comment From: zemnly

take

Comment From: jtimko16

@zemnly Nice :) Good luck!!

Comment From: natmokval

Hi, I am working on this issue as well.