pd.date_range() handles closed intervals correctly for daily time series but incorrectly for monthly.

In this example, notice how freq='D' gives 4 days while freq='M' gives 3 months, dropping the final 2017-07-31:

pd.date_range(start='2017-01-04', end='2017-01-07', freq='D', closed=None)
# DatetimeIndex(['2017-01-04', '2017-01-05', '2017-01-06', '2017-01-07'], dtype='datetime64[ns]', freq='D')

pd.date_range(start='2017-04', end='2017-07', freq='M', closed=None)
# DatetimeIndex(['2017-04-30', '2017-05-31', '2017-06-30'], dtype='datetime64[ns]', freq='M')

This is with pandas 0.19.2. Here is my output for pd.show_versions():

pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-71-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None

Comment From: jreback

This is what you input

In [2]: freq = pd.offsets.MonthEnd()

In [3]: freq.rollback(Timestamp('2017-07-01'))
Out[3]: Timestamp('2017-06-30 00:00:00')

So the end point is first snapped to a valid end-point.

In [4]: freq.rollback(Timestamp('2017-07-31'))
Out[4]: Timestamp('2017-07-31 00:00:00')

For Tick offsets (daily and less), this is fine. But for snapped offsets I guess this is not the intent.

@chris-b1 thoughts?

Comment From: jreback

this is a duplicate of https://github.com/pandas-dev/pandas/issues/6673 (though that is for a different freq). @stevesimmons if you want to have a look there as well.

Comment From: chris-b1

Not sure on this one, like you said in #6673, it's somewhat arbitrary. It seems part of the issue here is that pd.Timestamp('2017-07') == pd.Timestamp('2017-07-01'), which is of course correct, but in the context of a monthly frequency, I can see why someone would expect 2017-07-31.

Comment From: jorisvandenbossche

@jreback I think this is actually a different issue. #6673 is about the interplay between closed='left' and the fact that the and gets rolled back to the last valid end date for the 'B' frequency. This problem however does not occur for monthly frequency:

In [57]: pd.date_range(start='2017-04', end='2017-07', freq='M', closed=None)
Out[57]: DatetimeIndex(['2017-04-30', '2017-05-31', '2017-06-30'], dtype='datetime64[ns]', freq='M')

In [58]: pd.date_range(start='2017-04', end='2017-07', freq='M', closed='left')
Out[58]: DatetimeIndex(['2017-04-30', '2017-05-31', '2017-06-30'], dtype='datetime64[ns]', freq='M')

(both are the same)

As noted above, the tricky/surprising part in this issue is that pd.Timestamp('2017-07') == pd.Timestamp('2017-07-01'), while the M frequency wants the end of the month.