• [X] I have checked that this issue has not already been reported.

• [X] I have confirmed this bug exists on the latest version of pandas.

• [ ] I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

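import pandas as pd

df = pd.DataFrame({
    'timestamp': ['2021-03-28T00:37:28.952+01:00', '2021-03-28T04:09:36.106+02:00'],
    'data': [1, 2],
})
df = df.set_index('timestamp')
df.index = pd.to_datetime(df.index)  # the strings carry different UTC offsets (+01:00, +02:00)
df.resample('H').size()  # raises the TypeError quoted under Issue Description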

Issue Description

pandas.DataFrame.resample() fails when the index crosses a daylight saving time boundary. It raises the following error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Expected Behavior

timestamp
2021-03-28 00:00:00+01:00    1
2021-03-28 01:00:00+01:00    0
2021-03-28 02:00:00+01:00    0
2021-03-28 03:00:00+02:00    0
2021-03-28 03:00:00+02:00    0
2021-03-28 04:00:00+02:00    1
Freq: H, dtype: int64

Installed Versions

INSTALLED VERSIONS
------------------
commit           : 73c68257545b5f8530b7044f56647bd2db92e2ba
python           : 3.9.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.19.0-17-cloud-amd64
Version          : #1 SMP Debian 4.19.194-2 (2021-06-21)
machine          : x86_64
processor        :
byteorder        : little
LC_ALL           : None
LANG             : C
LOCALE           : en_US.UTF-8

pandas           : 1.3.3
numpy            : 1.21.2
pytz             : 2021.1
dateutil         : 2.8.2
pip              : 21.2.4
setuptools       : 58.0.4
Cython           : None
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 3.0.1
IPython          : 7.27.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fsspec           : None
fastparquet      : None
gcsfs            : None
matplotlib       : 3.4.3
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 5.0.0
pyxlsb           : None
s3fs             : None
scipy            : 1.7.1
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : None

Comment From: joooeey

If you're okay with displaying your times in UTC instead of local time, you can do pd.to_datetime(df.index, utc=True).
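
A minimal sketch of that suggestion, assuming the same two timestamps as in the report. The names `stamps` and `hourly_utc` are illustrative, and `pd.offsets.Hour()` is used in place of the `'H'` string alias only so that the snippet does not depend on which alias spelling a given pandas version accepts.

```python
import pandas as pd

stamps = ["2021-03-28T00:37:28.952+01:00", "2021-03-28T04:09:36.106+02:00"]

# utc=True converts every value to UTC, so the result is a real DatetimeIndex
df = pd.DataFrame({"data": [1, 2]}, index=pd.to_datetime(stamps, utc=True))
df.index.name = "timestamp"

# Hourly bins, labelled in UTC rather than local time
hourly_utc = df.resample(pd.offsets.Hour()).size()
print(hourly_utc)
```

The bins here start at 2021-03-27 23:00 UTC, which is the UTC equivalent of 00:00 at +01:00.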

Comment From: MarcoGorelli

thanks for your issue

The issue is that your index isn't a DatetimeIndex, because it contains mixed timezones (which isn't allowed and, IMO, should be deprecated; this issue is a perfect example of how misleading it is: https://github.com/pandas-dev/pandas/issues/50887).
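
To make the "mixed timezones" point concrete, here is a small sketch that inspects the UTC offset carried by each string in the report and then parses both with `utc=True`. The variable names are illustrative, and the exact behaviour of a plain `pd.to_datetime(stamps)` call on mixed offsets is assumed to differ between pandas versions, so it is deliberately not exercised here.

```python
import pandas as pd

stamps = ["2021-03-28T00:37:28.952+01:00", "2021-03-28T04:09:36.106+02:00"]

# Each string parses to a Timestamp with its own fixed offset
offsets = [pd.Timestamp(s).utcoffset() for s in stamps]
print(offsets)  # the two offsets differ by one hour

# Normalising to UTC gives a single timezone, hence a DatetimeIndex
utc_index = pd.to_datetime(stamps, utc=True)
print(type(utc_index).__name__, utc_index.dtype)
```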

You should instead pass utc=True and then convert to your time zone; DST will then be handled correctly.

In [33]: import pandas as pd
    ...: df = pd.DataFrame({'timestamp': ['2021-03-28T00:37:28.952+01:00', '2021-03-28T04:09:36.106+02:00'], 'data': [1, 2]})
    ...: df = df.set_index('timestamp')
    ...: df.index = pd.to_datetime(df.index, utc=True).tz_convert('Europe/Brussels')

In [34]: df.resample('H').size()
Out[34]:
timestamp
2021-03-28 00:00:00+01:00    1
2021-03-28 01:00:00+01:00    0
2021-03-28 03:00:00+02:00    0
2021-03-28 04:00:00+02:00    1
Freq: H, dtype: int64
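
The output above has four rows, not the six listed under Expected Behavior. This appears to follow from the DST jump itself: 02:00 at +01:00 and 03:00 at +02:00 are the same instant, so Europe/Brussels has no 02:00 wall-clock hour on that date. A short sketch to check this (names are illustrative):

```python
import pandas as pd

# The same instant written with the two offsets on either side of the jump
a = pd.Timestamp("2021-03-28T02:00:00+01:00")
b = pd.Timestamp("2021-03-28T03:00:00+02:00")
same_instant = a == b
print(same_instant)

# Four consecutive UTC hours, viewed as Brussels local time
utc_hours = pd.date_range("2021-03-27 23:00", periods=4,
                          freq=pd.offsets.Hour(), tz="UTC")
local_hours = utc_hours.tz_convert("Europe/Brussels")
print(list(local_hours.hour))  # local hour 2 is skipped
```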

closing then