Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
# script.py
import datetime as dt
import pandas as pd
import pytz
print(f"{pd.__version__=}")
print(f"{pytz.__version__=}\n")
tz = "Europe/Stockholm"
# creation of tz localized date
d = dt.datetime(2023,4,1).replace(tzinfo=pytz.timezone(tz))
print(f"{d=}, {str(d)=}\n")
print(f"{d.tzinfo=}, {str(d.tzinfo)=}\n")
# creation of tz localized date range
idx = pd.date_range(start=d, periods=10, freq='H')
# idx additional tz parameter
idx2 = pd.date_range(start=d, periods=10, freq='H', tz=tz)
val = range(len(idx))
print(f"{idx=}\n")
print(f"{idx2=}\n")
print("idx == idx2", idx == idx2, '\n')
Output of script.py:
pd.__version__='2.0.0'
pytz.__version__='2023.3'
d=datetime.datetime(2023, 4, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' LMT+0:53:00 STD>), str(d)='2023-04-01 00:00:00+00:53'
d.tzinfo=<DstTzInfo 'Europe/Stockholm' LMT+0:53:00 STD>, str(d.tzinfo)='Europe/Stockholm'
idx=DatetimeIndex(['2023-04-01 01:07:00+02:00', '2023-04-01 02:07:00+02:00',
'2023-04-01 03:07:00+02:00', '2023-04-01 04:07:00+02:00',
'2023-04-01 05:07:00+02:00', '2023-04-01 06:07:00+02:00',
'2023-04-01 07:07:00+02:00', '2023-04-01 08:07:00+02:00',
'2023-04-01 09:07:00+02:00', '2023-04-01 10:07:00+02:00'],
dtype='datetime64[ns, Europe/Stockholm]', freq='H')
idx2=DatetimeIndex(['2023-04-01 01:07:00+02:00', '2023-04-01 02:07:00+02:00',
'2023-04-01 03:07:00+02:00', '2023-04-01 04:07:00+02:00',
'2023-04-01 05:07:00+02:00', '2023-04-01 06:07:00+02:00',
'2023-04-01 07:07:00+02:00', '2023-04-01 08:07:00+02:00',
'2023-04-01 09:07:00+02:00', '2023-04-01 10:07:00+02:00'],
dtype='datetime64[ns, Europe/Stockholm]', freq='H')
idx == idx2 [ True True True True True True True True True True]
Issue Description
If I create a date_range() with a tz-aware datetime, then instead of 2023-04-01 00:00:00+02:00 as the first element of idx I get 2023-04-01 01:07:00+02:00.
Where did this 01:07:00+02:00 come from?
If date_range() is created from a tz-naive datetime and then tz_localize-d later on, everything works as expected.
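The 01:07 value can be reproduced with plain offset arithmetic. The sketch below uses only the standard library and assumes the two offsets visible in the script output: +00:53 on the datetime built with replace(), and +02:00 (CEST) for Stockholm in April. The names `lmt` and `cest` are illustrative fixed-offset stand-ins, not pytz objects.

```python
import datetime as dt

# Fixed offsets copied from the script output; these are stand-ins for
# illustration, not real pytz timezone objects.
lmt = dt.timezone(dt.timedelta(minutes=53))   # offset attached by replace()
cest = dt.timezone(dt.timedelta(hours=2))     # Stockholm offset in April

# Midnight interpreted with the +00:53 offset
start = dt.datetime(2023, 4, 1, 0, 0, tzinfo=lmt)

# Same instant expressed in UTC and in +02:00
as_utc = start.astimezone(dt.timezone.utc)
as_cest = start.astimezone(cest)

print(as_utc)   # 2023-03-31 23:07:00+00:00
print(as_cest)  # 2023-04-01 01:07:00+02:00
```

Midnight at +00:53 is 23:07 UTC on the previous day, which is 01:07 at +02:00, matching the first element of idx.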
Expected Behavior
I believe that date_range() shouldn't change the very first element of idx, so it should be 2023-04-01 00:00:00+02:00.
This gives the correct result (start is not tz-aware, but tz is specified):
pd.date_range(start=dt.datetime(2023,4,1), periods=10, freq='H', tz="Europe/Stockholm")
DatetimeIndex(['2023-04-01 00:00:00+02:00', '2023-04-01 01:00:00+02:00',
'2023-04-01 02:00:00+02:00', '2023-04-01 03:00:00+02:00',
'2023-04-01 04:00:00+02:00', '2023-04-01 05:00:00+02:00',
'2023-04-01 06:00:00+02:00', '2023-04-01 07:00:00+02:00',
'2023-04-01 08:00:00+02:00', '2023-04-01 09:00:00+02:00'],
dtype='datetime64[ns, Europe/Stockholm]', freq='H')
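A minimal sketch of the two tz-naive routes mentioned in this report: passing tz to date_range(), and calling tz_localize() afterwards. The variable names are illustrative, and the step is given as a pd.Timedelta rather than the 'H' alias used above, on the assumption that this avoids differences in alias spelling between pandas versions.

```python
import datetime as dt
import pandas as pd

tz = "Europe/Stockholm"
naive_start = dt.datetime(2023, 4, 1)

# One-hour step expressed as a Timedelta instead of a frequency alias
step = pd.Timedelta(hours=1)

# Route 1: naive start, timezone supplied through the tz parameter
via_tz_param = pd.date_range(start=naive_start, periods=10, freq=step, tz=tz)

# Route 2: build a naive range first, then localize it
via_localize = pd.date_range(start=naive_start, periods=10, freq=step).tz_localize(tz)

print(via_tz_param[0])
print((via_tz_param == via_localize).all())
```

Both routes interpret the naive start as local wall-clock time in the given timezone, so the first element stays at midnight.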
Installed Versions
INSTALLED VERSIONS
commit : 478d340667831908b5b4bf09a2787a11a14560c9
python : 3.8.3.final.0
python-bits : 64
OS : Linux
OS-release : 3.10.0-1127.el7.x86_64
Version : #1 SMP Tue Mar 31 23:36:51 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : None.None
pandas : 2.0.0
numpy : 1.24.2
pytz : 2023.3
dateutil : 2.8.2
setuptools : 61.2.0
pip : 21.2.4
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None
Comment From: mroeschke
d = dt.datetime(2023,4,1).replace(tzinfo=pytz.timezone(tz))
This is incorrect usage of pytz timezones. You'll need to use pytz.timezone(tz).localize(dt.datetime(...)) instead to get the correct date. Closing as a usage question.
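A short sketch of the difference between the two calls, assuming pytz is installed. The names `naive`, `replaced` and `localized` are illustrative. The offset printed for `replaced` depends on the timezone data shipped with pytz, so no expected value is shown for that line.

```python
import datetime as dt
import pytz

tz = pytz.timezone("Europe/Stockholm")
naive = dt.datetime(2023, 4, 1)

# replace() attaches the zone's default tzinfo as-is, without resolving
# which UTC offset applies on this particular date.
replaced = naive.replace(tzinfo=tz)

# localize() selects the offset in effect on this date (CEST in April).
localized = tz.localize(naive)

print(replaced.utcoffset())   # a historical local-mean-time offset, not +02:00
print(localized.utcoffset())  # 2:00:00
print(localized)              # 2023-04-01 00:00:00+02:00
```

Passing `localized` as start to date_range() should then keep the first element at 2023-04-01 00:00:00+02:00.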
Comment From: zljubisic
@mroeschke You are right. If I use pytz as you have suggested, everything works as expected. Thank you very much, and best regards.
PS Stupid chatgpt4