Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
pd.to_datetime("2024-53-1", format="%G-%V-%u")
Issue Description
When using Python format codes to resolve an ISO week notation ("%G-%V-%u") back to an ISO date, to_datetime() incorrectly accepts non-existent weeks. In the example below the third call should not return a valid date, as no week 53 exists in the ISO calendar year 2024 (the first lines are non-ISO-calendar, the last line is the first week in ISO calendar year 2025, which are all correct).
This behavior can be seen in other years with 52 ISO weeks as well, e.g. pd.to_datetime("2023-53-1", format="%G-%V-%u") also returns the date of the first ISO week of 2024.
The Python standard library correctly raises an error when the same thing is tried:
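For illustration, a minimal sketch of the standard-library side. It uses `datetime.date.fromisocalendar` (available since Python 3.8) rather than the reporter's original snippet, and the helper name `iso_to_date_or_none` is hypothetical:

```python
from datetime import date


def iso_to_date_or_none(iso_year, iso_week, iso_weekday):
    """Return the calendar date for an ISO (year, week, weekday) triple,
    or None if the standard library rejects the triple."""
    try:
        return date.fromisocalendar(iso_year, iso_week, iso_weekday)
    except ValueError:
        # Raised, among other cases, for a week number that does not
        # exist in the given ISO year.
        return None


print(iso_to_date_or_none(2024, 52, 1))  # Monday of the last ISO week of 2024
print(iso_to_date_or_none(2024, 53, 1))  # ISO year 2024 has no week 53
print(iso_to_date_or_none(2025, 1, 1))   # Monday of the first ISO week of 2025
```

The point of the sketch is only that the standard library treats week 53 of a 52-week ISO year as an error instead of rolling over into the next year.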
Expected Behavior
Raise an error for a date string that refers to a non-existent ISO week.
Installed Versions
Comment From: rhshadrach
Thanks for the report! Agreed this should raise. It looks like we can add the code
if iso_week == 53:
    now = date.fromordinal(date(iso_year, 1, 1).toordinal() + ordinal - iso_weekday)
    jan_4th = date(iso_year + 1, 1, 4)
    if (jan_4th - now).days < 7:
        raise ValueError("...")
to _libs.tslibs.strptime._calc_julian_from_V. But this is out of my element; we need to make sure there isn't an off-by-one error. Perhaps there is a better place to perform this check as well. Further investigations and PRs to fix are welcome!
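One way to guard against an off-by-one error in such a fix is to compare it with an independent computation of how many ISO weeks a year has. The sketch below is plain Python, not the pandas implementation. It relies on the fact that December 28 always falls in the last ISO week of its year, and the names `iso_weeks_in_year` and `check_iso_week` are hypothetical:

```python
from datetime import date


def iso_weeks_in_year(iso_year):
    """Return the number of ISO weeks (52 or 53) in the given ISO year."""
    # December 28 is always in the last ISO week of its year, so its
    # ISO week number equals the number of weeks in that year.
    return date(iso_year, 12, 28).isocalendar()[1]


def check_iso_week(iso_year, iso_week):
    """Raise ValueError if iso_week does not exist in iso_year."""
    last_week = iso_weeks_in_year(iso_year)
    if not 1 <= iso_week <= last_week:
        raise ValueError(
            f"Invalid ISO week {iso_week} for ISO year {iso_year} "
            f"(valid range: 1-{last_week})"
        )


check_iso_week(2020, 53)  # accepted: ISO year 2020 has 53 weeks
try:
    check_iso_week(2024, 53)  # rejected: ISO year 2024 has 52 weeks
except ValueError as exc:
    print(exc)
```

A test for the eventual fix could loop over a range of years and assert that week 53 is accepted exactly when `iso_weeks_in_year` returns 53.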