Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
x = pd.to_datetime('87156549591102612381000001219H5', infer_datetime_format=True, errors='coerce')
Issue Description
With errors='coerce', the function should return NaT when the input cannot be parsed. Instead it raises an exception:
InvalidOperation: [<class 'decimal.DivisionImpossible'>]
The problem is also present in earlier versions of pandas.
Expected Behavior
x = pd.NaT
Installed Versions
Comment From: MarcoGorelli
thanks @gkep for the report
this comes from dateutil:
In [4]: from dateutil.parser import parse as du_parse
In [5]: du_parse('87156549591102612381000001219H5')
---------------------------------------------------------------------------
InvalidOperation Traceback (most recent call last)
Cell In[5], line 1
----> 1 du_parse('87156549591102612381000001219H5')
File ~/pandas-dev/.311venv/lib/python3.11/site-packages/dateutil/parser/_parser.py:1368, in parse(timestr, parserinfo, **kwargs)
1366 return parser(parserinfo).parse(timestr, **kwargs)
1367 else:
-> 1368 return DEFAULTPARSER.parse(timestr, **kwargs)
File ~/pandas-dev/.311venv/lib/python3.11/site-packages/dateutil/parser/_parser.py:640, in parser.parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
636 if default is None:
637 default = datetime.datetime.now().replace(hour=0, minute=0,
638 second=0, microsecond=0)
--> 640 res, skipped_tokens = self._parse(timestr, **kwargs)
642 if res is None:
643 raise ParserError("Unknown string format: %s", timestr)
File ~/pandas-dev/.311venv/lib/python3.11/site-packages/dateutil/parser/_parser.py:740, in parser._parse(self, timestr, dayfirst, yearfirst, fuzzy, fuzzy_with_tokens)
736 value = None
738 if value is not None:
739 # Numeric token
--> 740 i = self._parse_numeric_token(l, i, info, ymd, res, fuzzy)
742 # Check weekday
743 elif info.weekday(l[i]) is not None:
File ~/pandas-dev/.311venv/lib/python3.11/site-packages/dateutil/parser/_parser.py:936, in parser._parse_numeric_token(self, tokens, idx, info, ymd, res, fuzzy)
932 (idx, hms) = self._parse_hms(idx, tokens, info, hms_idx)
933 if hms is not None:
934 # TODO: checking that hour/minute/second are not
935 # already set?
--> 936 self._assign_hms(res, value_repr, hms)
938 elif idx + 2 < len_l and tokens[idx + 1] == ':':
939 # HH:MM[:SS[.ss]]
940 res.hour = int(value)
File ~/pandas-dev/.311venv/lib/python3.11/site-packages/dateutil/parser/_parser.py:1047, in parser._assign_hms(self, res, value_repr, hms)
1044 if hms == 0:
1045 # Hour
1046 res.hour = int(value)
-> 1047 if value % 1:
1048 res.minute = int(60*(value % 1))
1050 elif hms == 1:
InvalidOperation: [<class 'decimal.DivisionImpossible'>]
I'd suggest reporting there
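For anyone writing that report, the failure seems to be reproducible with the standard library alone. This is a minimal sketch, assuming (based on the `value % 1` frame in the traceback) that dateutil holds the numeric token as a `Decimal` and that the default decimal context is in use. The helper name `fractional_part_is_computable` is hypothetical and only there for illustration.

```python
from decimal import Decimal, InvalidOperation

# Numeric part of the reported string, i.e. everything before "H5".
# It has 29 digits, one more than the default Decimal precision of 28.
token = "87156549591102612381000001219"


def fractional_part_is_computable(value: Decimal) -> bool:
    """Return True if `value % 1` succeeds under the current decimal context."""
    try:
        value % 1
    except InvalidOperation:
        # Raised (with DivisionImpossible) when the integer quotient
        # needs more digits than the context precision allows.
        return False
    return True


small_ok = fractional_part_is_computable(Decimal("12.5"))
large_ok = fractional_part_is_computable(Decimal(token))
```

If that assumption holds, any numeric token longer than the context precision followed by an hour/minute/second label would hit the same code path.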
As for what to do on the pandas side, this should probably be caught. We just need to add this exception to
https://github.com/pandas-dev/pandas/blob/5645847c042de58d6ea4f4c2fe0e28457f0e1ab0/pandas/_libs/tslibs/parsing.pyx#L892-L896
and add a test case for this in
https://github.com/pandas-dev/pandas/blob/a23f101ed06823d7e4a9be7450af2e99cb59d396/pandas/tests/tslibs/test_parsing.py#L238-L254
Interested in submitting a PR?
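To illustrate the general shape of the fix, here is a rough sketch of catching `decimal.InvalidOperation` alongside other parse failures. It is not the actual code in `parsing.pyx`: `parse_or_none` and `toy_parser` are hypothetical names, the tuple of exception types is an assumption, and the toy parser only imitates the `Decimal` arithmetic seen in the traceback.

```python
import decimal
from decimal import Decimal


def parse_or_none(parse, text):
    """Call parse(text); return None for failures treated as 'unparseable'."""
    try:
        return parse(text)
    except (ValueError, OverflowError, decimal.InvalidOperation):
        # decimal.InvalidOperation is the addition discussed in this issue.
        return None


def toy_parser(text):
    """Stand-in for a dateutil-style parser: split a decimal hour into (hour, minute)."""
    value = Decimal(text)
    return int(value), int(60 * (value % 1))


good = parse_or_none(toy_parser, "12.5")
bad = parse_or_none(toy_parser, "87156549591102612381000001219")
```

A test case on the pandas side could then assert that the string from the report yields NaT with errors='coerce' rather than raising.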
Comment From: kcaashish
take