Not sure if this can be trimmed much, but...
In [5]: timeit pandas.tseries.tools.dateutil_parse('20000104', datetime(1, 1, 1))
10000 loops, best of 3: 76.6 us per loop
Testing should be done one level higher, at the parse_time_string level.
Most of the time is spent in dateutil itself:
In [8]: timeit dateutil.parser.parse('20000104')
10000 loops, best of 3: 66.5 us per loop
Comment From: changhiskhan
Looks like a lot of the time is being spent in the split method of the time lexer (dateutil 1.5).
Comment From: paulproteus
Given that, should the ticket then be moved to the dateutil project?
For those interested in addressing this issue, you should be proficient with the Python profiler. See my remarks on #2475 about how to use the profiler. You will probably end up submitting changes to dateutil, rather than the pandas repo itself, but one nice thing about that is that your work would benefit any user of dateutil.
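As a rough starting point for the profiling step, here is a minimal sketch using only the standard library's cProfile and pstats. The helper name profile_calls and its parameters are made up for illustration; they are not part of pandas or dateutil.

```python
import cProfile
import io
import pstats


def profile_calls(func, arg, n=10000, top=10):
    """Call func(arg) n times under cProfile and return the report as text.

    Hypothetical helper for illustration only.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n):
        func(arg)
    profiler.disable()

    # Sort by cumulative time so the most expensive call paths come first,
    # and keep only the first `top` entries of the report.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()
```

With dateutil installed, print(profile_calls(dateutil.parser.parse, '20000104')) should list the functions that dominate the parse time, which is where to look for the lexer's split method mentioned above.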
Comment From: wesm
pandas will eventually need to take ownership of datetime string parsing, as performance at the level we care about is unlikely to be a major priority for the dateutil developers.
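To make the idea concrete, one possible shape is a fast path for fixed-width strings such as '20000104' that falls back to a general parser for everything else. This is only a sketch under that assumption, not what pandas does; parse_yyyymmdd and its fallback parameter are hypothetical names.

```python
from datetime import datetime


def parse_yyyymmdd(text, fallback=None):
    """Hypothetical fast path: parse 'YYYYMMDD' directly, else defer to fallback."""
    if len(text) == 8 and text.isdigit():
        try:
            return datetime(int(text[:4]), int(text[4:6]), int(text[6:]))
        except ValueError:
            # Eight digits that are not a valid date (e.g. month 13):
            # fall through and let the general parser decide.
            pass
    if fallback is None:
        raise ValueError("unrecognized date string: %r" % (text,))
    return fallback(text)
```

Passing dateutil.parser.parse as fallback would keep the existing behavior for every other format while skipping the lexer for the common compact case.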