This is a follow-up to #26242, for the following points: - [ ] Add test in test_merge_asof.py that checks that merge_asof() raises an exception if called on unordered categories for a merge key. - [ ] Remove docstrings in tests in test_merge_asof.py - [ ] Add test to check that merge_asof works when one of the merge keys is an ordered categorical - [ ] Raise MergeError sooner (rather than the ValueError that happens later) when asked to do a merge and one of the keys is an unordered categorical. - [ ] use repr(x.dtype) instead of x.dtype as input to .format() in exception messages everywhere in merge.py

Comment From: chrish42

There's a fifth potential cleanup. Currently, most of the tests use the "ticker data" example with a merge_asof(trades, quotes, on='time', by='ticker'), but with the "ticker" column left as object dtype. This leaves on the table a large number of tests that could exercise the by=[category dtype] code paths.

I see potentially two options here: 1. Just convert the "ticker" column to be a categorical. Pro: one-line change. Con: many fewer tests for the by=[object dtype] case. 2. Refactor the ticker data to be a pytest parametrized fixture, and then parametrize that fixture to generate both "object" and "category" variants for the "trades" and "quotes" dataframes. Pro: both cases now tested. Con: bigger change.

What do you think? If option 2 sounds good, now would be a good time for me to do it at least, given that I've been looking at that code lately (because I've been doing asof merges on dataframes that contain a lot of categorical data).

Comment From: chrish42

Maybe this would also be a good first step to then work on #16454?

Comment From: gfyoung

I like option 2, as it's more comprehensive.

Comment From: chrish42

@gfyoung Me too, especially since I'm interested in having merge_asof() working better with categorical data. Just looking for some approval that this direction is good with the Pandas developers before reworking a bunch of the tests there., and also that it's okay to use the more advanced pytest features in tests for Pandas.

Comment From: WillAyd

@chrish42 I'd say go for it! We use purest parametrized fixtures in other places as well so nothing out of the ordinary

Comment From: mroeschke

I think this has been cleaned up over the years so closing. Can open individual tickets for remaining issues is something isnt addressed