Currently, the Categorical NA value is hard-coded to np.nan. I propose that CategoricalDtype.na_value take its value from CategoricalDtype.categories.dtype.na_value if the categories are backed by an ExtensionArray, and otherwise fall back to np.nan.

Several parts of the Categorical code assume that the NA sentinel is np.nan. Those will have to be changed to use Categorical.dtype.na_value instead.
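
A minimal sketch of the proposed lookup, to make the fallback rule concrete. The helper name `proposed_na_value` is hypothetical and only illustrates the idea; it is not existing pandas API, and the real change would live inside CategoricalDtype itself.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def proposed_na_value(dtype: pd.CategoricalDtype):
    """Hypothetical helper: pick the NA sentinel from the categories' dtype.

    Assumption: this mirrors the rule proposed in this issue, not current
    pandas behaviour.
    """
    categories = dtype.categories
    # Extension dtypes (e.g. Int64) expose their own na_value.
    if categories is not None and isinstance(categories.dtype, ExtensionDtype):
        return categories.dtype.na_value
    # NumPy-backed categories (or no categories at all) fall back to np.nan.
    return np.nan


# Categories backed by a masked extension array vs. a plain NumPy array.
ea_dtype = pd.CategoricalDtype(pd.array([1, 2], dtype="Int64"))
np_dtype = pd.CategoricalDtype([1, 2])

ea_sentinel = proposed_na_value(ea_dtype)
np_sentinel = proposed_na_value(np_dtype)
```

With this rule, code in Categorical that currently writes np.nan directly would ask the dtype for its sentinel instead.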

Comment From: rhshadrach

Duplicate of #50711 (in particular https://github.com/pandas-dev/pandas/issues/50711#issuecomment-1422910080) I think.

Comment From: topper-123

Yes, it's the same idea that @jorisvandenbossche mentions there. I think that's also the final conclusion for #50711, i.e. to follow the nan-behaviour of the underlying categories (np.nan for numpy arrays, pd.NA for extension arrays).

I'm ok with closing this to avoid the duplicate.