Code Sample, a copy-pastable example if possible
>>> import numpy as np
>>> import pandas as pd
>>> ndt = np.dtype(object)
>>> pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
>>> pdt == ndt
False # ok
>>> ndt == pdt
TypeError: data type not understood
Problem description
The dtypes are not always comparable: the result depends on which operand comes first. The same issue occurs with IntervalDtype, and when the numpy dtype is of another kind (int, float, dates etc.).
The issue may be a numpy issue and not a pandas issue, but I raise it here for discussion first, and can later file an issue at the numpy repository.
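To check the claim about IntervalDtype and other numpy kinds, something like the following sketch can be used. The helper `try_eq` and the particular dtypes chosen are only illustrative assumptions, not part of pandas or numpy. The outcome of the numpy-first comparison depends on the installed numpy version, so no output is shown.

```python
import numpy as np
import pandas as pd

# A few numpy dtypes of different kinds (object, int, float, dates).
numpy_dtypes = [
    np.dtype(object),
    np.dtype("int64"),
    np.dtype("float64"),
    np.dtype("datetime64[ns]"),
]

# The two pandas extension dtypes mentioned in the report.
pandas_dtypes = [
    pd.api.types.CategoricalDtype(categories=["German", "English", "French"]),
    pd.IntervalDtype("int64"),
]


def try_eq(left, right):
    """Hypothetical helper: return left == right, or 'TypeError' if it raises."""
    try:
        return left == right
    except TypeError:
        return "TypeError"


# Map (numpy dtype, pandas dtype) -> (pandas-first result, numpy-first result).
results = {}
for ndt in numpy_dtypes:
    for pdt in pandas_dtypes:
        results[(str(ndt), str(pdt))] = (try_eq(pdt, ndt), try_eq(ndt, pdt))

for pair, outcome in results.items():
    print(pair, outcome)
```

The first element of each tuple is the `pdt == ndt` order, the second is the `ndt == pdt` order that raised in the sample above.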
Expected Output
Expected was False.
Output of pd.show_versions()
Comment From: jreback
You can see the original issue that I filed with numpy in 2014, https://github.com/numpy/numpy/issues/5329, as well as #8814.
This is a known numpy issue that won't be fixed :<
Comment From: jschendel
Note that pandas.core.dtypes.common.is_dtype_equal allows for safe comparisons between numpy and pandas dtypes:
In [1]: import numpy as np; import pandas as pd
In [2]: ndt = np.dtype(object)
In [3]: pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
In [4]: pd.core.dtypes.common.is_dtype_equal(ndt, pdt)
Out[4]: False
In [5]: pd.core.dtypes.common.is_dtype_equal(pdt, ndt)
Out[5]: False
Comment From: topper-123
Nice, thanks @jschendel. This solves my problem, though this will probably trip up some people from time to time.
BTW, this is also part of the public API as pd.api.types.is_dtype_equal
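For reference, a minimal sketch of the same comparison through the public entry point, reusing the dtypes from the original report. The variable names `forward`, `backward` and `same` are just illustrative.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_dtype_equal

ndt = np.dtype(object)
pdt = pd.api.types.CategoricalDtype(categories=["German", "English", "French"])

# Both argument orders are handled without raising.
forward = is_dtype_equal(pdt, ndt)
backward = is_dtype_equal(ndt, pdt)

# A dtype still compares equal to itself.
same = is_dtype_equal(pdt, pdt)

print(forward, backward, same)
```

Using the function instead of `==` means the result does not depend on which operand comes first.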
Comment From: adamczykm
I, and probably many others, have wasted time because of this unexpected behaviour when comparing dtypes with '=='. Why can't it be implemented in a numpy-compatible fashion -.- ?