Pandas (pandas.indexes.base.Index == np.dtype('float64')) is True

Code Sample, a copy-pastable example if possible

I came across some weird behavior in the following code:

>>> import numpy as np
>>> import pandas as pd
>>> type(pd.Index(['a', 'b' , 'c'])) == np.dtype('float64')
True

The example can be simplified to:

>>> pd.Index == np.dtype('float64')
True

Problem description

The index has dtype object, yet its type compares equal to np.dtype('float64'). This does not happen for any other dtype.

>>> x = pd.Index(['a', 'b' , 'c'])
>>> print(repr(x.dtype))
dtype('O')
>>> print(type(x))
<class 'pandas.indexes.base.Index'>
>>> print(repr(np.dtype('float64')))
dtype('float64')

The comparison is False for every other dtype I tried, and for the basic numpy type:

>>> pd.Index == np.dtype('float128')
False
>>> pd.Index == np.dtype('int32')
False
>>> pd.Index == np.dtype('float32')
False
>>> # This is also false, it is only equal to a numpy dtype(), not a basic numpy type
>>> pd.Index == np.float64
False

In short, the pd.Index class itself compares equal to np.dtype('float64'). This seems like a side effect of how pd.Index is defined.

Perhaps this is a bug that should be fixed?

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-75-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 34.1.1
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.18.1
statsmodels: 0.8.0
xarray: None
IPython: 5.2.2
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: 2.45.0
pandas_datareader: None

Comment From: jreback

this has nothing to do with the actual dtype itself

In [5]: pd.Series == np.dtype('float64')
   ...: 
Out[5]: False

In [6]: pd.Index == np.dtype('float64')
   ...: 
Out[6]: True

We assign __eq__ and friends to various subclasses of Index to provide proper comparators. I guess np.dtype is slipping through here. Though to be honest, I am not sure why you are doing this.

Comment From: jreback

the issue is here

I think it needs an else clause after

                if isinstance(other, (np.ndarray, Index, ABCSeries)):
                    if other.ndim > 0 and len(self) != len(other):
                        raise ValueError('Lengths must match to compare')

Comment From: dsm054

I don't think that's right, because that comparison function should only be used on an instance of pd.Index. I think the real issue is on the numpy side, because

In [37]: np.dtype("float32") == None
Out[37]: False

In [38]: np.dtype("float64") == None
Out[38]: True

and the __eq__ for dtype objects seems to look at the .dtype attribute if it exists:

In [39]: print(pd.Index.dtype)
None

Comment From: Erotemic

@jreback, the reason I ran across this has to do with an old function I wrote to determine whether a variable is float-like, meaning it could be either a float or a numpy array of floats. At the time I didn't know dtypes have string identifiers, and if I rewrote the code today I would use isinstance and those identifiers. However, the function works well as a utility, even if it doesn't follow best practices, so I continue using it.
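
For reference, a float-like check along those lines might look like the sketch below. The name is_float_like is hypothetical, and the sketch assumes "float-like" means a Python or numpy float scalar, or a numpy array with a floating dtype. It uses isinstance and np.issubdtype instead of comparing with ==, so classes such as pd.Index are never coerced to a dtype.

```python
import numpy as np


def is_float_like(value):
    """Return True for float scalars and for arrays with a floating dtype.

    Hypothetical helper, not part of numpy or pandas.
    """
    # Python floats and numpy float scalars (np.float32, np.float64, ...).
    if isinstance(value, (float, np.floating)):
        return True
    # Arrays: inspect the dtype explicitly rather than comparing with ==.
    if isinstance(value, np.ndarray):
        return bool(np.issubdtype(value.dtype, np.floating))
    # Anything else, including classes and None, is not float-like.
    return False
```

With this, is_float_like(1.5) and is_float_like(np.array([1.0, 2.0])) are True, while an integer array, None, or a class object gives False.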

Comment From: chris-b1

Yes - this is a numpy issue (https://github.com/numpy/numpy/issues/2190#issuecomment-35576788), a symptom of there being an implicit default dtype, e.g.

In [62]: np.dtype(None)
Out[62]: dtype('float64')
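
A minimal sketch of that default, plus one way to sidestep it when comparing. The helper name same_dtype is hypothetical; it only accepts objects that are already np.dtype instances on the left-hand side, so None or a class can never be silently coerced into the default dtype.

```python
import numpy as np

# np.dtype(None) falls back to the implicit default dtype, float64.
default = np.dtype(None)
print(default)  # float64


def same_dtype(candidate, expected):
    """Compare dtypes without letting numpy coerce arbitrary objects.

    Hypothetical helper, not part of numpy or pandas.
    """
    # Only real dtype instances can match; None, classes, etc. never do.
    return isinstance(candidate, np.dtype) and candidate == np.dtype(expected)
```

Here same_dtype(np.dtype('float64'), 'float64') is True, while same_dtype(None, 'float64') is False regardless of how the numpy version in use handles == on non-dtype objects.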