Pandas isnull always returns False if input is np.ndarray with string inside

Code Sample, a copy-pastable example if possible

pd.isnull(np.array([np.nan, 'asdf', np.nan]))

Problem description

Documentation shows that pd.isnull should be able to take an np.ndarray and operate on it. This means that np.nan values in the array should return True. However, when a string is present, np.nan values return False.

Expected Output

Observed behaviour

In [3]: pd.isnull(np.array([np.nan, 'asdf', np.nan]))
Out[3]: array([False, False, False], dtype=bool)

Expected Behaviour

In [3]: pd.isnull(np.array([np.nan, 'asdf', np.nan]))
Out[3]: array([True, False, True], dtype=bool)

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 36.0.1
Cython: None
numpy: 1.13.0
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.3
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: TomAugspurger

This is a numpy thing you need to watch out for. The float np.nan is coerced to the string 'nan' by the np.array constructor.

In [18]: x = np.array([np.nan, 'asdf'])

In [21]: x
Out[21]:
array(['nan', 'asdf'],
      dtype='<U32')

In [22]: type(x[0])
Out[22]: numpy.str_

You need to pass the dtype if you want to disable the type inference numpy does:

In [29]: x = np.array([np.nan, 'asdf'], dtype=object)

In [30]: x
Out[30]: array([nan, 'asdf'], dtype=object)

In [31]: pd.isnull(x)
Out[31]: array([ True, False], dtype=bool)
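
For anyone who wants to reproduce the two cases side by side, here is a minimal standalone sketch of the sessions above, using the three-element array from the original report. The variable names (`coerced`, `preserved`, and the two masks) are only illustrative, and the exact printed dtype may vary between NumPy versions.

```python
import numpy as np
import pandas as pd

# Without an explicit dtype, NumPy infers a common string dtype,
# so the float nan is stored as the text 'nan'.
coerced = np.array([np.nan, 'asdf', np.nan])
coerced_is_string = coerced.dtype.kind == 'U'
coerced_mask = pd.isnull(coerced)

# With dtype=object the original Python objects are kept,
# so the float nan survives and pd.isnull can detect it.
preserved = np.array([np.nan, 'asdf', np.nan], dtype=object)
preserved_mask = pd.isnull(preserved)

print(coerced.dtype, coerced_mask.tolist())
print(preserved.dtype, preserved_mask.tolist())
```

Checking `arr.dtype.kind == 'U'` before calling pd.isnull is one way to notice that the missing values have already been turned into strings.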
