Code Sample, a copy-pastable example if possible
pd.isnull(np.array([np.nan, 'asdf', np.nan]))
Problem description
Documentation shows that pd.isnull should be able to take an np.ndarray and operate on it. This means that np.nan values in the array should return True. However, when a string is present, np.nan values return False.
Expected Output
Observed behaviour
In [3]: pd.isnull(np.array([np.nan, 'asdf', np.nan]))
Out[3]: array([False, False, False], dtype=bool)
Expected Behaviour
In [3]: pd.isnull(np.array([np.nan, 'asdf', np.nan]))
Out[3]: array([True, False, True], dtype=bool)
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 36.0.1
Cython: None
numpy: 1.13.0
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.3
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
Comment From: TomAugspurger
This is a numpy thing you need to watch out for. The float np.nan is coerced to the string 'nan' by the np.array constructor.
In [18]: x = np.array([np.nan, 'asdf'])
In [21]: x
Out[21]:
array(['nan', 'asdf'],
dtype='<U32')
In [22]: type(x[0])
Out[22]: numpy.str_
You need to pass dtype=object if you want to disable the type inference numpy does:
In [29]: x = np.array([np.nan, 'asdf'], dtype=object)
In [30]: x
Out[30]: array([nan, 'asdf'], dtype=object)
In [31]: pd.isnull(x)
Out[31]: array([ True, False], dtype=bool)
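For anyone who wants to reproduce both cases side by side, here is a minimal sketch using the three-element array from the original report. The variable names (values, coerced, preserved) are only for illustration, and the exact width of the string dtype may differ between numpy versions, so the sketch checks only the dtype kind.

```python
import numpy as np
import pandas as pd

values = [np.nan, "asdf", np.nan]

# Without an explicit dtype, numpy infers a unicode string dtype,
# so the float nan is stored as the string 'nan'.
coerced = np.array(values)
print(coerced.dtype.kind)    # 'U' (unicode string)
print(repr(coerced[0]))      # the string 'nan', not a float
print(pd.isnull(coerced))    # [False False False]

# With dtype=object, each element keeps its original Python type,
# so the float nan values are still recognised as missing.
preserved = np.array(values, dtype=object)
print(pd.isnull(preserved))  # [ True False  True]
```

The string 'nan' is an ordinary non-missing string as far as pd.isnull is concerned, which is why the first result is all False.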