I was looking to replace all np.nan values in a DataFrame with None. I tried to do this using fillna, but it seems this is not supported through fillna (though you can use where):
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame([[np.nan, 1], [1, np.nan]])
In [4]: df.fillna(value=None)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-f459dbd09698> in <module>()
----> 1 df.fillna(value=None)
/Users/yoshikivazquezbaeza/.virtualenvs/ipy/lib/python2.7/site-packages/pandas/core/frame.pyc in fillna(self, value, method, axis, inplace, limit, downcast, **kwargs)
2530 axis=axis, inplace=inplace,
2531 limit=limit, downcast=downcast,
-> 2532 **kwargs)
2533
2534 @Appender(_shared_docs['shift'] % _shared_doc_kwargs)
/Users/yoshikivazquezbaeza/.virtualenvs/ipy/lib/python2.7/site-packages/pandas/core/generic.pyc in fillna(self, value, method, axis, inplace, limit, downcast)
2522 if value is None:
2523 if method is None:
-> 2524 raise ValueError('must specify a fill method or value')
2525 if self._is_mixed_type and axis == 1:
2526 if inplace:
ValueError: must specify a fill method or value
This is what I really was expecting to see:
In [6]: df.where((pd.notnull(df)), None)
Out[6]:
      0     1
0  None     1
1     1  None
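A minimal sketch of the same workaround written as a script, in case it helps others reproduce it. The explicit astype(object) step is an assumption on my part and is not in the session above: on some later pandas releases a where call on float columns may keep the float dtype and turn None back into NaN, so casting to object first is a commonly used variant. The variable name result is only for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as in the session above: two float columns with one NaN each.
df = pd.DataFrame([[np.nan, 1], [1, np.nan]])

# Cast to object first so the columns are able to hold None,
# then keep the non-null cells and put None everywhere else.
result = df.astype(object).where(pd.notnull(df), None)

print(result)
print(result.dtypes)
```

Note that every column in result has object dtype, so numeric operations on it will be slower and NaN-aware methods no longer apply in the usual way.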
Info about my system:
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Darwin
OS-release: 14.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
pandas: 0.16.2
nose: 1.3.4
Cython: None
numpy: 1.9.2
scipy: 0.14.0
statsmodels: None
IPython: 3.0.0
sphinx: 1.2.2
patsy: None
dateutil: 2.4.2
pytz: 2015.4
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Comment From: jreback
In a float column, None is de facto represented by np.nan (and in most types), so this doesn't make any sense. See the docs here. None makes things into object dtype, which is rarely a good idea in any event.
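To illustrate the point about dtypes, here is a small sketch (the Series names s_float and s_obj are made up for the example). It shows that None passed into a float column is stored as np.nan, and that None is only kept as-is when the column has object dtype.

```python
import numpy as np
import pandas as pd

# With numeric data, None is coerced to NaN and the dtype stays float64.
s_float = pd.Series([1.0, None])
print(s_float.dtype)         # float64
print(s_float.iloc[1])       # nan

# Forcing object dtype preserves None, at the cost of a generic object column.
s_obj = pd.Series([1.0, None], dtype=object)
print(s_obj.dtype)           # object
print(s_obj.iloc[1] is None) # True

# Both representations are still reported as missing.
print(pd.isnull(s_float).tolist(), pd.isnull(s_obj).tolist())
```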
Comment From: ElDeveloper
I see, thanks for the explanation!
On (Aug-20-15|15:19), Jeff Reback wrote:
> In a float column, None is de facto represented by np.nan (and in most types), so this doesn't make any sense. See the docs here. None makes things into object dtype, which is rarely a good idea in any event.
Reply to this email directly or view it on GitHub: https://github.com/pydata/pandas/issues/10871#issuecomment-133196500