After calling DataFrame.update() on a DataFrame that has a boolean column, the dtype of that column changes to object and filtering stops working, because the negation operator (~) produces a Series of -1 instead of True.
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['notset'] * 3, 'c': [False] * 3})
df.set_index('a', inplace=True)
df_update_from = pd.DataFrame({'a': [2, 3], 'b': ['updated2', 'updated3'], 'c': [False, False]})
df_update_from.set_index('a', inplace=True)
df.update(df_update_from)
~df.c
Expected Output
a
1    True
2    True
3    True
Name: c, dtype: bool
Actual Output
a
1    -1
2    -1
3    -1
Name: c, dtype: object
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-238.9.1.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US

pandas: 0.16.2
nose: 1.3.4
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 3.0.0
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2016.3
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.6.7
lxml: 3.4.2
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Comment From: jorisvandenbossche
@Sereger13 Thanks for the report.
This happens because update() reindexes the DataFrame passed to it to the index/columns of the calling DataFrame. If the two are not aligned, the reindexing introduces NaN values, which in turn can cause dtype changes. For example, you see this for integer dtypes as well:
In [28]: df2 = pd.DataFrame({'a': [1, 2, 3]})
In [29]: df_update_from2 = pd.DataFrame({'a': [20, 30]}, index=[1, 2])
In [30]: df2.dtypes
Out[30]:
a    int64
dtype: object
In [32]: df2.update(df_update_from2)
In [33]: df2
Out[33]:
      a
0   1.0
1  20.0
2  30.0
In [34]: df2.dtypes
Out[34]:
a    float64
dtype: object
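The reindexing step described above can be reproduced on its own, without calling update() at all. The following is a minimal sketch; the variable names are made up for illustration, and it only shows the alignment effect, not the exact internals of update():

```python
import pandas as pd

# Same shape as the integer example: the update frame only covers labels 1 and 2.
ints = pd.DataFrame({'a': [20, 30]}, index=[1, 2])
ints_aligned = ints.reindex([0, 1, 2])    # label 0 is missing, so it is filled with NaN
print(ints_aligned['a'].dtype)            # float64, because int64 cannot hold NaN

# Same shape as the boolean column in the report: label 1 is missing.
bools = pd.DataFrame({'c': [False, False]}, index=[2, 3])
bools_aligned = bools.reindex([1, 2, 3])
print(bools_aligned['c'].dtype)           # object, because bool cannot hold NaN
```

In both cases the dtype is already changed before any values are written back into the calling DataFrame.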
Maybe we have to check if the data can be cast again to the original dtypes after updating.
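Until something like that exists in pandas itself, the same idea can be tried on the user side. This is only a rough sketch of a workaround, not a proposed implementation: update_preserving_dtypes is a hypothetical helper name, and it assumes that silently keeping the changed dtype is acceptable when the cast back fails.

```python
import pandas as pd

def update_preserving_dtypes(df, other):
    """Hypothetical helper: call df.update(other), then try to restore original dtypes."""
    original_dtypes = df.dtypes.copy()
    df.update(other)
    for col, dtype in original_dtypes.items():
        if df[col].dtype != dtype:
            try:
                df[col] = df[col].astype(dtype)
            except (ValueError, TypeError):
                # Assumption: if the values no longer fit, keep the new dtype.
                pass
    return df

# The frames from the original report.
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['notset'] * 3, 'c': [False] * 3}).set_index('a')
df_update_from = pd.DataFrame(
    {'a': [2, 3], 'b': ['updated2', 'updated3'], 'c': [False, False]}
).set_index('a')

result = update_preserving_dtypes(df, df_update_from)
negated = ~result['c']   # boolean negation works again once the dtype is bool
print(result.dtypes)
print(negated)
```

Note that the cast back only happens after the fact, so it does not help if the intermediate dtype change already lost information.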
Comment From: jreback
This is a dupe of #4094. .update() is in general not very dtype friendly. Pull requests are welcome.