I get a SettingWithCopyWarning after the use of dropna, which does not really seem logical:
In [102]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':['A', 'B', 'C']})
In [103]: df
Out[103]:
a b
0 1 A
1 2 B
2 NaN C
In [104]: df = df.dropna(subset=['a'])
In [105]: df
Out[105]:
a b
0 1 A
1 2 B
In [106]: df['new_col'] = 2
C:\Anaconda\envs\devel\Scripts\ipython-script.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':
In [107]: df.loc[:,'new_col'] = 2
c:\users\vdbosscj\scipy\pandas-joris\pandas\core\indexing.py:404: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item] = s
Furthermore, I think the location reported for the warning is not correct.
Comment From: jreback
works on master for me.
In [1]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':['A', 'B', 'C']})
In [2]: df = df.dropna(subset=['a'])
In [3]: df['new_col'] = 2
This would be a setting with copy, except that you are reassigning the reference and the original is garbage collected, so it's OK.
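To illustrate the garbage-collection point, here is a plain-Python sketch of how a weak reference to the parent goes dead once the name is rebound. `Frame` and `parent_ref` are hypothetical stand-ins, not pandas names, and this only mimics the idea, it is not pandas' actual check.

```python
import gc
import weakref


class Frame:
    """Hypothetical stand-in for a DataFrame; not a pandas class."""


df = Frame()

# The sliced result keeps only a weak reference to the frame it came from.
result = Frame()
result.parent_ref = weakref.ref(df)
alive_before = result.parent_ref() is not None

# Rebinding the name drops the last strong reference to the original frame.
df = result
gc.collect()
alive_after = df.parent_ref() is not None

print(alive_before, alive_after)
```

Once the weak reference no longer resolves to a live parent, there is nothing left that the assignment could have been meant for, which is why no warning is needed in that case.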
Comment From: jorisvandenbossche
Whoops, I was accidentally using 0.15.1, didn't check with 0.16.1 (or master). Indeed this does not raise!
Comment From: jorisvandenbossche
For the case you reassign to another name:
df2 = df.dropna(subset=['a'])
df2['new_col'] = 2
it does raise. But personally, I still find this a bit strange. As a user, I wouldn't expect my change to be applied to the original df (I expect dropna to take a copy). So in that sense, it is very strange to get this warning. Also, you get it with df2.loc[:, 'new_col'] = 2 as well, where the warning says to use .loc[...], which I am already doing.
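For reference, a minimal sketch of a workaround for the reassigned-name case, assuming an explicit copy is acceptable: calling .copy() on the result makes the intent clear, and the new column only ends up in df2.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan], 'b': ['A', 'B', 'C']})

# Take an explicit copy of the dropna result so df2 owns its own data.
df2 = df.dropna(subset=['a']).copy()

# Adding a column now only affects df2; the original df is left alone.
df2['new_col'] = 2

print(df.columns.tolist())
print(df2.columns.tolist())
```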
Comment From: jreback
.dropna is just a fancy name for .ix with an inverse indexer, e.g.
In [30]: df.ix[pd.notnull(df.a)]
Out[30]:
a b
0 1 A
1 2 B
In [31]: df.ix[pd.notnull(df.a)].is_copy
Out[31]: <weakref at 0x121aa7f70; to 'DataFrame' at 0x121a75c90>
We could make dropna return a copy, though it's really just a slicing operation, hence the warning.
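The equivalence described above can be checked with a short sketch. It uses .loc with a boolean mask rather than .ix, since .ix was removed in later pandas releases; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan], 'b': ['A', 'B', 'C']})

# Drop the rows where column 'a' is missing.
via_dropna = df.dropna(subset=['a'])

# Select the rows where column 'a' is not missing, using a boolean mask.
via_mask = df.loc[pd.notnull(df['a'])]

# Both approaches keep the same rows, index labels and dtypes.
print(via_dropna.equals(via_mask))
```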
Comment From: jreback
.dropna has inplace=False by default, so this might be a bug. Why don't you rename and reopen (and change the top example)?