Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
df = pd.DataFrame([[0, 0, None],
                   [0, 0, None],
                   [1, 1, None]],
                  columns=['A', 'B', 'C']).set_index(['A', 'B'])
# This succeeds
idx = (0,0)
df.set_value(idx,'C',5)
print(df)
# This fails
idx = (np.int64(0),np.int64(0))
df.set_value(idx,'C',8)
print(df)
Problem description
The code above performs a multi-row assignment through DataFrame.set_value() using two different index keys. In the first case the key is built from plain Python values, and the assignment succeeds. In the second case, which is superficially equal, the key is built from numpy.int64 integers (which I actually got back from a groupby call). This case fails.
Is this behavior expected or is this a bug?
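To make "superficially equal" concrete, here is a minimal sketch that compares the two keys outside of pandas. It uses only the values from the sample above; the variable names are illustrative. It does not explain why set_value treats the keys differently. It only shows that plain Python cannot tell them apart by equality or hash, while their element types differ.

```python
import numpy as np

plain_key = (0, 0)
numpy_key = (np.int64(0), np.int64(0))

# The two tuples compare equal and hash to the same value ...
keys_equal = plain_key == numpy_key
hashes_equal = hash(plain_key) == hash(numpy_key)

# ... but their elements are of different types (int vs. numpy.int64)
same_types = all(type(a) is type(b) for a, b in zip(plain_key, numpy_key))

print(keys_equal, hashes_equal, same_types)
```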
Expected Output
        C
A B
0 0     5
  0     5
1 1  None

        C
A B
0 0     8
  0     8
1 1  None
Output of pd.show_versions()
Comment From: chris-b1
This probably should work, but it is more idiomatic to use loc, which already handles this:
In [27]: df.loc[(np.int64(0),np.int64(0)), 'C'] = 5
In [28]: df
Out[28]:
        C
A B
0 0     5
  0     5
1 1  None
Comment From: chris-b1
xref #15269 - proposal to deprecate set_value
Comment From: jorisvandenbossche
Closing this, as we deprecated set_value/get_value (although at has the same behaviour; I would say that if you want to use numpy scalars, you need to use loc, keeping at as the more advanced but faster one).
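As a rough check of the loc recommendation, the sketch below builds the frame from the report twice and assigns through loc, once with a plain tuple key and once with a numpy.int64 key. It assumes a reasonably recent pandas release and has not been run against the version in the original report. make_frame is a hypothetical helper introduced here for illustration.

```python
import numpy as np
import pandas as pd


def make_frame():
    # Same data as in the report: the MultiIndex contains (0, 0) twice
    return pd.DataFrame(
        [[0, 0, None], [0, 0, None], [1, 1, None]],
        columns=["A", "B", "C"],
    ).set_index(["A", "B"])


# Assignment with a key made of plain Python ints
plain = make_frame()
plain.loc[(0, 0), "C"] = 8

# Assignment with a key made of numpy.int64 scalars
from_numpy = make_frame()
from_numpy.loc[(np.int64(0), np.int64(0)), "C"] = 8

print(from_numpy)

# Both key styles should leave the frames in the same state
same_result = plain.equals(from_numpy)
```

If same_result is true, both rows indexed by (0, 0) were updated regardless of the scalar type used in the key.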