Code Sample, a copy-pastable example if possible
```python
import pandas as pd

data = pd.DataFrame({'ab': ['A', 'B', 'A', 'A', 'B'], 'num': ['01', '02', '01', '01', '01']})
a_replacements = {'num': {'01': 'funny', '02': 'serious'}}
b_replacements = {'num': {'01': 'beginning', '02': 'end'}}
data[data.ab == 'A'].replace(inplace=True, to_replace=a_replacements)
```
Problem description
The reason I have the mask (`data.ab == 'A'`) is that there are two levels for A and two levels for B. If I were to run `data.replace(inplace=True, to_replace=a_replacements)`, then for rows where `ab == 'B'` the column `num` would be encoded as `funny` or `serious` instead of `beginning` or `end`.
Expected Output
My thought is that the last line of the above code block should result in a data frame that has all the 01's and 02's replaced for all rows that have 'ab' == 'A'. But this does not work. No exception is thrown. Not quite sure what I'm doing wrong.
Output of pd.show_versions()
<details>

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.5
boto: None
pandas_datareader: None
```
</details>
**Comment From: TomAugspurger**
Copy-pastable example:
```python
In [28]: data = pd.DataFrame({"num": ['01', '02']})
In [29]: data[[True, False]].replace({"num": {"01": "funny", "02": "begining"}}, inplace=True)
/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/generic.py:3664: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
regex=regex)
In [30]: data[[True, True]].replace({"num": {"01": "funny", "02": "begining"}}, inplace=True)
In [31]: data
Out[31]:
  num
0  01
1  02
```
So I think `In[29]` shows why this isn't working for you. Your slice `data[data.ab == 'A']` may be a copy, and so your inplace replace is operating on a copy, not the original, so it looks like it's not working.
The potential bug here is that the all-True mask in `In [30]` didn't raise the `SettingWithCopyWarning`.
As you can see, you're probably better off not using `inplace`.
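To make the copy behaviour concrete, here is a minimal sketch using the data from the original report. It assumes that boolean-mask indexing returns a new DataFrame; the `subset` name and the warning filter are only there for illustration.

```python
import warnings

import pandas as pd

data = pd.DataFrame({"ab": ["A", "B", "A", "A", "B"],
                     "num": ["01", "02", "01", "01", "01"]})
a_replacements = {"num": {"01": "funny", "02": "serious"}}

# Boolean indexing builds a new DataFrame, so inplace=True mutates that
# object and leaves `data` itself untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence SettingWithCopyWarning, if emitted
    subset = data[data.ab == "A"]
    subset.replace(a_replacements, inplace=True)

print(subset["num"].tolist())  # the selected rows were modified
print(data["num"].tolist())    # the original frame was not
```

The replacement does happen, just on an object that is thrown away in the one-line version from the report.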
**Comment From: jreback**
This is not a correct usage. As the warning indicates, it *might* work, but it is not idiomatic.
```python
In [4]: data[data.ab == 'A'].replace(inplace=True, to_replace=a_replacements)
/Users/jreback/pandas/pandas/core/generic.py:3664: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  regex=regex)
```
Instead, use this pattern. The rhs is aligned to the lhs, which is what pandas does for you by default. (You can also use ``.loc[data.ab=='A'].replace(...)`` on the rhs if it's clearer.)
```python
In [14]: data.loc[data.ab=='A'] = data.replace(to_replace=a_replacements)

In [15]: data
Out[15]:
  ab    num
0  A  funny
1  B     02
2  A  funny
3  A  funny
4  B     01
```
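As a variation on the pattern above, the following sketch restricts both sides of the assignment to the masked rows of the `num` column and handles both levels of `ab`. The flat `a_map` and `b_map` dictionaries are illustrative names, not part of the original report, which nests the mappings under the column name.

```python
import pandas as pd

data = pd.DataFrame({"ab": ["A", "B", "A", "A", "B"],
                     "num": ["01", "02", "01", "01", "01"]})

# Hypothetical per-level mappings for the `num` column.
a_map = {"01": "funny", "02": "serious"}
b_map = {"01": "beginning", "02": "end"}

# Assign back through .loc so the original frame is updated; the rhs
# Series carries the same index labels as the rows selected on the lhs.
mask_a = data["ab"] == "A"
data.loc[mask_a, "num"] = data.loc[mask_a, "num"].replace(a_map)

mask_b = data["ab"] == "B"
data.loc[mask_b, "num"] = data.loc[mask_b, "num"].replace(b_map)

print(data)
```

Selecting only the `num` column keeps the assignment from touching `ab`, which matters if a replacement key could also appear in another column.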
**Comment From: alokshenoy**
Thanks for the solution. It works well when I'm replacing values in `num` for each case (`ab == 'A'`, `ab == 'B'`).
Let's say I have 26 cases (`ab == 'C'` ... `ab == 'Z'`) and I use a for loop to iterate through them. I then get a TypeError:
`TypeError: cannot replace ['a_replacements'] with method pad on a DataFrame`
code:
```python
for letter in data.ab.unique():
    data.loc[data.ab == letter] = data.replace(to_replace=letter+"_replacements")
```
To which I get:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-96-acd3197ceef4> in <module>()
      1 for letter in data.ab.unique():
      2     print(letter.lower()+"_replacements")
----> 3     data.loc[data.ab == letter] = data.replace(to_replace=letter.lower()+"_replacements")

/Users/alokshenoy/.pyenv/versions/miniconda3-latest/lib/python3.6/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method, axis)
   3427             if isinstance(to_replace, (tuple, list)):
   3428                 return _single_replace(self, to_replace, method, inplace,
-> 3429                                        limit)
   3430
   3431         if not is_dict_like(to_replace):

/Users/alokshenoy/.pyenv/versions/miniconda3-latest/lib/python3.6/site-packages/pandas/core/generic.py in _single_replace(self, to_replace, method, inplace, limit)
     70     if self.ndim != 1:
     71         raise TypeError('cannot replace {0} with method {1} on a {2}'
---> 72                         .format(to_replace, method, type(self).__name__))
     73
     74     orig_dtype = self.dtype

TypeError: cannot replace ['a_replacements'] with method pad on a DataFrame
```
Also tried the rhs = lhs way, and that throws the same error. Curious as to what changes once inside the for loop?
**Comment From: jreback**
@alokshenoy you should ask on Stack Overflow
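For reference, a likely reading of the traceback above: `letter.lower() + "_replacements"` evaluates to the string `'a_replacements'`, not to the dictionary bound to the variable of that name, so `replace` receives a plain string with no replacement value. A minimal sketch of one way around this is to keep the mappings in a dictionary keyed by level. The `replacements` name and its flat per-level layout are assumptions made for illustration.

```python
import pandas as pd

data = pd.DataFrame({"ab": ["A", "B", "A", "A", "B"],
                     "num": ["01", "02", "01", "01", "01"]})

# Hypothetical lookup table: one mapping per level of `ab`.
replacements = {
    "A": {"01": "funny", "02": "serious"},
    "B": {"01": "beginning", "02": "end"},
}

for letter in data["ab"].unique():
    mask = data["ab"] == letter
    # Look the mapping up by key instead of building a variable name as a string.
    data.loc[mask, "num"] = data.loc[mask, "num"].replace(replacements[letter])

print(data)
```

With 26 levels, the same loop works unchanged as long as `replacements` has an entry for every value in `ab`.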