Code Sample, a copy-pastable example if possible

import matplotlib.pyplot as plt

def rename_levels(df, names, level=None):
    df.index = df.index.set_names(names, level=level)
    return df

(data.col1
        .pipe(lambda x: rename_levels(x, ['plot_name1', 'plot_name2']))
        .plot
        .barh(color=['#E10019', '#FFC300', '#69AF23'], stacked=True))

plt.show()

data.groupby(level='name1')

AssertionError: Level mpan not in index

Problem description

I have a dataframe that I want to plot from. I use method chaining so that I can plot a graph without affecting the original dataframe, data. I wanted to change the index-column-names, so I used the rename_levels function in a pipe call (it seems this is the only way to do this in a method chain).

All seemed fine until I used a groupby referencing the old index-column-name, which failed. It seems that the dataframe is being overwritten within the function, but I cannot figure out why, or whether this is intended/correct behaviour.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.2
pytest: 2.9.2
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.999999999
sqlalchemy: 1.0.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

Comment From: jorisvandenbossche

The reason pipe modifies your original dataframe in this case is that the function rename_levels you have written modifies the passed dataframe in place (df.index = df.index.set_names(..) assigns the renamed index to df itself, which is the same object the caller holds). If you want to avoid this, you could add a df.copy() in that function.
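As a minimal sketch of the df.copy() suggestion above: the small MultiIndex frame here is a hypothetical stand-in for the reporter's data (its values and the level name name2 are assumptions), and only rename_levels comes from the original report.

```python
import pandas as pd


def rename_levels(df, names, level=None):
    # Work on a copy so the object passed in by the caller is left untouched.
    out = df.copy()
    out.index = out.index.set_names(names, level=level)
    return out


# Hypothetical stand-in for the reporter's `data`.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["name1", "name2"]
)
data = pd.DataFrame({"col1": [10, 20, 30]}, index=index)

# The renamed frame is a new object; `data` keeps its original level names.
renamed = data.pipe(rename_levels, ["plot_name1", "plot_name2"])

# Grouping by the old level name still works on the original frame.
totals = data.groupby(level="name1").sum()
```

With the copy in place, the chained result carries the new level names while the original frame can still be grouped by name1.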

Comment From: JoshuaC3

Brilliant, thanks. I was 99% sure that this was the case but couldn't figure out why, or whether this was intended pandas functionality. I can see that this is just Python. What I don't understand is why it modifies it in place. Is it being treated as a global variable within the function?

Comment From: jorisvandenbossche

I can see that this is just Python. What I don't understand is why it modifies it in place. Is it being treated as a global variable within the function?

This is how Python works: a function receives a reference to the very object the caller passed, so if that object is mutable, any modification made inside the function is visible to the caller. No global variable is involved. See e.g. https://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference or https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/
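A small plain-Python sketch may make the distinction concrete. The function names mutate and rebind are made up for illustration. Assigning to df.index inside rename_levels behaves like the first case: it changes the object that was passed in, rather than rebinding a local name.

```python
def mutate(items):
    # Modifies the object the caller passed in; the caller sees the change.
    items.append(4)


def rebind(items):
    # Rebinds only the local name to a new list; the caller's list is untouched.
    items = items + [4]
    return items


a = [1, 2, 3]
mutate(a)       # a is now [1, 2, 3, 4]

b = [1, 2, 3]
c = rebind(b)   # b is still [1, 2, 3]; c is a new list [1, 2, 3, 4]
```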