Pandas Using DataFrame.rename changes the type of an index

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({1:['a','b','c'], 2:['d','e','f']})
print(df.index)
df2 = df.rename(index=str, columns={1:'col1', 2:'col2'})
print(df2.index)

Problem description

The issue is that, while df has an index of type RangeIndex (as expected), df2 has an index of type Index in which the original integers have been converted to strings. The same applies to a MultiIndex.

Expected Output

I expect the new dataframe, in which I only renamed a few columns, to have the same index.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 38.2.4
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.0
openpyxl: 2.4.2
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Comment From: jschendel

This is the expected behavior for the code you posted. You're passing the callable str as the function to be applied to the index, meaning all index values will be converted to strings. RangeIndex only supports integer-valued labels, so the index type must change to one that can hold object dtype, which is the base Index class.

If you only want to rename columns, and leave the index unchanged, don't pass anything to the index parameter: df2 = df.rename(columns={1:'col1', 2:'col2'}).

In [2]: df = pd.DataFrame({1:['a','b','c'], 2:['d','e','f']})

In [3]: df.index
Out[3]: RangeIndex(start=0, stop=3, step=1)

In [4]: df2 = df.rename(columns={1:'col1', 2:'col2'})

In [5]: df2.index
Out[5]: RangeIndex(start=0, stop=3, step=1)

In [6]: df2
Out[6]:
  col1 col2
0    a    d
1    b    e
2    c    f
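For contrast, here is a minimal sketch of both calls side by side, using the frame from the original report. The variable names `converted` and `columns_only` are illustrative only; the point is that passing `index=str` maps every label through `str()`, while omitting `index` leaves the RangeIndex alone.

```python
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c'], 2: ['d', 'e', 'f']})

# Passing the callable `str` runs str() on every index label,
# so the integer labels 0, 1, 2 become the strings '0', '1', '2'.
converted = df.rename(index=str, columns={1: 'col1', 2: 'col2'})

# Omitting `index` renames the columns only and keeps the RangeIndex.
columns_only = df.rename(columns={1: 'col1', 2: 'col2'})

print(type(df.index).__name__)            # RangeIndex
print(list(converted.index))              # ['0', '1', '2']
print(type(columns_only.index).__name__)  # RangeIndex
```

The exact dtype reported for the converted index may differ between pandas versions, so the sketch only inspects the labels themselves.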

Comment From: m3h0w

It may be worth changing the documentation on this, because it includes a misleading example that led both the OP and me to struggle with this.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rename.html

The example:

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
df.rename(index=str, columns={"A": "a", "B": "c"})
   a  c
0  1  4
1  2  5
2  3  6

The effect of passing str for index is not visible in the printed output, which makes the example misleading.
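One way to see what the printed frame hides is to inspect the labels directly and try a lookup. This is only a sketch built on the docs example above; the names `renamed`, `labels`, and `integer_lookup_works` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
renamed = df.rename(index=str, columns={"A": "a", "B": "c"})

# The printed frame looks identical, but the labels are now strings.
labels = list(renamed.index)
print(labels)  # ['0', '1', '2']

# Label-based lookup with the old integer label no longer finds a row.
try:
    renamed.loc[0]
    integer_lookup_works = True
except KeyError:
    integer_lookup_works = False
print(integer_lookup_works)  # False

# The string label is what the index holds after the rename.
string_lookup = renamed.loc["0", "a"]
print(string_lookup)  # 1
```

Checking `df.index` before and after a rename is a cheap way to catch this kind of silent change.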

Comment From: seanlseymour

After wasting the last three hours because of this 'disconnect', I agree strongly that the documentation needs to be modified here. I read it at least 10x, and it was not clear that the index type was being modified. As m3h0w points out, it's not clear from the output example that this is happening, and frankly, I don't think most people will want to do this! I just wanted to change column header names. This was painfully difficult to troubleshoot. Please change the docs to be clearer.

Comment From: jreback

@seanlseymour you are welcome to submit a PR to modify the documentation; this is open source
