Pandas version checks

  • [x] I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.set_index.html

Documentation problem

When set_index() is applied to a DataFrame whose index was already set from a data column (that is, not the default index), assigning another data column as the index replaces the existing index, and the data that was held in it is dropped from the DataFrame.

While I find this behaviour inappropriate, I understand that reset_index() should be used before set_index(), in which case the original index column may be preserved.
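A minimal sketch of the behaviour described above, using a made-up three-column frame (the column names `a`, `b`, `c` and their values are assumptions for illustration only, not taken from any real dataset):

```python
import pandas as pd

# Hypothetical example data; column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"], "c": [10.0, 20.0, 30.0]})

# Step 1: make column "a" the index. "a" is no longer a regular column.
indexed = df.set_index("a")

# Step 2: set a different column as the index. The old index ("a") is
# replaced, so its values are no longer present anywhere in the result.
replaced = indexed.set_index("b")
print(list(replaced.columns))   # "a" is not among the columns

# Workaround: move the current index back into a column first.
preserved = indexed.reset_index().set_index("b")
print(list(preserved.columns))  # "a" is kept as a regular column
```

With the default `drop=False`, `reset_index()` reinserts the old index as a column before `set_index()` replaces it, which is why `a` survives in the second case.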

The problem is that the documentation for set_index() does not mention this at all, so the user is left to discover the problem and then the way to avoid it.

Suggested fix for documentation

Add a comment in the set_index documentation to clarify that setting a data column as index, when there is already a different data column serving as index, will delete that data column, unless reset_index is performed first.

Comment From: SaraInCode

take

Comment From: rhshadrach

Thanks for the report!

> The problem is that the documentation for set_index() does not mention this at all, so the user is left to discover the problem and then the way to avoid it.

The documentation states:

> The index can replace the existing index or expand on it.

Doesn't this count as a mention?
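For reference, the "expand on it" half of that sentence corresponds to the `append` parameter of `set_index`. A short sketch, again with a made-up frame (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical example data; column names are illustrative only.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"], "c": [10.0, 20.0, 30.0]})
indexed = df.set_index("a")

# Default (append=False): the new index replaces the existing one.
replaced = indexed.set_index("b")
print(list(replaced.index.names))

# append=True: the new column is added as an extra index level,
# so the existing index ("a") is kept alongside "b".
expanded = indexed.set_index("b", append=True)
print(list(expanded.index.names))

# The original level can later be moved back into a column.
restored = expanded.reset_index("a")
print(list(restored.columns))
```

With `append=True` nothing is lost, but the result has a MultiIndex, which may or may not be what the caller wants.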