Pandas version checks

  • [X] I have checked that the issue still exists on the latest versions of the docs on main

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

Documentation problem

pandas.concat says (of the sort argument):

Sort non-concatenation axis if it is not already aligned when join is ‘outer’. This has no effect when join='inner', which already preserves the order of the non-concatenation axis.

My reading of this is that, when using join="inner", I should get the same result whether I pass sort=True or sort=False. But:

import pandas as pd

df1 = pd.DataFrame({"b": [1], "a": [2]})
df2 = pd.DataFrame({"a": [3], "c": [4], "b": [5]})

dfFalse = pd.concat([df1, df2], sort=False, join="inner")
dfTrue = pd.concat([df1, df2], sort=True, join="inner")

dfFalse == dfTrue # => ValueError("Can only compare identically-labeled DataFrames")

I was (perhaps incorrectly) expecting that sort= would have no effect.
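To make the difference concrete, here is a minimal sketch that reuses the two frames above and prints the column order of each result. The names unsorted and sorted_ are mine, and the reordering step at the end is only an illustration that the two results hold the same data in a different column order:

```python
import pandas as pd

df1 = pd.DataFrame({"b": [1], "a": [2]})
df2 = pd.DataFrame({"a": [3], "c": [4], "b": [5]})

# Inner join keeps only the columns shared by both frames ("a" and "b").
unsorted = pd.concat([df1, df2], sort=False, join="inner")
sorted_ = pd.concat([df1, df2], sort=True, join="inner")

print(list(unsorted.columns))  # ['b', 'a'] (order taken from df1)
print(list(sorted_.columns))   # ['a', 'b'] (sorted)

# Reordering the columns of one result to match the other shows that
# only the column order differs, not the data.
same_data = unsorted[sorted_.columns].equals(sorted_)
print(same_data)  # True
```

So sort=True does change the result under join="inner"; the change is limited to the ordering of the non-concatenation axis.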

Suggested fix for documentation

If this is not a code bug, I would suggest changing the docstring to say:

Sort non-concatenation axis if it is not already aligned.

This would indicate that the sort is applied regardless of the value of join.
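As a rough check of that wording, the sketch below lists the resulting column order for each combination of join and sort on the same two frames. The dictionary name columns_by_option is just for illustration, and this reflects behavior I observe rather than a documented guarantee:

```python
import pandas as pd

df1 = pd.DataFrame({"b": [1], "a": [2]})
df2 = pd.DataFrame({"a": [3], "c": [4], "b": [5]})

# Column order of the concatenated frame for every join/sort combination.
columns_by_option = {
    (join, sort): list(pd.concat([df1, df2], join=join, sort=sort).columns)
    for join in ("inner", "outer")
    for sort in (False, True)
}

for (join, sort), cols in columns_by_option.items():
    print(f"join={join!r}, sort={sort}: {cols}")
```

With sort=True, the columns come out sorted for both join="inner" and join="outer", which is consistent with the proposed docstring.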

Comment From: mroeschke

Thanks for the report. Agreed that "This has no effect when join='inner', which already preserves the order of the non-concatenation axis." is misleading and should be removed, as your example exhibits the correct behavior. We have a test, test_concat_inner_sort, that confirms the behavior you're seeing.