Pandas version checks
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [x] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
df = pd.DataFrame({'a': [0, 1], 'b': [2, 3]})
s = pd.Series([4, 5], name='c')
pd.concat([df, s]) # columns are ['a', 'b', 'c']
pd.concat([df, s], ignore_index=True) # columns are ['a', 'b', 0]
pd.concat([df, s.to_frame()]) # columns are ['a', 'b', 'c']
pd.concat([df, s.to_frame()], ignore_index=True) # columns are ['a', 'b', 'c']
Issue Description
When I concatenate a dataframe with a series and pass ignore_index=True, the series' name does not show up in the resulting dataframe. This is surprising, because the documentation for ignore_index says, "Note the index values on the other axes are still respected in the join."
This seems similar to #56257. That concerned the case in which the name of the series is the same as the name of one of the columns of the dataframe but did not involve ignore_index. That issue has a stale PR (#56362) that looks like it might fix this one, too.
Expected Behavior
Doing pd.concat([df, s], ignore_index=True) should preserve the name of the series.
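Until that changes, one way to get the expected columns is the to_frame() route already shown in the reproducible example. This is a minimal sketch of that workaround, not a fix for the underlying behavior; the variable name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3]})
s = pd.Series([4, 5], name='c')

# to_frame() turns the named Series into a one-column DataFrame ('c'),
# so concat receives two DataFrames and keeps the column labels.
result = pd.concat([df, s.to_frame()], ignore_index=True)

print(list(result.columns))  # ['a', 'b', 'c']
print(list(result.index))    # [0, 1, 2, 3]
```

Only the row index is renumbered here; the column labels, including 'c', are left alone.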
Installed Versions
Comment From: speco29
It's true that you cannot keep the Series index without converting it into a DataFrame first. To combine a Series with a DataFrame while using ignore_index=True, without permanently changing the Series into a DataFrame, you can convert it temporarily within the pd.concat call. Here's how you can do it:
`import pandas as pd
df = pd.DataFrame({'a': [0, 1], 'b': [2, 3]})
s = pd.Series([4, 5], name='c')
# Concatenating DataFrame and Series with ignore_index=True
result = pd.concat([df, s.to_frame().T], ignore_index=True)
print(result) `
Comment From: rhshadrach
Thanks for the report, agreed with the expectation to maintain the Series name here. Further investigations and PRs to fix are welcome!
Comment From: Anurag-Varma
take
Comment From: Anurag-Varma
Edit:
Test case:
pandas/tests/reshape/concat/test_concat.py::TestConcatenate::test_concat_mixed_objs_columns
My earlier doubt about this test case is outdated, so I have edited this comment.
Comment From: rhshadrach
> I am thinking that the test case expected output is wrong here!
The expected output looks "correct" to me. When you don't use to_frame, concat is enumerating the Series with name None instead of always using 0. This seems to me to be a more desirable behavior than always using 0. Perhaps there are some improvements we could do here in avoiding conflicts, and perhaps one could argue that the resulting columns should be None, but having them all come out as 0 seems to me to be worse behavior than what we have today.
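To illustrate the distinction, here is a small sketch using only unnamed Series along axis=1. It is a simpler setup than the mixed DataFrame/Series case in the test, which may behave differently across pandas versions. The names s1, s2, enumerated, and framed are made up for this example.

```python
import pandas as pd

s1 = pd.Series([1, 2])
s2 = pd.Series([3, 4])

# Unnamed Series passed directly: concat numbers the resulting columns.
enumerated = pd.concat([s1, s2], axis=1)
print(list(enumerated.columns))  # [0, 1]

# to_frame() on an unnamed Series labels its single column 0, so both
# frames contribute a column called 0 and the labels collide.
framed = pd.concat([s1.to_frame(), s2.to_frame()], axis=1)
print(list(framed.columns))  # [0, 0]
```

With enumeration each column stays individually addressable, whereas the duplicated 0 labels in the second result make `framed[0]` return both columns.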
Comment From: Anurag-Varma
Thanks @rhshadrach
My bad, the row and column axes were flipped here and I didn't notice that before.
I have now changed the code and pushed a new commit to my PR.