Location of the documentation

https://dev.pandas.io/docs/reference/api/pandas.DataFrame.merge.html

Documentation problem

The documentation does not cover merging on an index level name via left_on, combined with either right_index=True or right_on.

Suggested fix for documentation

I'm a bit confused about the behavior here, because I don't know what to expect. When I perform the merge as shown below, I would expect that both operations return the same output.

import pandas as pd

# left has a MultiIndex (a, b); right is indexed by b only
left = pd.DataFrame({"a": [1, 2], "b": [1, 1], "l": [22, 23]}).set_index(["a", "b"])
right = pd.DataFrame({"b": [1], "r": [12]}).set_index(["b"])
print(pd.merge(left, right, left_on=["b"], right_index=True, how="left"))
print(pd.merge(left, right, left_on=["b"], right_on=["b"], how="left"))

But the indexes of the two results differ. The first merge keeps the index of the left DataFrame, while the second merge has only b as its index. As a result, the index of the second merge is no longer unique.

First Merge:

      l   r
a b        
1 1  22  12
2 1  23  12

Second Merge:

    l   r
b        
1  22  12
1  23  12
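
To compare the two results without relying on the printed layout, the index names and uniqueness can be inspected directly. This is only a small diagnostic sketch that reuses the frames from the example; the variable names first and second are mine, and what it prints may depend on the pandas version.

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2], "b": [1, 1], "l": [22, 23]}).set_index(["a", "b"])
right = pd.DataFrame({"b": [1], "r": [12]}).set_index(["b"])

# Same two merges as above, kept in variables for inspection
first = pd.merge(left, right, left_on=["b"], right_index=True, how="left")
second = pd.merge(left, right, left_on=["b"], right_on=["b"], how="left")

# Report the index level names and whether each index is unique
for name, result in [("first", first), ("second", second)]:
    print(name, list(result.index.names), result.index.is_unique)
```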

If this is the desired behavior, we should adjust the documentation to show it. If it is not, I will file a bug report. Either way, the documentation should be adjusted. I would add an example with an index join to show the correct behavior.
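
Until this is clarified, one way to get a predictable index is to avoid index-level keys entirely: move all index levels to columns, merge on columns, and set the index again afterwards. This is a sketch of a workaround under that assumption, not a statement of how merge is meant to treat index levels.

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2], "b": [1, 1], "l": [22, 23]}).set_index(["a", "b"])
right = pd.DataFrame({"b": [1], "r": [12]}).set_index(["b"])

# Move every index level to a column so this is a plain column-on-column merge
merged = pd.merge(left.reset_index(), right.reset_index(), on="b", how="left")

# Restore the left frame's index levels explicitly
result = merged.set_index(list(left.index.names))
print(result)
```

With this approach the resulting index no longer depends on which combination of left_on, right_on and right_index is used.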

Comment From: TomAugspurger

This is probably buggy, but I'm not entirely sure either. I think both should have the same index (probably that of the second result).

Comment From: bjornasm

Is there any progress here? It would be nice to have clarification on whether the DataFrames need to have the same index for a merge. I don't see why it isn't straightforward to merge on any chosen columns of the DataFrames, since you would expect that it's the content of the columns that has to match.