import pandas as pd

# With duplicate columns
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abb'))
result = df.drop(columns='a').align(df, join="inner", copy=False)
print(result[0])
#    b  b  b  b
# 0  2  2  3  3
# 1  5  5  6  6

# Without duplicate columns
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abc'))
result = df.drop(columns='a').align(df, join="inner", copy=False)
print(result[0])
#    b  c
# 0  2  3
# 1  5  6

# With duplicate columns but without the drop(columns=...)
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abb'))
result = df.align(df, join="inner", copy=False)
print(result[0])
#    a  b  b
# 0  1  2  3
# 1  4  5  6

Duplicate columns get duplicated again in alignment: the first example returns four b columns instead of two. If `self` and `other` have the same columns, a different code path is taken and the bug does not appear.
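
One way to picture how the duplicates could multiply is a label join that pairs every matching occurrence on the left with every matching occurrence on the right. The sketch below is pure Python and only an illustration under that assumption. The helper `inner_join_labels` is hypothetical and is not pandas' actual implementation, though it does reproduce the labels and the first row of values reported above.

```python
def inner_join_labels(left, right):
    """Pair every left occurrence of a label with every right occurrence.

    Hypothetical helper for illustration only; not part of pandas.
    """
    pairs = []
    for i, left_label in enumerate(left):
        for j, right_label in enumerate(right):
            if left_label == right_label:
                pairs.append((i, j))
    return pairs


left = ["b", "b"]        # columns of df.drop(columns='a')
right = ["a", "b", "b"]  # columns of df

pairs = inner_join_labels(left, right)
joined = [left[i] for i, _ in pairs]
left_positions = [i for i, _ in pairs]

row0 = [2, 3]  # first row of df.drop(columns='a')
row0_aligned = [row0[i] for i in left_positions]

print(joined)        # ['b', 'b', 'b', 'b']
print(row0_aligned)  # [2, 2, 3, 3]
```

Two b labels on each side give 2 x 2 = 4 matched pairs, which is consistent with the four b columns in the first example.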
Comment From: asadullahnaeem
# With duplicate columns
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list('abb'))
result = df.drop(columns='a').align(df, join="inner", copy=False)
print(result[0])
#    b  b
# 0  2  3
# 1  5  6
In this case, is this the expected output?
Comment From: rhshadrach
@asadullahnaeem - yes, that's correct.
Comment From: asadullahnaeem
take