Pandas version checks

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

idx = pd.date_range("2025-01-29 01:36",periods=4,freq="1 min",unit="us")
ab = pd.DataFrame(index=idx,data=dict(a=[1,2,3,4],b=[2,2,2,2]))
cd = pd.DataFrame(index=idx[:3],data=dict(c=[9,8,7],d=[6,6,6]))

abcd = pd.concat([ab,cd],axis="columns")
print(abcd)
assert abcd.shape[0] == 4

Issue Description

The above example concatenates a 4-row DataFrame with a 3-row DataFrame along the columns, aligning on the index; the first three index values match exactly. The expected result is the following 4-row DataFrame, which is what you get with unit="ns":

                     a  b    c    d
2025-01-29 01:36:00  1  2  9.0  6.0
2025-01-29 01:37:00  2  2  8.0  6.0
2025-01-29 01:38:00  3  2  7.0  6.0
2025-01-29 01:39:00  4  2  NaN  NaN

Instead, with unit="us" the result is a 2-row DataFrame: the first row is correct, but the second row is all NaN and carries an index value (2025-01-29 18:16:00) that appears in neither input:

                       a    b    c    d
2025-01-29 01:36:00  1.0  2.0  9.0  6.0
2025-01-29 18:16:00  NaN  NaN  NaN  NaN

With unit="s", the result is similarly wrong, but with a different made-up row index:

                       a    b    c    d
2025-01-29 01:36:00  1.0  2.0  9.0  6.0
3926-05-28 12:16:00  NaN  NaN  NaN  NaN
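Since the result is correct with unit="ns", one possible workaround on affected versions is to convert both indexes to nanosecond resolution before concatenating. The following is only a sketch under that assumption: the names ab_ns and cd_ns are introduced here for illustration, and the approach has not been checked against every affected pandas version.

```python
import pandas as pd

idx = pd.date_range("2025-01-29 01:36", periods=4, freq="1 min", unit="us")
ab = pd.DataFrame(index=idx, data=dict(a=[1, 2, 3, 4], b=[2, 2, 2, 2]))
cd = pd.DataFrame(index=idx[:3], data=dict(c=[9, 8, 7], d=[6, 6, 6]))

# Assumed workaround (not part of the original report): align on
# nanosecond-resolution indexes, the case reported to behave correctly.
ab_ns = ab.set_axis(ab.index.as_unit("ns"), axis="index")
cd_ns = cd.set_axis(cd.index.as_unit("ns"), axis="index")

abcd = pd.concat([ab_ns, cd_ns], axis="columns")
print(abcd)
```

The obvious cost is that the result has a nanosecond index, so this only helps when all timestamps fit into the datetime64[ns] range.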

Expected Behavior

pd.concat returns the 4-row DataFrame shown above for every index unit, not only for unit="ns".

Installed Versions

I noticed the issue with the following versions (latest Python 3.13.2, latest pandas 2.2.3):

INSTALLED VERSIONS
------------------
commit : 0691c5cf90477d3503834d983f69350f250a6ff7
python : 3.13.2
python-bits : 64
OS : Darwin
OS-release : 24.3.0
Version : Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:16 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : None.UTF-8

pandas : 2.2.3
numpy : 2.1.3
pytz : 2025.1
dateutil : 2.9.0.post0
pip : 25.0.1
Cython : None
sphinx : None
IPython : 8.32.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.3
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.5
lxml.etree : None
matplotlib : 3.10.0
numba : 0.61.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 19.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.1
sqlalchemy : None
tables : None
tabulate : None
xarray : 2025.1.2
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.1
qtpy : None
pyqt5 : None

I also reproduced it on Python 3.11 with pandas 2.2.0.

Comment From: burnpanck

The issue appears with:

  • Both join="outer" and join="inner"

  • No timezone or tz="UTC"

  • Any order of the input DataFrames

  • Any kind of "trailing row index mismatch" (it is even worse with a "leading row index mismatch", where cd = pd.DataFrame(index=idx[1:],...))

But it does not appear if both indexes match exactly.
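The combinations above can be swept with a short script. This is a rough sketch rather than the exact code I ran: the helper concat_row_count and the expected dictionary are names made up for illustration, and it only covers the trailing-mismatch case. It prints the row count for each combination without assuming which ones fail on a given pandas version.

```python
import itertools

import pandas as pd


def concat_row_count(unit, join, tz):
    """Row count of the concatenated frame for one combination (hypothetical helper)."""
    idx = pd.date_range("2025-01-29 01:36", periods=4, freq="1 min", unit=unit, tz=tz)
    ab = pd.DataFrame(index=idx, data=dict(a=[1, 2, 3, 4], b=[2, 2, 2, 2]))
    cd = pd.DataFrame(index=idx[:3], data=dict(c=[9, 8, 7], d=[6, 6, 6]))
    return pd.concat([ab, cd], axis="columns", join=join).shape[0]


# Row counts a correct alignment should give: the union of the two
# indexes has 4 entries, the intersection has 3.
expected = {"outer": 4, "inner": 3}

for unit, join, tz in itertools.product(["ns", "us", "s"], ["outer", "inner"], [None, "UTC"]):
    rows = concat_row_count(unit, join, tz)
    status = "ok" if rows == expected[join] else "MISMATCH"
    print(f"unit={unit!r} join={join!r} tz={tz!r}: {rows} rows ({status})")
```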

Comment From: rhshadrach

Thanks for the report! I get the correct result on main:

idx = pd.date_range("2025-01-29 01:36",periods=4,freq="1 min",unit="us")
ab = pd.DataFrame(index=idx,data=dict(a=[1,2,3,4],b=[2,2,2,2]))
cd = pd.DataFrame(index=idx[:3],data=dict(c=[9,8,7],d=[6,6,6]))

abcd = pd.concat([ab,cd],axis="columns")
print(abcd)
#                      a  b    c    d
# 2025-01-29 01:36:00  1  2  9.0  6.0
# 2025-01-29 01:37:00  2  2  8.0  6.0
# 2025-01-29 01:38:00  3  2  7.0  6.0
# 2025-01-29 01:39:00  4  2  NaN  NaN

idx = pd.date_range("2025-01-29 01:36",periods=4,freq="1 min",unit="s")
ab = pd.DataFrame(index=idx,data=dict(a=[1,2,3,4],b=[2,2,2,2]))
cd = pd.DataFrame(index=idx[:3],data=dict(c=[9,8,7],d=[6,6,6]))

abcd = pd.concat([ab,cd],axis="columns")
print(abcd)
#                      a  b    c    d
# 2025-01-29 01:36:00  1  2  9.0  6.0
# 2025-01-29 01:37:00  2  2  8.0  6.0
# 2025-01-29 01:38:00  3  2  7.0  6.0
# 2025-01-29 01:39:00  4  2  NaN  NaN

There has been a lot of work on non-nanosecond resolutions, so I don't think it's unexpected that this is fixed on main. The fix will be released as part of pandas 3.0.