• [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] (optional) I have confirmed this bug exists on the master branch of pandas.


I have observed a few issues when constructing a DataFrame from a list of Series with MultiIndex indices, if any of those indices contains the entry (NaN, NaN).

Code Sample, a copy-pastable example

  1. In this case I would expect the columns of the dataframe to be the union of the indices of series1 and series2, but ("a", "b") is missing.
import pandas as pd
series1 = pd.Series((1,), index=pd.MultiIndex.from_tuples(((None, None),)))
series2 = pd.Series((10, 20), index=pd.MultiIndex.from_tuples(((None, None), ("a", "b"))))
df = pd.DataFrame([series1, series2])
#   NaN
#   NaN
# 0   1
# 1  10
  2. There is another issue when series1 and series2 are flipped. The columns appear to be correct, but in this case I would expect the value NaN at index 1, column ("a", "b"):
df = pd.DataFrame([series2, series1])
#     a NaN
#     b NaN
# 0  20  10
# 1   1   1

Problem description

Expected Output

df = pd.DataFrame([series1, series2])
#      a  NaN
#      b  NaN
# 0  NaN    1
# 1   20   10

and the reverse case

df = pd.DataFrame([series2, series1])
#      a  NaN
#      b  NaN
# 0   20   10
# 1  NaN    1
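
For a regression test it can help to have the expected output as a concrete object rather than as printed text. The following is a minimal sketch that builds both expected frames by hand. The names `expected_forward` and `expected_reverse` are illustrative only, and the column order and float dtype are assumptions rather than something the report specifies.

```python
import numpy as np
import pandas as pd

# Columns as described in the report: ("a", "b") and (NaN, NaN).
# None is stored as a missing label in each level.
columns = pd.MultiIndex.from_tuples([("a", "b"), (None, None)])

# Expected result for pd.DataFrame([series1, series2]):
# series1 has no value for ("a", "b"), so row 0 holds NaN there.
expected_forward = pd.DataFrame([[np.nan, 1.0], [20.0, 10.0]], columns=columns)

# Expected result for pd.DataFrame([series2, series1]):
# same columns, rows swapped.
expected_reverse = pd.DataFrame([[20.0, 10.0], [np.nan, 1.0]], columns=columns)

print(expected_forward)
print(expected_reverse)
```

A fix could then be checked by comparing the constructor result against these frames, for example with `pd.testing.assert_frame_equal`, possibly after reordering columns.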

This behavior is not limited to the case where both series contain the MultiIndex entry (NaN, NaN). Here is an example where only one series is constructed with (None, None); the column ("a", "b") is again missing.

import pandas as pd
series1 = pd.Series((1,), index=pd.MultiIndex.from_tuples((("a", None),)))
series2 = pd.Series((10, 20), index=pd.MultiIndex.from_tuples(((None, None), ("a", "b"))))
df = pd.DataFrame([series1, series2])
#      a   NaN
#    NaN   NaN
# 0  1.0   NaN
# 1  NaN  10.0
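
Until this is fixed, one possible way to sidestep the problem is to avoid missing labels in the index altogether. The sketch below repeats the first example with a placeholder string instead of None. The `MISSING` constant is a hypothetical stand-in chosen for illustration, not a pandas feature, and this approach only fits cases where the placeholder cannot collide with a real label.

```python
import pandas as pd

# Hypothetical placeholder standing in for a missing label (an assumption,
# not part of pandas).
MISSING = "<missing>"

series1 = pd.Series((1,), index=pd.MultiIndex.from_tuples(((MISSING, MISSING),)))
series2 = pd.Series(
    (10, 20),
    index=pd.MultiIndex.from_tuples(((MISSING, MISSING), ("a", "b"))),
)

df = pd.DataFrame([series1, series2])

# With no NaN labels involved, the columns are the union of both indices,
# and the entry that series1 does not define is filled with NaN.
print(df)
```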

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : f346574c8bb8921664f6d47d2e4b0f1252e181b9
python : 3.9.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.59-1-MANJARO
Version : #1 SMP PREEMPT Sun Aug 15 13:11:32 UTC 2021
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.utf8
LOCALE : en_US.UTF-8
pandas : 0+untagged.1.gf346574
numpy : 1.21.1
pytz : 2021.1
dateutil : 2.8.2
pip : 21.1.3
setuptools : 57.4.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.26.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

Comment From: simonjayhawkins

Thanks @sziem for the report.

There appears to be a change of behaviour here since 1.2.5 although the results on 1.2.5 look iffy.

>>> df = pd.DataFrame([series1, series2])
/home/simon/miniconda3/envs/pandas-1.2.5/lib/python3.9/site-packages/pandas/core/indexes/multi.py:3587: RuntimeWarning: The values in the array are unorderable. Pass `sort=False` to suppress this warning.
  uniq_tuples = lib.fast_unique_multiple([self._values, other._values], sort=sort)
>>> print(df)
  NaN       a
  NaN NaN   b
0   1   1   1
1  10  10  20
>>> 

However the last example does appear to give the correct result with 1.2.5?

>>> df = pd.DataFrame([series1, series2])
/home/simon/miniconda3/envs/pandas-1.2.5/lib/python3.9/site-packages/pandas/core/indexes/multi.py:3587: RuntimeWarning: The values in the array are unorderable. Pass `sort=False` to suppress this warning.
  uniq_tuples = lib.fast_unique_multiple([self._values, other._values], sort=sort)
>>> print(df)
     a   NaN     a
   NaN   NaN     b
0  1.0   NaN   1.0
1  NaN  10.0  20.0
>>> 

will label as a regression for the last case pending further investigation.

Comment From: simonjayhawkins

> However the last example does appear to give the correct result with 1.2.5?

maybe not. df.loc[0, ("a", "b")] should be NaN I think.

Comment From: phofl

This is related to #37222

We changed the code paths we are running through for 1.3

Comment From: simonjayhawkins

changing milestone to 1.3.5

Comment From: jreback

good to fix but not for 1.3.x

Comment From: simonjayhawkins

moving off 1.3.x milestone