Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
print(f"{pd.isnull([1, 2, 3]) = }")
print(f"{pd.isnull((1, 2, 3)) = }")
Issue Description
It looks like pd.isnull is treating list as array-like and tuple as a scalar.
Expected Behavior
I'd expect lists and tuples to be treated similarly by pd.isnull, as they are in other parts of the API like pd.Series.
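For comparison, here is a small sketch of the pd.Series behaviour referred to above. The constructor accepts a list and a tuple interchangeably, so .isna() on the result is element-wise either way. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data only: the same values as a list and as a tuple.
values_list = [1.0, np.nan, 3.0]
values_tuple = (1.0, np.nan, 3.0)

from_list = pd.Series(values_list)
from_tuple = pd.Series(values_tuple)

# Series.equals treats NaNs in the same position as equal.
same_series = from_list.equals(from_tuple)

# Element-wise null mask, regardless of the input container type.
mask_from_tuple = from_tuple.isna().tolist()

print(same_series)
print(mask_from_tuple)
```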
Installed Versions
INSTALLED VERSIONS
------------------
commit : 2e218d10984e9919f0296931d92ea851c6a6faf5
python : 3.9.15.final.0
python-bits : 64
OS : Darwin
OS-release : 22.3.0
Version : Darwin Kernel Version 22.3.0: Mon Jan 30 20:42:11 PST 2023; root:xnu-8792.81.3~2/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.3
numpy : 1.24.0
pytz : 2022.6
dateutil : 2.8.2
setuptools : 59.8.0
pip : 22.3.1
Cython : None
pytest : 7.2.0
hypothesis : None
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.7.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : 2023.2.0
fsspec : 2022.11.0
gcsfs : None
matplotlib : 3.6.2
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.11.0
scipy : 1.9.3
snappy :
sqlalchemy : 1.4.46
tables : 3.7.0
tabulate : None
xarray : 2022.9.0
xlrd : None
xlwt : None
zstandard : None
tzdata : None
Comment From: DeaMariaLeon
This function returns a boolean or array-like of bool. I'll keep the "bug" label just in case, but I don't think it is one.
https://pandas.pydata.org/docs/reference/api/pandas.isnull.html
Comment From: jrbourbeau
Thanks @DeaMariaLeon. The thing that seems off to me is that pd.isnull is treating lists as array-like (returning an array-like of bools) and tuples as scalars (returning a bool):
In [1]: import pandas as pd
In [2]: pd.isnull([1, pd.NA, 3])
Out[2]: array([False, True, False])
In [3]: pd.isnull((1, pd.NA, 3))
Out[3]: False
My expectation is that both lists and tuples should be treated as array-like, though feel free to let me know if that expectation is incorrect.
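Until the behaviour is settled, one possible workaround is to convert tuples to lists before calling pd.isnull. The helper below is only a sketch: isnull_elementwise is a hypothetical name, not part of pandas, and it assumes a top-level tuple is always meant as a collection rather than as a single label.

```python
import pandas as pd


def isnull_elementwise(obj):
    """Hypothetical helper: treat a top-level tuple like a list."""
    if isinstance(obj, tuple):
        # A list takes pd.isnull's array-like path, as in In [2] above.
        obj = list(obj)
    return pd.isnull(obj)


mask = isnull_elementwise((1, pd.NA, 3))
print(mask)
```

Other inputs, scalars included, are passed through to pd.isnull unchanged.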
Comment From: DeaMariaLeon
Oh, I see! Thank you for opening an issue. :)
Comment From: phofl
This is an edge case I think.
You can end up with tuples from a MultiIndex for example. In this scenario we want to treat the tuple as a single element, e.g.
df.drop(columns=(1, 2))
treats the tuple as a single element. I think this is similar here although it does not really look intuitive to me either.
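To make the drop case concrete, here is a minimal sketch with MultiIndex columns. The frame and its values are invented for illustration, and the behaviour described in the comments is what I'd expect on the pandas versions discussed in this thread.

```python
import pandas as pd

# Illustrative frame with two-level columns: (1, 1), (1, 2), (2, 1), ...
columns = pd.MultiIndex.from_product([[1, 2, 3], [1, 2]])
df = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=columns)

# Tuple: interpreted as one label, so only the (1, 2) column is removed.
after_tuple = df.drop(columns=(1, 2))

# List: interpreted as two first-level labels, so every column
# under 1 and under 2 is removed.
after_list = df.drop(columns=[1, 2])

print(list(after_tuple.columns))
print(list(after_list.columns))
```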
Comment From: rhshadrach
Interestingly in a list, tuples are treated as array-like:
import numpy as np
import pandas as pd

obj = [(1.0, 2.0), (1.0, np.nan), (np.nan, 2.0), (np.nan, np.nan)]
print(pd.isnull(obj))
# [[False False]
#  [False  True]
#  [ True False]
#  [ True  True]]
Comment From: jorisvandenbossche
> treating lists as array-like (returning a array-like of bools) and tuples as scalar (returning a bool)
As far as I remember, in the past we made this distinction (in certain places) because tuples can be labels, as Patrick mentioned.
But it's indeed a tricky situation, with easy confusion and corner cases (a quick search for "tuple list label" turns up quite a few related issues). See for example https://github.com/pandas-dev/pandas/issues/43978 for the drop case.
Another example in indexing where the two are distinguished and have different behaviour:
>>> s = pd.Series(range(6), index=pd.MultiIndex.from_product([[1, 2, 3], [1, 2]]))
>>> s.loc[(1, 2)] # tuple is a single label
1
>>> s.loc[[1, 2]] # list is an indexer (in this case for the first level of the MultiIndex)
1  1    0
   2    1
2  1    2
   2    3
dtype: int64