Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# classic, numpy example
ser = pd.Series([0.0, 1.0, 2.0, 3.0])
other = pd.Series([True, False, True, False])
ser.mask(~ser.isna(), other).dtype  # bool

# extension array example
ser = pd.Series([0.0, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())
ser.mask(~ser.isna(), other).dtype # TypeError: object cannot be converted to FloatingDtype

Issue Description

Hi,

When mask is used so that other completely replaces self, the classic numpy-backed case works as expected (the result has dtype bool). In the extension array case, however, a TypeError is raised.

A more complicated case is one in which other does not completely replace self, for example

ser = pd.Series([pd.NA, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())
ser.mask(~ser.isna(), other)

Here, in the numpy case, the result is cast to object, but in the extension array case the same TypeError is raised. I can see how this could be a little trickier, but the result could also be cast to object or (in this example) keep the pd.NA and have dtype pd.BooleanDtype().
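To make the comparison above easy to rerun on a given pandas version, here is a minimal sketch that records, for each combination, either the resulting dtype or the type of exception raised. The helper name describe_mask and the case labels are made up for illustration, and the numpy case with a missing value (using NaN) is an assumption about what the "numpy case" refers to. The outcomes are deliberately not hard-coded, since they depend on the installed version.

```python
import pandas as pd


def describe_mask(ser: pd.Series, other: pd.Series) -> str:
    """Return the dtype produced by ser.mask(...) or the exception type raised."""
    try:
        result = ser.mask(~ser.isna(), other)
    except Exception as exc:  # report the failure instead of propagating it
        return f"raised {type(exc).__name__}"
    return f"dtype={result.dtype}"


flags = [True, False, True, False]
cases = {
    "numpy, no missing": (pd.Series([0.0, 1.0, 2.0, 3.0]), pd.Series(flags)),
    "numpy, one missing": (pd.Series([float("nan"), 1.0, 2.0, 3.0]), pd.Series(flags)),
    "masked, no missing": (
        pd.Series([0.0, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype()),
        pd.Series(flags, dtype=pd.BooleanDtype()),
    ),
    "masked, one missing": (
        pd.Series([pd.NA, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype()),
        pd.Series(flags, dtype=pd.BooleanDtype()),
    ),
}

# Collect one outcome string per case and print them side by side.
outcomes = {name: describe_mask(ser, other) for name, (ser, other) in cases.items()}
for name, outcome in outcomes.items():
    print(f"{name:20s} {outcome}")
```

On pandas 1.5.2 the two "masked" rows would be expected to report the TypeError described in this issue; other versions may behave differently.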

Expected Behavior

ser = pd.Series([0.0, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())
ser.mask(~ser.isna(), other) # would expect to return pd.Series([True, False, True, False], dtype=pd.BooleanDtype()), not a TypeError
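Until the extension array path handles this directly, one possible workaround is to route the operation through object dtype and cast the result afterwards. This is only a sketch under that assumption, not something pandas documents or recommends; mask_via_object is a hypothetical helper name. The final cast to "boolean" relies on the condition being ~ser.isna(), so every value kept from ser is missing.

```python
import pandas as pd


def mask_via_object(ser: pd.Series, other: pd.Series, target_dtype: str) -> pd.Series:
    """Hypothetical helper: mask through object dtype, then cast to target_dtype."""
    cond = ~ser.isna()  # plain boolean Series; True where ser holds a value
    # Object dtype can hold floats, booleans and pd.NA side by side.
    mixed = ser.astype(object).mask(cond, other.astype(object))
    return mixed.astype(target_dtype)


other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())

# Case 1: no missing values, so other replaces every element.
full = pd.Series([0.0, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
replaced = mask_via_object(full, other, "boolean")

# Case 2: the missing first element is kept, the rest come from other.
with_na = pd.Series([pd.NA, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
partly_replaced = mask_via_object(with_na, other, "boolean")

print(replaced)
print(partly_replaced)
```

The trade-off is an extra pass through object dtype, which is slower and drops the masked-array representation until the final cast.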

Installed Versions

INSTALLED VERSIONS
------------------
commit : 8dab54d6573f7186ff0c3b6364d5e4dd635ff3e7
python : 3.10.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 165 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : English_United Kingdom.1252
pandas : 1.5.2
numpy : 1.23.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 22.2.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : 1.0.2
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.6.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.3
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : 1.4.39
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

Comment From: phofl

Hi, thanks for your report. This works on main, but it keeps the float dtype, which is what I would expect here. It probably needs tests, though.

Comment From: ShashwatAgrawal20

Hey @phofl, I would like to work on this but don't know where to start. It would be a great help if you could point me to where to begin.

Comment From: ShashwatAgrawal20

take

Comment From: ShashwatAgrawal20

ser = pd.Series([pd.NA, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())
ser.mask(~ser.isna(), other)

The above probably raises an error because mask expects other to be a Series or scalar with the same dtype as the original Series.

The error might be avoided by casting ser and other to a common dtype before calling mask:

import pandas as pd

ser = pd.Series([0.0, 1.0, 2.0, 3.0], dtype=pd.Float64Dtype())
other = pd.Series([True, False, True, False], dtype=pd.BooleanDtype())
# Cast ser to other's dtype (boolean) so both operands share a dtype before masking.
ser = ser.astype(other.dtype)
result = ser.mask(~ser.isna(), other)
print(result)

Output

Pandas BUG: TypeError when mixing extension array dtypes in mask

Comment From: luke396

take