Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [ ] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from pandas.testing import assert_series_equal

# non-empty
df = pd.DataFrame([1], columns=["col1"])
res1 = df.apply(lambda r: r.col1, axis=1)  # series
res2 = df.apply(lambda r: int(r.col1), axis=1)  # series
assert_series_equal(res1, res2)
# -> OK!

# empty
df_empty = pd.DataFrame(columns=["col1"])
res3 = df_empty.apply(lambda r: r.col1, axis=1)  # empty series
res4 = df_empty.apply(lambda r: int(r.col1), axis=1)  # empty dataframe
assert_series_equal(res3, res4)
# -> AssertionError: Series Expected type <class 'pandas.core.series.Series'>,
#    found <class 'pandas.core.frame.DataFrame'> instead

# Some use-case where you need to create a new series
pd.Series(res3)  # Works as expected
pd.Series(res4)  # Raises
# -> ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Issue Description

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../python3.9/site-packages/pandas/core/series.py", line 386, in __init__
    if is_empty_data(data) and dtype is None:
  File "/../python3.9/site-packages/pandas/core/construction.py", line 877, in is_empty_data
    is_simple_empty = is_list_like_without_dtype and not data
  File "/.../python3.9/site-packages/pandas/core/generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Expected Behavior

I would expect the result type to be consistent, i.e. apply should always return a Series here, whether or not the DataFrame is empty.

Installed Versions

commit : 8dab54d6573f7186ff0c3b6364d5e4dd635ff3e7
python : 3.9.6.final.0
python-bits : 64
OS : Darwin
OS-release : 21.6.0
Version : Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.5.2
numpy : 1.23.4
pytz : 2022.7
dateutil : 2.8.1
setuptools : 56.0.0
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : 2.9.1
jinja2 : 3.1.2
IPython : None
pandas_datareader : None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.11.0
gcsfs : None
matplotlib : 3.6.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : 2022.6

Comment From: rhshadrach

Thanks for the report. When the DataFrame is empty, this code path tries to work out what shape to return by calling the passed function (your lambda, in this case) and inspecting the result. Here, int(r.col1) raises during that call. With no information about what the function does to its input, we have no way of determining what shape (e.g. Series vs DataFrame) to return, so in that scenario we return the input.

I think this is a duplicate of #47959.

Comment From: adriantre

I see, that makes sense. The easiest workaround is then to always check for empty frames and return the desired value/type early. I find that we do this a lot, but it may be too much to ask of the general user.

Could this be a kwarg to the apply function? df.apply(my_func, axis=1, result_when_empty=pd.Series(...))

When result_when_empty is a pd.DataFrame, it should also allow templating the columns of the result.
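
For reference, a minimal sketch of the early-return workaround described above, written as a plain wrapper around DataFrame.apply. The helper name apply_rows is hypothetical, and result_when_empty is only the name proposed in this comment; neither is part of the pandas API.

```python
import pandas as pd


def apply_rows(df, func, result_when_empty=None):
    """Row-wise apply that returns an explicit result for frames with no rows.

    Hypothetical helper: `apply_rows` and `result_when_empty` are
    illustrative names, not pandas API.
    """
    if len(df) == 0:
        if result_when_empty is None:
            # Assumed default: an empty object Series on the frame's index.
            return pd.Series(index=df.index, dtype="object")
        # Copy so callers can reuse the same template safely.
        return result_when_empty.copy()
    return df.apply(func, axis=1)


# Empty frame: the caller decides the type, so no probing of func is needed.
df_empty = pd.DataFrame(columns=["col1"])
res_empty = apply_rows(
    df_empty, lambda r: int(r.col1), result_when_empty=pd.Series(dtype="int64")
)

# Non-empty frame: behaves like df.apply(func, axis=1).
df = pd.DataFrame([1], columns=["col1"])
res = apply_rows(df, lambda r: int(r.col1))
```

With this wrapper the function is never called on an empty frame, so the return type no longer depends on whether the function happens to survive pandas' probing call.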

Comment From: rhshadrach

I added some thoughts on this in https://github.com/pandas-dev/pandas/issues/47959#issuecomment-1382745734; closing here to keep the discussion consolidated. Please join the discussion there!