Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [1]: import pandas as pd

In [5]: df = pd.DataFrame(index=pd.Index(["a", "b", "c"], name="custom name"))

In [6]: df.to_parquet("abc")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 df.to_parquet("abc")

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/frame.py:2873, in DataFrame.to_parquet(self, path, engine, compression, index, partition_cols, storage_options, **kwargs)
   2786 """
   2787 Write a DataFrame to the binary parquet format.
   2788 
   (...)
   2869 >>> content = f.read()
   2870 """
   2871 from pandas.io.parquet import to_parquet
-> 2873 return to_parquet(
   2874     self,
   2875     path,
   2876     engine,
   2877     compression=compression,
   2878     index=index,
   2879     partition_cols=partition_cols,
   2880     storage_options=storage_options,
   2881     **kwargs,
   2882 )

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/io/parquet.py:434, in to_parquet(df, path, engine, compression, index, storage_options, partition_cols, **kwargs)
    430 impl = get_engine(engine)
    432 path_or_buf: FilePath | WriteBuffer[bytes] = io.BytesIO() if path is None else path
--> 434 impl.write(
    435     df,
    436     path_or_buf,
    437     compression=compression,
    438     index=index,
    439     partition_cols=partition_cols,
    440     storage_options=storage_options,
    441     **kwargs,
    442 )
    444 if path is None:
    445     assert isinstance(path_or_buf, io.BytesIO)

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/io/parquet.py:176, in PyArrowImpl.write(self, df, path, compression, index, storage_options, partition_cols, **kwargs)
    166 def write(
    167     self,
    168     df: DataFrame,
   (...)
    174     **kwargs,
    175 ) -> None:
--> 176     self.validate_dataframe(df)
    178     from_pandas_kwargs: dict[str, Any] = {"schema": kwargs.pop("schema", None)}
    179     if index is not None:

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/io/parquet.py:138, in BaseImpl.validate_dataframe(df)
    136 else:
    137     if df.columns.inferred_type not in {"string", "empty"}:
--> 138         raise ValueError("parquet must have string column names")
    140 # index level names must be strings
    141 valid_names = all(
    142     isinstance(name, str) for name in df.index.names if name is not None
    143 )

ValueError: parquet must have string column names

Issue Description

There seems to be an error when we try to write an empty dataframe with no columns to a parquet file. However passing columns=[] to DataFrame constructor works. Is this an intended behavior?

Expected Behavior

Idealy pd.DataFrame(index=pd.Index(["a", "b", "c"], name="custom name")) has to work. But if pd.DataFrame(index=pd.Index(["a", "b", "c"], name="custom name", columns=[])) is the right way to do it starting pandas 2.0, please let me know.

Installed Versions

In [2]: pd.show_versions() /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.") INSTALLED VERSIONS ------------------ commit : c2a7f1ae753737e589617ebaaff673070036d653 python : 3.10.9.final.0 python-bits : 64 OS : Linux OS-release : 4.15.0-76-generic Version : #86-Ubuntu SMP Fri Jan 17 17:24:28 UTC 2020 machine : x86_64 processor : x86_64 byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8 pandas : 2.0.0rc1 numpy : 1.23.5 pytz : 2022.7.1 dateutil : 2.8.2 setuptools : 67.6.0 pip : 23.0.1 Cython : 0.29.33 pytest : 7.2.2 hypothesis : 6.70.0 sphinx : 5.3.0 blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : None jinja2 : 3.1.2 IPython : 8.11.0 pandas_datareader: None bs4 : 4.11.2 bottleneck : None brotli : fastparquet : None fsspec : 2023.3.0 gcsfs : None matplotlib : None numba : 0.56.4 numexpr : None odfpy : None openpyxl : None pandas_gbq : None pyarrow : 10.0.1 pyreadstat : None pyxlsb : None s3fs : 2023.3.0 scipy : 1.10.1 snappy : sqlalchemy : 1.4.46 tables : None tabulate : 0.9.0 xarray : None xlrd : None zstandard : None tzdata : None qtpy : None pyqt5 : None