Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
fn = "some_file.xlsx"
pd.ExcelFile(fn)

Issue Description

Simply calling pd.ExcelFile causes some, but not all, sheet names to be printed to the console. For example, I have a .xlsx file with 12 sheets. When just calling pd.ExcelFile(fn) and nothing else, it prints the following.

Sheet 11
Sheet 2
Sheet 4
Sheet 3

It seems like it's debug output but there's no verbose option with ExcelFile and it's odd that only a handful of sheet names are printed and not even in the right order.

Expected Behavior

There shouldn't be any output.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 2e218d10984e9919f0296931d92ea851c6a6faf5
python : 3.9.0.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 12, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_United States.1252
pandas : 1.5.3
numpy : 1.23.4
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 49.2.1
pip : 21.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.9.4
jinja2 : None
IPython : None
pandas_datareader : None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.0
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

Comment From: Mani-mk-mk

Thanks for the information @gitgitwhat . I've checked the issue on the same version of pandas as mentioned by you, but I couldn't reproduce the error. Can you please provide more details or clarification on the issue?

Pandas BUG: Simply calling ExcelFile causes some sheet names to be printed

As you can see above, an object is being displayed. I also attempted to reproduce this in a .py file, but I only got an object back when printing the result. I could be wrong, but more information or clarification may be needed to fully understand the issue.

Comment From: gitgitwhat

There seems to be something about the workbooks themselves, as I can't reproduce it with a brand new file either. Unfortunately I can't share the files in question, but they are complicated workbooks with formulas, conditional formatting, comments, sticky rows and columns, different fonts, sizes and colors, and other features.

But the same sheet names get printed every time and in the same order. I'll work on trying to figure out what's different between the sheets that get printed and the ones that don't. I'm guessing there is something that the parser is encountering in those particular sheets that it doesn't like.

Comment From: Mani-mk-mk

No worries! I hope you're able to find a solution. Let me know if there's anything I can do to help

Comment From: asishm

it could be openpyxl doing it - can you try running?

from openpyxl import load_workbook
book = load_workbook('some_file.xlsx')

and see if it reproduces?
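If it helps to pin down exactly which call is doing the printing, here is a minimal sketch that captures whatever a loader writes to stdout. The helper name call_and_capture is made up for illustration, and this assumes the sheet names are emitted through Python-level print / sys.stdout rather than from native code.

```python
import contextlib
import io


def call_and_capture(func, *args, **kwargs):
    """Call func and return (result, text written to sys.stdout during the call).

    Hypothetical helper: it only sees output that goes through Python's
    sys.stdout, not output written directly to the OS-level file descriptor.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


# Example usage (assumes openpyxl is installed and the file exists):
#
#   from openpyxl import load_workbook
#   book, printed = call_and_capture(load_workbook, "some_file.xlsx")
#   print(repr(printed))  # non-empty text means load_workbook itself printed
```

Running the same helper around pd.ExcelFile and around load_workbook and comparing the captured text should show whether the two produce the same output.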

Comment From: gitgitwhat

Yes. It does.

Comment From: asishm

then nothing pandas can really do about it - you should report it to openpyxl
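Until it is fixed upstream, one possible workaround is to silence stdout around the call. This is only a sketch: silenced_stdout is a hypothetical name, and it assumes the output comes from ordinary Python print calls, so it would not hide anything written from native code or to stderr.

```python
import contextlib
import os


@contextlib.contextmanager
def silenced_stdout():
    """Temporarily send Python-level stdout to the null device."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull):
            yield


# Example usage (assumes pandas and openpyxl are installed):
#
#   import pandas as pd
#   with silenced_stdout():
#       xl = pd.ExcelFile("some_file.xlsx")
```

Note that this hides every print made inside the block, not just the stray sheet names, so it is best kept as narrow as possible.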

Comment From: phofl

Closing as upstream issue, thx for figuring this out