Pandas version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of pandas.
- [X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
fn = "some_file.xlsx"
pd.ExcelFile(fn)
Issue Description
Simply calling pd.ExcelFile causes some, but not all, sheet names to be printed to the console. For example, I have a .xlsx file with 12 sheets. When I call pd.ExcelFile(fn) and nothing else, it prints the following:
Sheet 11
Sheet 2
Sheet 4
Sheet 3
It looks like debug output, but ExcelFile has no verbose option, and it's odd that only a handful of sheet names are printed, and not even in the right order.
Expected Behavior
There shouldn't be any output.
Installed Versions
Comment From: Mani-mk-mk
Thanks for the information, @gitgitwhat. I've checked the issue on the same version of pandas that you mentioned, but I couldn't reproduce the error. Can you please provide more details or clarification on the issue?
As you can see above, an object is being displayed. I also tried to reproduce this in a .py file, but I only got an object when printing. I could be wrong, but more information may be needed to fully understand the issue.
Comment From: gitgitwhat
There seems to be something about the workbooks themselves, as I can't reproduce it with a brand new file either. Unfortunately I can't share the files in question, but they are complicated workbooks with formulas, conditional formatting, comments, sticky rows and columns, different fonts, sizes and colors, and other features.
But the same sheet names get printed every time and in the same order. I'll work on trying to figure out what's different between the sheets that get printed and the ones that don't. I'm guessing there is something that the parser is encountering in those particular sheets that it doesn't like.
Comment From: Mani-mk-mk
No worries! I hope you're able to find a solution. Let me know if there's anything I can do to help.
Comment From: asishm
it could be openpyxl doing it - can you try running?
from openpyxl import load_workbook
book = load_workbook('some_file.xlsx')
and see if it reproduces?
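If it helps to pin down which module is doing the printing, here is a rough sketch that captures stdout during the call and records the caller of each write. The names TracingStream, run_and_trace and noisy_loader are made up for illustration and are not part of pandas or openpyxl. It only sees output that goes through Python's sys.stdout, so anything written directly to the file descriptor by a C extension would be missed.

```python
import contextlib
import io
import traceback


class TracingStream(io.StringIO):
    """In-memory stream that records where each non-blank write came from."""

    def __init__(self):
        super().__init__()
        self.origins = []  # (filename, line number, function name) tuples

    def write(self, text):
        if text.strip():
            # The last two frames are the code that printed and this method;
            # index 0 is the code that printed.
            caller = traceback.extract_stack(limit=2)[0]
            self.origins.append((caller.filename, caller.lineno, caller.name))
        return super().write(text)


def run_and_trace(func, *args, **kwargs):
    """Call func and return (result, captured stdout text, write origins)."""
    stream = TracingStream()
    with contextlib.redirect_stdout(stream):
        result = func(*args, **kwargs)
    return result, stream.getvalue(), stream.origins


# Hypothetical stand-in for a loader that prints while it runs.
def noisy_loader():
    print("Sheet 11")
    return "workbook"


result, text, origins = run_and_trace(noisy_loader)

# With the real workbook, the call would look something like:
#   from openpyxl import load_workbook
#   book, text, origins = run_and_trace(load_workbook, "some_file.xlsx")
```

The filenames in origins should then show whether the print originates inside the openpyxl package or somewhere else.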
Comment From: gitgitwhat
Yes. It does.
Comment From: asishm
then nothing pandas can really do about it - you should report it to openpyxl
Comment From: phofl
Closing as upstream issue, thx for figuring this out