Pandas version checks
-
[X] I have checked that this issue has not already been reported.
-
[X] I have confirmed this bug exists on the latest version of pandas.
-
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
wb = load_workbook(self._get_report_filename_template())
wb.create_sheet(self.report_range_sheet)
with pd.ExcelWriter(file, engine='openpyxl') as writer:
writer.book = wb # This assignment causes AttributeError: can't set attribute
Issue Description
Assignment to writer.book causes an exception:
Traceback (most recent call last):
File "/app/py_code/Bussiness/Reports/PrehledovyReport.py", line 108, in _export_data
writer.book = wb
AttributeError: can't set attribute
downgrading to 1.4.4 resolves the issue
Expected Behavior
The assignment should work - injecting externally created workbook to the writer
Installed Versions
pd.show_versions() returns None ... presenting the requirements.txt instead
Comment From: phofl
Hi, thanks for your report.
All ExcelWriter
attributes were considered private prior to 1.5, but this was not enforced.
We made some attributes public in 1.5, but put some restrictions on what you can do with them. For example, you can not set a book.
cc @rhshadrach to double check
Comment From: vranoch
Hi, thanks for your report.
All
ExcelWriter
attributes were considered private prior to 1.5, but this was not enforced.We made some attributes public in 1.5, but put some restrictions on what you can do with them. For example, you can not set a book.
cc @rhshadrach to double check
Hi, Thanks for Your reply, but in that case is there any legal way how to "extend" existing workbook? Without anything else, .to_excel() just overwrites given file with completely new content. The used way was to load existing workbook, create the excel writer and inject previosly loaded workbook in it. In this case all existing content of the file was preserved and new content was added.
How to achieve this functionality?
Comment From: phofl
Could you make your example copy and passable?
You get a Workbook object from the respective engine, e.g. openpyxl. This object has methods again that you can use
Comment From: vranoch
The following code should be reproducible (with any existing Excel file):
` import pandas as pd from openpyxl import load_workbook
wb = load_workbook("any_existing_nonempty_workbook.xlsx") wb.create_sheet("range") with pd.ExcelWriter(file, engine='openpyxl') as writer: writer.book = wb `
Comment From: tehunter
I use this feature too to extend existing workbooks:
# pandas 1.4.x; haven't tried it in 1.5 yet
from openpyxl import load_workbook
if (template_path and os.path.normpath(template_path) != os.path.normpath(path)):
self.book = load_workbook(template_path)
self.handles.handle.seek(0)
self.sheets = {name: self.book[name] for name in self.book.sheetnames}
It's similar to using the r+
mode, but it allows you to save to a different file than the template you are loading from.
Comment From: rhshadrach
Thanks for the report!
We made some attributes public in 1.5, but put some restrictions on what you can do with them. For example, you can not set a book.
In implementing this, we deprecated certain attributes because while they were documented as not being public, it was known that many attributes were being used by users. I think that should be the approach here - revert this in 1.5.1 allowing book
to be a settable property.
However the issue with assigning a new book is that the book is stateful. Currently openpyxl doesn't use this state when writing as far as I can tell, but xlsxwriter does.
from xlsxwriter import Workbook
df = pd.DataFrame({'a': [1, 1, 2]})
df2 = pd.DataFrame({'a': [3, 3, 4]})
df.to_excel('test.xlsx', engine='xlsxwriter')
wb = Workbook('test.xlsx')
with pd.ExcelWriter('test2.xlsx', engine='xlsxwriter') as writer:
# This is the equivalent of `writer.book = ...` in pandas 1.4.4
writer._book = wb
df2.to_excel(writer)
This produces test2.xlsx, but it is an empty file. It also overwrites Sheet 1 rather than adding a new sheet to test.xlsx (is this the correct behavior?). This is because unlike openpyxl, xlsxwriter uses the filename within the book when writing. It seems likely to me (but I'm not certain) that we can remove all reliance on state between method calls - i.e. any needed state within a method would be taken directly from the book or the arguments passed to ExcelWriter
. I believe this would enable the setting of book safely, but maybe there are still cases of bad behavior.
Comment From: vranoch
Hi Richard,
generally, the goal is to use existing Excel file as a template and typically add new sheets to it or write new data to existing sheets.
I understand that assigning the whole workbook anytime may be dangerous. I believe that either exposing specific "setter" method that woud correctly handle setting the object with all relevant relations - or even allowing to specify the "template" workbook only within the initialization of the writer (instead of internally setting a new workbook) would do the job.
Thanks Jan
Comment From: phofl
Imo this is something that should be handled on the engine level, not on our side. For example:
book = writer.book
book.add_sheets(...)
This would move the functionality to openpyxl or another engine
Comment From: vranoch
Yes, why not, but is there any way how to initialize the engine (writer.book) with an EXISTING workbook with some content, not a newly created empty one?
Comment From: phofl
What you can do is specifying mode="a"
when creating the ExcelWriter and use an existing Workbook as source, this should enable you to append to an existing workbook.
Comment From: vranoch
So I should first make a copy of the original "template" file and then put this file to the writer with mode="a"?
Comment From: rhshadrach
@phofl - you agree with allowing book to be settable for the time being? My thinking is that this was an unintended API change that should have gone through deprecation.
@vranoch - can you confirm if the workaround of using mode="a"
satisfies your use case?
Comment From: phofl
I would lean to not rolling this back, since they were private and hence we can change them if we want.
If you feel strongly about it, I would be ok with rolling back and adding an immediate deprecation warning, that we can remove in 2.0
Comment From: vranoch
Hi Richard,
Yes, the mode='a'
seems to work:
`def _export_data(self, data, file, date_from, date_to):
with open(self._get_report_filename_template(), "rb") as report_template_file:
self._copy_filelike_to_filelike(report_template_file, file)
with pd.ExcelWriter(file, mode='a') as writer:
wb = writer.book
try:
del wb[self.report_data_sheet]
except Exception:
pass
try:
del wb[self.report_range_sheet]
except Exception:
pass
wb.create_sheet(self.report_range_sheet)
data.to_excel(writer, sheet_name=self.report_data_sheet, startrow=data_row_offset, startcol=data_col_offset, index=False)`
Thanks
Comment From: twoertwein
If we add a setter for book
, it would be good to emit a warning. I think we had multiple issues where people overwrote book and then were surprised that it corrupted their excel file (because pandas uses its internal file handle that it passed to the excel engine in the first place).
Comment From: rhshadrach
I would lean to not rolling this back, since they were private and hence we can change them if we want.
If you feel strongly about it, I would be ok with rolling back and adding an immediate deprecation warning, that we can remove in 2.0
Thanks @phofl. In #45795 I deprecated other attributes because we were aware of their usage; but I wasn't aware users were setting this attribute. Had I been, this would have been deprecated as well. As such, we broken user code, so it seems best to me to revert and deprecate. I've opened #48943.
Comment From: MarcoGorelli
Hey @vranoch
Thanks for having reported this - there's now a release candidate for pandas 2.0.0, which you can install with
mamba install -c conda-forge/label/pandas_rc pandas==2.0.0rc0
or
python -m pip install --upgrade --pre pandas==2.0.0rc0
If you try it out and report bugs, then we can fix them before the final 2.0.0 release