I noticed xlsx files I was exporting from Pandas with to_excel weren't opening in Numbers and Preview on my Mac. They did, however, open in Excel and Google Drive.

I created a new virtualenv and installed Pandas and openpyxl with pip:

numpy==1.8.1
openpyxl==1.8.6
pandas==0.13.1
python-dateutil==2.2
pytz==2014.2
six==1.6.1
wsgiref==0.1.2

Running this code creates a xlsx file that seems to be ok (opens in Numbers anyway):

>>> import pandas as pd
>>> df = pd.DataFrame.from_records([[1,2],[3,4]], columns=["a","b"])
>>> df.to_excel("without_lxml.xlsx")

However, this same code was creating ill-formed xlsx files in my project. I narrowed the culprit down to lxml. I installed lxml in this fresh venv so pip freeze gives:

lxml==3.3.5
numpy==1.8.1
openpyxl==1.8.6
pandas==0.13.1
python-dateutil==2.2
pytz==2014.2
six==1.6.1
wsgiref==0.1.2

At that point, I ran the same code and the new file (with_lxml.xlsx) doesn't open in Numbers or Preview.

>>> import pandas as pd
>>> df = pd.DataFrame.from_records([[1,2],[3,4]], columns=["a","b"])
>>> df.to_excel("with_lxml.xlsx")

You can see the Excel files created: - http://stiglerimages.s3.amazonaws.com/dropzone/without_lxml.xlsx - http://stiglerimages.s3.amazonaws.com/dropzone/with_lxml.xlsx

These files can be unzipped and inspected for possible explanations.

Comment From: jreback

hmm...can you try with lxml 3.2.1 (its what we test with)

Comment From: tdhopper

Seems to work with 3.2.1, however to_excel generates this warning:

/Users/tdhopper/.virtualenvs/temp/lib/python2.7/site-packages/openpyxl/__init__.py:31: UserWarning: The installed version of lxml is too old to be used with openpyxl
  warnings.warn("The installed version of lxml is too old to be used with openpyxl")

Comment From: cpcloud

@jreback how are the wheels working these days? is there a server that i can log into and build things? i'm happy to set this up if you can't do it.

Comment From: cpcloud

let me back up a bit....should we build with a newer version of lxml

Comment From: jreback

easy enough to do

easy enough....you can certainly create them...i'll have to send you login info to upload separately

see pandas/ci/speedpack it works out pretty well....

Comment From: cpcloud

ok so same as before ...

Comment From: jreback

yep...then they live here: http://pandas.pydata.org/pandas-build/dev/wheels/ (which is served by the pandas.pydata.org server)

Comment From: jreback

@cpcloud I emailed you the machine/pw...

Comment From: jreback

@tdhopper ok...seems that something changed in newer version of lxml....want to debug this and do a pull-request?

Comment From: tdhopper

Don't have the time or understanding of the internals to work on it at the moment. I'll see if I can take a look over the next few weeks if it doesn't get fix before that.

Comment From: jreback

@tdhopper that's fine; we are going to be release 0.14.0 shortly anyhow. marking for 0.15.

Comment From: jmcnamara

I don't know the full details here but from my reading of the OpenPyXL mailing list I think what is happening is: - OpenPyXL recently added support for lxml as an additional backend for xml handling. - The OpenPyXL files produced with lxml installed have a different namespace (or a different arrangement of namespaces). In the files you attached the only real difference is that the relationship namespace is attached to the sheet element rather than the workbook element. - The result of this is that the files produced aren't compatible with third-party applications such as Numbers.

There was a similar issue reported on the OpenPyXL mailing list where the reply was that the file produced was a valid Excel OpenXML file according to the specification.

So, this isn't an issue in Pandas. You could raise it as an issue with the OpenPyXL devs but it looks like for now it is the accepted behaviour.

As a workaround you could use xlsxwriter as the Excel output engine:

df.to_excel("with_lxml.xlsx", engine='xlsxwriter')

Comment From: jreback

@cpcloud

will leave this open as a marker for building the wheel for newer version of lxml

Comment From: cpcloud

cool

Comment From: cpcloud

@jreback i'm building the local wheels now .... had to reinstall os awhile back and now i'm paying the pandaculous price

Comment From: tdhopper

Thanks @jmcnamara