Problem description
I have several msgpack-encoded DataFrames that were written by old versions of pandas. I have seen the warnings about this format. However, there seem to be tests that ensure files from old versions can still be loaded, and on current master those appear to cover pandas 0.16 as well. The file below was created with 0.16 and cannot be loaded in 0.20.3.
This is what happens when trying to load the file:
In [4]: pd.read_msgpack('/home/languitar/data/tobi-dataset-post-processed/1/armcontrol-features-Combined+hash.msg')
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-4-8e36ed25a239> in <module>()
----> 1 pd.read_msgpack('/home/languitar/data/tobi-dataset-post-processed/1/armcontrol-features-Combined+hash.msg')
/home/languitar/miniconda2/envs/monitoring/lib/python2.7/site-packages/pandas/io/packers.pyc in read_msgpack(path_or_buf, encoding, iterator, **kwargs)
201 if exists:
202 with open(path_or_buf, 'rb') as fh:
--> 203 return read(fh)
204
205 # treat as a binary-like
/home/languitar/miniconda2/envs/monitoring/lib/python2.7/site-packages/pandas/io/packers.pyc in read(fh)
186
187 def read(fh):
--> 188 l = list(unpack(fh, encoding=encoding, **kwargs))
189 if len(l) == 1:
190 return l[0]
pandas/io/msgpack/_unpacker.pyx in pandas.io.msgpack._unpacker.Unpacker.__next__ (pandas/io/msgpack/_unpacker.cpp:5618)()
pandas/io/msgpack/_unpacker.pyx in pandas.io.msgpack._unpacker.Unpacker._unpack (pandas/io/msgpack/_unpacker.cpp:4602)()
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf8 in position 6: invalid start byte
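The final error can be reproduced in isolation, which may help narrow down where it comes from. The sketch below is only an illustration: the byte string `raw` is hypothetical (the real file contents are not shown here), and the helper `try_decode` is made up for this example. It shows that 0xf8 can never start a valid UTF-8 sequence, while a single-byte codec such as latin-1 accepts any byte. Whether the offending bytes in the file are really text, rather than binary array data being decoded by mistake, is not known.

```python
# Hypothetical payload: six ASCII bytes followed by 0xf8, so the bad byte
# sits at position 6, the same offset reported in the traceback.
raw = b"column\xf8"


def try_decode(data, encoding):
    """Return the decoded string, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


utf8_result = try_decode(raw, "utf-8")      # fails: 0xf8 is not a valid start byte
latin1_result = try_decode(raw, "latin-1")  # succeeds: latin-1 maps every byte

print(repr(utf8_result))
print(repr(latin1_result))
```

The traceback shows that `read_msgpack` takes an `encoding` argument, so trying a different value there is one possible experiment, but it is untested against this file and may simply mask a deeper incompatibility.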
Expected Output
It should load the data without an exception.
Output of pd.show_versions() for the creating pandas version
Output of pd.show_versions() for the version trying to load the file
Comment From: jreback
See http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#changes-to-msgpack. This is a use-at-your-own-risk format, so I would gladly take a patch if you can see what the problem is, but there are no guarantees on this.
Comment From: languitar
The table there says that files packed with pre-0.17 / Python 2 should be readable by any version. This doesn't seem to be true in this case?
Comment From: jreback
As I said, you are on your own here. There were never any guarantees on this format.
Comment From: languitar
Sure, but the table in the new documentation implies that this version should work. Maybe the warning could stress that the table is only a rough guide.