Code Sample, a copy-pastable example if possible
setup.py
from sys import version_info
import pandas as pd
df = pd.DataFrame()
df.to_pickle('py{}.gz'.format(version_info.major))
test.py
import pandas as pd
pd.read_pickle('py2.gz')
pd.read_pickle('py3.gz')
Test results
kris@home ~/projects/tmp $ ll
total 32
drwxrwxr-x 2 kris kris 4096 Oct 30 20:05 ./
drwxrwxr-x 17 kris kris 4096 Oct 30 19:11 ../
-rw-rw-r-- 1 kris kris 121 Oct 30 20:04 setup.py
-rw-rw-r-- 1 kris kris 70 Oct 30 20:05 test.py
kris@home ~/projects/tmp $ workon pandas2.7
(pandas2.7) kris@home ~/projects/tmp $ python setup.py
(pandas2.7) kris@home ~/projects/tmp $ ll
total 44
drwxrwxr-x 2 kris kris 4096 Oct 30 20:05 ./
drwxrwxr-x 17 kris kris 4096 Oct 30 19:11 ../
-rw-rw-r-- 1 kris kris 337 Oct 30 20:05 py2.gz
-rw-rw-r-- 1 kris kris 121 Oct 30 20:04 setup.py
-rw-rw-r-- 1 kris kris 70 Oct 30 20:05 test.py
(pandas2.7) kris@home ~/projects/tmp $ workon pandas3.6
(pandas3.6) kris@home ~/projects/tmp $ python setup.py
(pandas3.6) kris@home ~/projects/tmp $ ll
total 56
drwxrwxr-x 2 kris kris 4096 Oct 30 20:06 ./
drwxrwxr-x 17 kris kris 4096 Oct 30 19:11 ../
-rw-rw-r-- 1 kris kris 337 Oct 30 20:05 py2.gz
-rw-rw-r-- 1 kris kris 329 Oct 30 20:06 py3.gz
-rw-rw-r-- 1 kris kris 121 Oct 30 20:04 setup.py
-rw-rw-r-- 1 kris kris 70 Oct 30 20:05 test.py
(pandas3.6) kris@home ~/projects/tmp $ python test.py
(pandas3.6) kris@home ~/projects/tmp $ workon pandas2.7
(pandas2.7) kris@home ~/projects/tmp $ python test.py
Traceback (most recent call last):
File "test.py", line 3, in <module>
pd.read_pickle('py3.gz')
File "/home/kris/projects/pandas/pandas/io/pickle.py", line 110, in read_pickle
return try_read(path)
File "/home/kris/projects/pandas/pandas/io/pickle.py", line 108, in try_read
lambda f: pc.load(f, encoding=encoding, compat=True))
File "/home/kris/projects/pandas/pandas/io/pickle.py", line 84, in read_wrapper
return func(f)
File "/home/kris/projects/pandas/pandas/io/pickle.py", line 108, in <lambda>
lambda f: pc.load(f, encoding=encoding, compat=True))
File "/home/kris/projects/pandas/pandas/compat/pickle_compat.py", line 194, in load
return up.load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 892, in load_proto
raise ValueError, "unsupported pickle protocol: %d" % proto
ValueError: unsupported pickle protocol: 4
Problem description
pd.read_pickle in pandas for Python 2.7 can't load objects that were pickled using obj.to_pickle in pandas for Python 3.
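The "unsupported pickle protocol: 4" error in the traceback comes from the protocol number recorded at the start of the pickle stream. The following is a minimal sketch, not part of the original report, for checking which protocol a pickle declares without unpickling it. It assumes Python 3.4+ on the writing side; `declared_protocol` and `payload` are hypothetical names, and the payload is a plain dict standing in for the pickled DataFrame.

```python
import gzip
import pickle


def declared_protocol(raw):
    """Return the protocol number in a pickle's header, or None if absent.

    Pickles written with protocol 2 or higher begin with the PROTO opcode
    (byte 0x80) followed by one byte holding the protocol number.
    Protocols 0 and 1 have no such header.
    """
    if len(raw) >= 2 and raw[:1] == b"\x80":
        return ord(raw[1:2])
    return None


# Stand-in payload; a real check would read the bytes of py2.gz / py3.gz.
payload = {"columns": [], "index": []}

for proto in (2, 4):
    # Mimic the gzip-compressed files from the report, entirely in memory.
    compressed = gzip.compress(pickle.dumps(payload, protocol=proto))
    header = gzip.decompress(compressed)
    print(proto, declared_protocol(header))
```

A file that reports 3 or higher here cannot be loaded by the Python 2.7 unpickler, whose highest supported protocol is 2.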
Expected Output
pd.read_pickle under Python 2.7 should also be able to load objects pickled by the Python 3 version of pandas, not only those pickled under Python 2.7.
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.10.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.21.0rc1+67.g8449ffd.dirty
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: 1.6.5
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Comment From: jreback
By default this doesn't work; see SO: https://stackoverflow.com/questions/29587179/load-pickle-filecomes-from-python3-in-python2
However, in pandas 0.21.0 you can pass a protocol parameter to to_pickle to allow this:
(pandas) bash-3.2$ ipython
Python 3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.0.0 -- An enhanced Interactive Python. Type '?' for help.
In [3]: df.to_pickle?
Signature: df.to_pickle(path, compression='infer', protocol=4)
Docstring:
Pickle (serialize) object to input file path.
Parameters
----------
path : string
File path
compression : {'infer', 'gzip', 'bz2', 'xz', None}, default 'infer'
a string representing the compression to use in the output file
.. versionadded:: 0.20.0
protocol : int
Int which indicates which protocol should be used by the pickler,
default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible
values for this parameter depend on the version of Python. For
Python 2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a
valid value. For Python >= 3.4, 4 is a valid value.A negative value
for the protocol parameter is equivalent to setting its value to
HIGHEST_PROTOCOL.
.. [1] https://docs.python.org/3/library/pickle.html
.. versionadded:: 0.21.0
File: ~/miniconda3/envs/pandas/lib/python3.6/site-packages/pandas/core/generic.py
Type: method
In [4]: df.to_pickle('foo.pkl', protocol=2)
(py2.7) bash-3.2$ ipython
Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:43:17)
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: pd.read_pickle('foo.pkl')
Out[1]:
Empty DataFrame
Columns: []
Index: []
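As a rough sketch of how to choose the protocol value when the reader's Python version is known, the ranges quoted in the to_pickle docstring above can be turned into a small helper. The names `max_protocol_for` and `portable_protocol` are hypothetical and not part of pandas; the sketch uses only the standard library and covers only the protocols the docstring mentions (up to 4).

```python
import pickle


def max_protocol_for(python_version):
    """Highest pickle protocol a (major, minor) Python version can read.

    Follows the ranges in the docstring: Python 2.x reads 0-2,
    Python >= 3.0 adds 3, Python >= 3.4 adds 4.
    """
    if python_version < (3, 0):
        return 2
    if python_version < (3, 4):
        return 3
    return 4


def portable_protocol(reader_version):
    """Protocol that both the current writer and the given reader support."""
    return min(max_protocol_for(reader_version), pickle.HIGHEST_PROTOCOL)


# Writing for a Python 2.7 reader: protocol 2 is selected.
proto = portable_protocol((2, 7))
blob = pickle.dumps({"a": 1}, protocol=proto)
print(proto, pickle.loads(blob))
```

With pandas 0.21.0 or later, the same value could then be passed as `df.to_pickle('foo.pkl', protocol=portable_protocol((2, 7)))`, which matches the `protocol=2` call shown in the session above.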