I am having trouble opening some old pickle files that I created a couple of years ago (July 2014). These files were pickled with the cPickle module in Python and, I am fairly sure, a version of pandas <= 0.12. I was using cPickle to load the files and getting a TypeError until I found the following answers on Stack Overflow:
http://stackoverflow.com/questions/20444593/pandas-compiled-from-source-default-pickle-behavior-changed
http://stackoverflow.com/questions/27950991/pandas-backwards-compatibility-issue-with-pickle-0-14-1-and-0-15-2
The solution seems to be to use pd.read_pickle to load these files instead of the Python module cPickle. However, when I try pd.read_pickle I get the following ImportError:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-8-451b9dbea93c> in <module>()
1 data_dir="J:\Pat's Projects\Dynamical Phase Transition\Mosaic Trajectories"
2 #fit_params=cPickle.load(open(data_dir+'circle_fitting_params_071614.pkl','r'))
----> 3 expt_data=pd.read_pickle(data_dir+'data_frames_071614.pkl')
4 #expt_list=cPickle.load(open(data_dir+'expt_list_071614.pkl','r'))
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\io\pickle.pyc in read_pickle(path)
63
64 try:
---> 65 return try_read(path)
66 except:
67 if PY3:
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\io\pickle.pyc in try_read(path, encoding)
60 except:
61 with open(path, 'rb') as fh:
---> 62 return pc.load(fh, encoding=encoding, compat=True)
63
64 try:
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\compat\pickle_compat.pyc in load(fh, encoding, compat, is_verbose)
115 up.is_verbose = is_verbose
116
--> 117 return up.load()
118 except:
119 raise
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in load(self)
862 while 1:
863 key = read(1)
--> 864 dispatch[key](self)
865 except _Stop, stopinst:
866 return stopinst.value
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in load_global(self)
1094 module = self.readline()[:-1]
1095 name = self.readline()[:-1]
-> 1096 klass = self.find_class(module, name)
1097 self.append(klass)
1098 dispatch[GLOBAL] = load_global
C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in find_class(self, module, name)
1128 def find_class(self, module, name):
1129 # Subclasses may override this
-> 1130 __import__(module)
1131 mod = sys.modules[module]
1132 klass = getattr(mod, name)
ImportError: No module named copy_reg
As I said, I don't know exactly which version of pandas was used when these files were pickled, but I think it was <= 0.12. I am using Python 2.7 installed from the Anaconda distribution.
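Since the pandas version that wrote the files is unknown, it may help to look at what a pickle references without actually unpickling it. The following is a minimal sketch using the standard-library pickletools module. The helper name referenced_globals is hypothetical, and the sample bytes are only a stand-in for the contents of a real file read in binary mode.

```python
import pickle
import pickletools


def referenced_globals(data):
    """List the 'module name' pairs that a pickle byte string would import.

    Hypothetical helper: it only walks the opcodes and never imports anything.
    """
    return [arg for opcode, arg, _pos in pickletools.genops(data)
            if opcode.name == "GLOBAL"]


# Stand-in data: a protocol-0 pickle, the text-based format Python 2 used by default.
sample = pickle.dumps(object(), protocol=0)
names = referenced_globals(sample)

for name in names:
    # repr() makes stray characters such as a trailing carriage return visible.
    print(repr(name))
```

Run against one of the old files, the list should show which pandas and NumPy classes the pickle expects to find, which can hint at the version that wrote it.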
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 30 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.1
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 2.0.0
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None
Comment From: jreback
copy_reg is a Python standard-library module, though it was renamed to copyreg in Python 3, so I'm not really sure what your issue is or how your file was written. pandas should be able to read pickles from >= 0.10.1, though it's an opaque format, so no guarantees.
So you should try an older version of pandas to read the file.
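One possible explanation, which is a guess and not something confirmed in this thread: the commented-out cPickle.load calls in the traceback open the files in text mode ('r'). If the files were also written in text mode on Windows, every newline in the protocol-0 pickle would have been stored as a carriage return plus newline. Reading such a file in binary mode, as the open(path, 'rb') call inside pd.read_pickle does, would then look up a module whose name ends in a carriage return, and the message would still print as "No module named copy_reg". The sketch below simulates this with an in-memory pickle; try_load is a hypothetical helper.

```python
import pickle

# Protocol 0 is the text-based format; it refers to copy_reg by name.
original = pickle.dumps(object(), protocol=0)

# Simulate a file written in text mode on Windows: "\n" is stored as "\r\n".
mangled = original.replace(b"\n", b"\r\n")


def try_load(data):
    """Return the unpickled object, or the ImportError raised while loading."""
    try:
        return pickle.loads(data)
    except ImportError as exc:
        return exc


ok = try_load(original)      # loads normally
failed = try_load(mangled)   # module name now carries a trailing "\r"

# Undo the newline translation and try again.
repaired = try_load(mangled.replace(b"\r\n", b"\n"))

print(repr(failed))
```

If this is what happened, checking whether the raw bytes of an old file contain b"\r\n" would be a quick test. Blindly replacing those bytes is only reasonable for text-based protocol-0 pickles, since binary protocols can contain that byte pair legitimately.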
Comment From: pfigliozzi
Thanks for the quick reply. I was hoping for a solution that works in my current environment, but that is okay. I made an environment with pandas 0.12 and was able to load the pickled DataFrame with cPickle.load. Note: pd.read_pickle did not work for loading the pickled file in my case, even with pandas 0.12. I guess the easiest way to use the data in newer versions of pandas is to save the DataFrame in a format that the newer version can read, such as JSON. Thanks a lot for the help, I should be good now.
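The conversion described above could look roughly like the following sketch. The helper names pickle_to_json and load_json and the file names are hypothetical, and the small DataFrame is only a stand-in so the example runs end to end. The file mode passed to open is also an assumption; use whichever mode worked in the old environment.

```python
import os
import pickle
import tempfile

import pandas as pd


def pickle_to_json(pickle_path, json_path):
    """Load a pickled DataFrame and re-save it as JSON (hypothetical helper).

    Intended to run in the environment that can still read the pickle.
    """
    with open(pickle_path, "rb") as fh:
        frame = pickle.load(fh)
    # orient="split" stores columns, index, and data under separate keys.
    frame.to_json(json_path, orient="split")


def load_json(json_path):
    """Read the exported file back; intended for the newer environment."""
    return pd.read_json(json_path, orient="split")


# Stand-in data in a temporary directory.
workdir = tempfile.mkdtemp()
pkl = os.path.join(workdir, "example.pkl")
out = os.path.join(workdir, "example.json")
pd.DataFrame({"particle": [1, 2], "x": [1.5, 2.5]}).to_pickle(pkl)

pickle_to_json(pkl, out)
restored = load_json(out)
print(restored)
```

Only the first function needs the old environment; once the JSON file exists, the pickle is no longer required.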
Comment From: jreback
@pfigliozzi yeah, as I said, since pickles are opaque, it is a bit hard to figure out what is wrong when the current tools don't work (which is also why, in general, they are not a good solution for long-term storage). Yep, exporting as CSV usually works (though it lacks the dtype info and is not 100% round-trippable), but it's pretty good, especially since you are reading into a newer version of pandas.
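To illustrate the dtype caveat, here is a small sketch with made-up data (the column names and values are not from the original files). It writes a frame to CSV in memory, reads it back with default settings, then reads it again with the dtypes supplied by hand.

```python
import io

import pandas as pd

# Made-up example: a datetime column and string labels with leading zeros.
frame = pd.DataFrame({
    "when": pd.to_datetime(["2014-07-16", "2014-07-17"]),
    "label": ["001", "002"],
})

buffer = io.StringIO()
frame.to_csv(buffer, index=False)

# Default read: nothing in the CSV records the original dtypes.
buffer.seek(0)
naive = pd.read_csv(buffer)

# Second read: state the dtypes explicitly to recover the original columns.
buffer.seek(0)
restored = pd.read_csv(buffer, dtype={"label": str}, parse_dates=["when"])

print(naive.dtypes)
print(restored.dtypes)
```

In the default read the labels come back as integers and the dates as plain text, so any dtype that matters has to be recorded somewhere alongside the CSV file.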