I am having an issue opening some old pickle files that I created a couple of years ago (July 2014). These files were pickled using the cPickle module in Python, and I am fairly sure with a version of pandas <= 0.12. I was using cPickle to load the files and getting a TypeError until I found the following answers on Stack Overflow:

http://stackoverflow.com/questions/20444593/pandas-compiled-from-source-default-pickle-behavior-changed

http://stackoverflow.com/questions/27950991/pandas-backwards-compatibility-issue-with-pickle-0-14-1-and-0-15-2

The solution appears to be to use pd.read_pickle to load these files instead of the Python cPickle module. However, when I try pd.read_pickle I get the following ImportError:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-8-451b9dbea93c> in <module>()
      1 data_dir="J:\Pat's Projects\Dynamical Phase Transition\Mosaic Trajectories"
      2 #fit_params=cPickle.load(open(data_dir+'circle_fitting_params_071614.pkl','r'))
----> 3 expt_data=pd.read_pickle(data_dir+'data_frames_071614.pkl')
      4 #expt_list=cPickle.load(open(data_dir+'expt_list_071614.pkl','r'))

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\io\pickle.pyc in read_pickle(path)
     63 
     64     try:
---> 65         return try_read(path)
     66     except:
     67         if PY3:

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\io\pickle.pyc in try_read(path, encoding)
     60             except:
     61                 with open(path, 'rb') as fh:
---> 62                     return pc.load(fh, encoding=encoding, compat=True)
     63 
     64     try:

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\site-packages\pandas\compat\pickle_compat.pyc in load(fh, encoding, compat, is_verbose)
    115         up.is_verbose = is_verbose
    116 
--> 117         return up.load()
    118     except:
    119         raise

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in load(self)
    862             while 1:
    863                 key = read(1)
--> 864                 dispatch[key](self)
    865         except _Stop, stopinst:
    866             return stopinst.value

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in load_global(self)
   1094         module = self.readline()[:-1]
   1095         name = self.readline()[:-1]
-> 1096         klass = self.find_class(module, name)
   1097         self.append(klass)
   1098     dispatch[GLOBAL] = load_global

C:\Users\Scherer Lab E\Anaconda2\envs\170112\lib\pickle.pyc in find_class(self, module, name)
   1128     def find_class(self, module, name):
   1129         # Subclasses may override this
-> 1130         __import__(module)
   1131         mod = sys.modules[module]
   1132         klass = getattr(mod, name)

ImportError: No module named copy_reg

Like I said, I don't know exactly which version of pandas was used when these were pickled, but I think it was <= 0.12. I am using Python 2.7 installed from the Anaconda distribution.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 30 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.1
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 2.0.0
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None

Comment From: jreback

copy_reg is a Python built-in module (renamed to copyreg in Python 3), so I'm not really sure what your issue is or how your file was written. pandas should be able to read pickles from >= 0.10.1, though it's an opaque format, so no guarantees.

So you should try reading the file with an older version of pandas.
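
One thing that may be worth checking, offered as a guess rather than a diagnosis: the commented-out cPickle calls in the traceback open the files with 'r' (text mode). If the files were also written in text mode on Windows, a protocol 0 pickle would contain CRLF line endings on disk. The `load_global` frame in the traceback reads the module name with `readline()[:-1]`, which would then leave a trailing carriage return on `copy_reg`. The sketch below uses a stand-in pickle, and the helper names `looks_crlf_damaged` and `normalize_newlines` are hypothetical. It only shows how to detect and undo that kind of damage.

```python
import pickle


def looks_crlf_damaged(raw):
    """Return True if the raw pickle bytes contain CRLF line endings."""
    return b"\r\n" in raw


def normalize_newlines(raw):
    """Convert CRLF back to LF.

    Only meaningful for protocol 0 (text-based) pickles. Binary protocols
    can contain these bytes legitimately and must not be rewritten.
    """
    return raw.replace(b"\r\n", b"\n")


# Stand-in data: build a protocol 0 pickle, then damage it the way a
# text-mode write on Windows would.
original = pickle.dumps({"a": [1, 2, 3]}, protocol=0)
damaged = original.replace(b"\n", b"\r\n")

# Undo the damage and load the result.
restored = pickle.loads(normalize_newlines(damaged))
```

For a real file, read the bytes with `open(path, 'rb')` and work on a copy, not on the original.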

Comment From: pfigliozzi

Thanks for the quick reply. I was hoping for a solution that uses my current environment, but that is okay. I made an environment with pandas 0.12 and was able to load the pickled DataFrame with cPickle.load. Note: pd.read_pickle did not work for my file even under pandas 0.12. I guess the easiest way to use the data in newer versions of pandas is to save the DataFrame in a format that the newer version can read, such as JSON. Thanks a lot for the help, I should be good now.
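
A minimal sketch of that two-step migration, assuming the pickle holds a single DataFrame and that `to_json` with `orient="split"` is available in the old environment. The function names and file names are hypothetical. Under Python 2 the first step would use cPickle instead of pickle. Both steps run in one environment here only so the example is self-contained.

```python
import os
import pickle
import tempfile

import pandas as pd


def export_pickle_to_json(pickle_path, json_path):
    """Step 1 (old environment): unpickle the frame and re-save it as JSON."""
    with open(pickle_path, "rb") as fh:  # binary mode, never 'r'
        frame = pickle.load(fh)
    frame.to_json(json_path, orient="split")


def import_json(json_path):
    """Step 2 (new environment): read the JSON file back into a DataFrame."""
    return pd.read_json(json_path, orient="split")


# Stand-in data so the sketch can run end to end.
workdir = tempfile.mkdtemp()
pickle_path = os.path.join(workdir, "frames.pkl")
json_path = os.path.join(workdir, "frames.json")

sample = pd.DataFrame({"x": [0.5, 1.5], "label": ["a", "b"]})
with open(pickle_path, "wb") as fh:
    pickle.dump(sample, fh)

export_pickle_to_json(pickle_path, json_path)
restored = import_json(json_path)
```

`orient="split"` stores the columns, index, and data separately, which keeps the column order intact on the way back in.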

Comment From: jreback

@pfigliozzi yeah, as I said, since pickles are opaque it is a bit hard to figure out what is wrong if the current tools don't work (which is, in general, why they are not a good solution for long-term storage). Yep, exporting as CSV usually works (though it lacks the dtype info and is not 100% round-trippable), and it is pretty good, especially since you are reading into a newer version of pandas.
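
To illustrate the dtype caveat, here is a small sketch with made-up column names. A string column that happens to look numeric is re-inferred as an integer when the CSV is read back, unless the dtypes are passed explicitly. Recording the dtypes at export time is one possible workaround, not something pandas does for you.

```python
import io

import pandas as pd

# Stand-in frame: "run_id" is a string column whose values look numeric.
frame = pd.DataFrame({"run_id": ["007", "012"], "count": [3, 5]})

buffer = io.StringIO()
frame.to_csv(buffer, index=False)

# Naive read: dtypes are inferred, so the leading zeros are lost.
buffer.seek(0)
naive = pd.read_csv(buffer)

# Read with the dtypes that were noted down when the CSV was written.
buffer.seek(0)
typed = pd.read_csv(buffer, dtype={"run_id": str, "count": "int64"})

print(naive["run_id"].tolist())  # [7, 12]
print(typed["run_id"].tolist())  # ['007', '012']
```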