Code Sample, a copy-pastable example if possible
import pickle
pickle.load(open('./20170130T000000_20170202T000000_done.pkl', 'rb'))
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-2-4ccd85bbbf3f> in <module>()
----> 1 pickle.load(open('./20170130T000000_20170202T000000_done.pkl', 'rb'
2 ))
/Users/qingkaikong/miniconda2/lib/python2.7/pickle.pyc in load(file)
1382
1383 def load(file):
-> 1384 return Unpickler(file).load()
1385
1386 def loads(str):
/Users/qingkaikong/miniconda2/lib/python2.7/pickle.pyc in load(self)
862 while 1:
863 key = read(1)
--> 864 dispatch[key](self)
865 except _Stop, stopinst:
866 return stopinst.value
/Users/qingkaikong/miniconda2/lib/python2.7/pickle.pyc in load_global(self)
1094 module = self.readline()[:-1]
1095 name = self.readline()[:-1]
-> 1096 klass = self.find_class(module, name)
1097 self.append(klass)
1098 dispatch[GLOBAL] = load_global
/Users/qingkaikong/miniconda2/lib/python2.7/pickle.pyc in find_class(self, module, name)
1128 def find_class(self, module, name):
1129 # Subclasses may override this
-> 1130 __import__(module)
1131 mod = sys.modules[module]
1132 klass = getattr(mod, name)
ImportError: No module named index
Problem description
I generated a pickle file containing dataframe data on one machine with pandas 0.17.0, but when I try to load it on another machine with pandas 0.20.0, I get the error above. Opening the file with pandas 0.17.0 works fine. I expected pickles to be backward compatible, but that does not seem to be the case.
The problem is that I have many of these pickle files, generated with 0.17.0 over a long period. Is there a way to address this?
Thanks in advance.
Comment From: chris-b1
To read pickle files in a backward-compatible way, use pd.read_pickle.
See the note (warning box) in the docs here:
http://pandas.pydata.org/pandas-docs/stable/io.html#pickling
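A minimal sketch of that suggestion, for a single frame. The frame contents and the file name are made up for illustration; the point is only that the file is read back through pd.read_pickle instead of pickle.load.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame and file name, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})
path = os.path.join(tempfile.mkdtemp(), "frame.pkl")

# Write the frame with pandas' own writer ...
df.to_pickle(path)

# ... and read it back with pd.read_pickle rather than pickle.load,
# so that pandas' compatibility handling gets a chance to run.
restored = pd.read_pickle(path)
```

This round trip happens within one pandas version, so it shows the call pattern only, not a cross-version read.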
Comment From: qingkaikong
Thanks for the help, Chris.
I tried read_pickle, but the file actually holds a list of dataframes, i.e. [df1, df2, df3], and read_pickle fails on it. Is there another way to work around this?
item = "./20170115T000000_20170118T000000_done.pkl"
a = pd.read_pickle(item)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-0fbc5e41d68f> in <module>()
1 item = "./20170115T000000_20170118T000000_done.pkl"
----> 2 a = pd.read_pickle(item)
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/io/pickle.pyc in read_pickle(path, compression)
92 lambda f: pc.load(f, encoding=encoding, compat=True))
93 try:
---> 94 return try_read(path)
95 except:
96 if PY3:
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/io/pickle.pyc in try_read(path, encoding)
90 except:
91 return read_wrapper(
---> 92 lambda f: pc.load(f, encoding=encoding, compat=True))
93 try:
94 return try_read(path)
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/io/pickle.pyc in read_wrapper(func)
66 is_text=False)
67 try:
---> 68 return func(f)
69 finally:
70 for _f in fh:
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/io/pickle.pyc in <lambda>(f)
90 except:
91 return read_wrapper(
---> 92 lambda f: pc.load(f, encoding=encoding, compat=True))
93 try:
94 return try_read(path)
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/compat/pickle_compat.pyc in load(fh, encoding, compat, is_verbose)
192 up.is_verbose = is_verbose
193
--> 194 return up.load()
195 except:
196 raise
/Users/qingkaikong/miniconda2/lib/python2.7/pickle.pyc in load(self)
862 while 1:
863 key = read(1)
--> 864 dispatch[key](self)
865 except _Stop, stopinst:
866 return stopinst.value
/Users/qingkaikong/miniconda2/lib/python2.7/site-packages/pandas/compat/pickle_compat.pyc in load_reduce(self)
16 func = stack[-1]
17
---> 18 if type(args[0]) is type:
19 n = args[0].__name__ # noqa
20
IndexError: tuple index out of range
Comment From: chris-b1
I'm not particularly familiar with that code, but you're welcome to try hacking on the compat code to make it work; it may not be that hard. In general, our backward-compat handling applies only to saved frames. https://github.com/pandas-dev/pandas/blob/e437ad594048cc28873df13ccf50cd39a4e88dcb/pandas/compat/pickle_compat.py
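As a rough illustration of the general technique such a hack could build on: the first traceback fails inside find_class, which the pickle source itself marks as overridable. The sketch below subclasses pickle.Unpickler and redirects a module path that no longer exists to its replacement. It is not pandas' compat code; RenamingUnpickler, MODULE_RENAMES, and the legacy_index / current_index modules are hypothetical stand-ins, and a real mapping would have to be worked out from the actual pandas versions involved.

```python
import io
import pickle
import sys
import types


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects old module paths to their new locations."""

    # Hypothetical mapping: module path stored in the pickle -> current path.
    MODULE_RENAMES = {"legacy_index": "current_index"}

    def find_class(self, module, name):
        # Swap in the new module path before the normal lookup runs.
        module = self.MODULE_RENAMES.get(module, module)
        return pickle.Unpickler.find_class(self, module, name)


def _make_stand_in_module(module_name):
    """Register a throwaway module that exposes a tiny `Index` class."""
    cls = type("Index", (object,), {"__module__": module_name})
    mod = types.ModuleType(module_name)
    mod.Index = cls
    sys.modules[module_name] = mod
    return cls


# Pickle an object whose class lives in the "old" module.
OldIndex = _make_stand_in_module("legacy_index")
obj = OldIndex()
obj.values = [1, 2, 3]
payload = pickle.dumps(obj)

# Simulate the upgrade: the old module disappears, a new one replaces it.
del sys.modules["legacy_index"]
NewIndex = _make_stand_in_module("current_index")

# The remapping unpickler resolves the class in its new location.
restored = RenamingUnpickler(io.BytesIO(payload)).load()
```

Renaming modules only covers classes that moved. If the internal layout of the pickled objects also changed between versions, more than a rename table would be needed.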
If you often transfer data across pandas versions, you might consider a more stable serialization format (HDF5, parquet, msgpack).
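A one-time conversion along these lines could be sketched as follows. It is meant to run in the environment where the original files still load (pandas 0.17.0 in this thread) and saves each frame of the pickled list to its own file. The function name split_list_pickle and the output naming scheme are assumptions for illustration. Each frame is written with to_pickle here so the sketch needs no extra dependencies; the same loop could call DataFrame.to_hdf instead, assuming PyTables is installed.

```python
import os
import pickle

import pandas as pd


def split_list_pickle(src_path, out_dir):
    """Load a pickled list of DataFrames and save each frame on its own.

    Returns the list of paths written, in the order of the original list.
    """
    with open(src_path, "rb") as fh:
        frames = pickle.load(fh)

    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)

    base = os.path.splitext(os.path.basename(src_path))[0]
    written = []
    for i, frame in enumerate(frames):
        # Hypothetical naming scheme: <original name>_<position>.pkl
        out_path = os.path.join(out_dir, "{}_{}.pkl".format(base, i))
        frame.to_pickle(out_path)
        written.append(out_path)
    return written
```

On the newer machine, each resulting file holds a single frame and can be tried with pd.read_pickle, for example `frames = [pd.read_pickle(p) for p in paths]`. Whether every frame written by 0.17.0 then loads under 0.20.0 is not verified here.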
Comment From: qingkaikong
Thanks Chris, I will play with it and hopefully make it work for my case ^)^
I totally agree that another format would be better. I already use HDF5 for another project, but since all the code and workflow here rely on pickle files, I have been too lazy to change it. Thanks for the suggestions.