Code Sample, a copy-pastable example if possible
import pandas as pd
reader = pd.read_hdf(filename, 'foo', chunksize=1000)
next(reader)
# TypeError: 'TableIterator' object is not an iterator
Problem description
Unless I'm missing something, I would expect something called TableIterator to be (itself) an iterator. Since this is an analog of pd.read_csv(), I would expect it to behave along the same lines (with chunksize, read_csv returns a TextFileReader, which works with next()).
reader.__iter__() is defined, so for chunk in reader works. It's just slightly surprising that TextFileReader defines __next__(), but TableIterator does not.
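To illustrate the distinction, here is a minimal sketch using a hypothetical stand-in class (IterableOnly is an assumption for illustration, not pandas code). It defines __iter__ as a generator method but no __next__, so a for loop works while next() on the object itself fails. Wrapping the object in iter() first gives something next() accepts, which may serve as a workaround for the reader above.

```python
class IterableOnly:
    """Hypothetical stand-in: defines __iter__ but not __next__."""

    def __init__(self, rows, chunksize):
        self.rows = rows
        self.chunksize = chunksize

    def __iter__(self):
        # Generator method: each call returns a fresh iterator over the chunks.
        for start in range(0, len(self.rows), self.chunksize):
            yield self.rows[start:start + self.chunksize]


reader = IterableOnly(list(range(5)), chunksize=2)

# next() on the object itself fails, because it has no __next__.
error_message = None
try:
    next(reader)
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# iter() calls __iter__ and returns a real iterator, so next() works on that.
chunks = iter(reader)
first_chunk = next(chunks)
second_chunk = next(chunks)
print(first_chunk, second_chunk)
```

With the real reader, the analogous call would be next(iter(reader)), assuming its __iter__ returns a proper iterator.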
Expected Output
next(reader) should give the dataframe corresponding to that chunk.
(Yes, I know iterators and iterables are slightly different, but in this case I would expect both to be supported.)
Output of pd.show_versions()
Comment From: jreback
This is not implemented (well, it IS, but it doesn't inherit from the proper machinery); see the issue here: https://github.com/pandas-dev/pandas/issues/9496. It's a pretty easy PR, actually, if you'd like to do this.
Here's a full repro
In [1]: tm.makeMixedDataFrame().to_hdf('foo.h5', 'df', mode='w', format='table')
In [2]: pd.read_hdf('foo.h5')
Out[2]:
     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07
In [3]: it = pd.read_hdf('foo.h5', chunksize=1)
In [4]: next(it)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-2cdb14c0d4d6> in <module>()
----> 1 next(it)
TypeError: 'TableIterator' object is not an iterator
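For anyone picking this up, one possible shape of the fix is sketched below on a hypothetical stand-in class (ChunkIterator is an assumption for illustration, not pandas' actual TableIterator or its base classes). The idea is only that __iter__ returns self and __next__ hands out one chunk at a time, which is what makes next(it) legal.

```python
class ChunkIterator:
    """Hypothetical stand-in, not pandas' TableIterator."""

    def __init__(self, rows, chunksize):
        self.rows = rows
        self.chunksize = chunksize
        self._start = 0  # position of the next chunk to hand out

    def __iter__(self):
        # Returning self makes the object its own iterator.
        return self

    def __next__(self):
        if self._start >= len(self.rows):
            raise StopIteration
        chunk = self.rows[self._start:self._start + self.chunksize]
        self._start += self.chunksize
        return chunk


it = ChunkIterator(list(range(5)), chunksize=2)
first = next(it)  # works directly, no iter() call needed
rest = list(it)   # the for-loop protocol still works and resumes after the first chunk
print(first, rest)
```

One trade-off of this design is that the object is single-pass: once exhausted, it does not restart, whereas a generator-based __iter__ yields a fresh pass on every call.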