Pandas Can't create DataFrame from SQLite3 cursor

When I pass a cursor as data to the DataFrame constructor, an error occurs.

cursor = sqlite.execute(sql)
pd.DataFrame(cursor)


/usr/lib/python3/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    255                                          copy=copy)
    256         elif isinstance(data, collections.Iterator):
--> 257             raise TypeError("data argument can't be an iterator")
    258         else:
    259             try:

TypeError: data argument can't be an iterator

But a normal generator is accepted:

def gen():
    yield (1,2,3)
    yield (4,5,6)
    yield (7,8,9)

pd.DataFrame(gen())
Out[171]: 
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9

This feels inconsistent.

Comment From: jreback

What is type(cursor).__mro__?

Comment From: hexum

type(cursor).__mro__
(sqlite3.Cursor, object)

Comment From: hexum

It's not a big issue: passing the cursor to the list constructor gives a normal list, which DataFrame accepts. I just can't understand why an iterable is not accepted. Type checking is bad practice in Python, isn't it? Why not just check for the ability to iterate?

hasattr([], "__iter__")
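
For reference, the list workaround mentioned above might look like the following minimal sketch. The in-memory database, the `points` table, and its column names are made up for illustration; taking the column names from `cursor.description` is an addition that the original report did not use.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database and table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?, ?)",
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
)

cursor = conn.execute("SELECT x, y, z FROM points ORDER BY x")

# cursor.description holds one 7-tuple per result column; item 0 is the name.
columns = [col[0] for col in cursor.description]

# Consume the cursor into a plain list of tuples before handing it to pandas.
rows = list(cursor)
df = pd.DataFrame(rows, columns=columns)

conn.close()
```

If the goal is only to load a query result, `pd.read_sql_query(sql, conn)` is another option that avoids handling the cursor directly.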

Comment From: jreback

Type checking in Python is fine. In pandas it is actually quite a bit more complicated, because we need to determine whether, for example, a list of lists or a list of scalars was passed, and that is where it gets problematic.

So an iterable must have __iter__ AND __len__. A cursor doesn't have __len__, while range(5), for example, does. (I know that in theory rowcount works for this, but I don't think it is a guaranteed property.) So a Cursor should be handled much like a GeneratorType (and not an Iterator).
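
A small standard-library check can make the distinction concrete. This is only a sketch: the `describe` helper is hypothetical and is not how pandas classifies its input, it just reports which protocols each object satisfies.

```python
import sqlite3
from collections.abc import Iterator, Sized
from types import GeneratorType


def gen():
    yield (1, 2, 3)


def describe(obj):
    """Hypothetical helper: report which protocols an object satisfies."""
    return {
        "iterator": isinstance(obj, Iterator),        # has __iter__ and __next__
        "generator": isinstance(obj, GeneratorType),  # created by a generator function
        "sized": isinstance(obj, Sized),              # has __len__
    }


conn = sqlite3.connect(":memory:")
cursor = conn.execute("SELECT 1, 2, 3")

cursor_info = describe(cursor)
gen_info = describe(gen())
range_info = describe(range(5))

conn.close()
```

Under these checks a cursor and a generator are both iterators without a length, and only the generator is a `GeneratorType`, whereas `range(5)` has a length but is not an iterator.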

Comment From: hexum

I just created a generator with no defined length, and DataFrame accepts it as I expected. I think we should turn off the type check. In any case, a user can construct an infinite generator that passes the type check.

In [20]: def g():
   ....:     for i in range(5):
   ....:         yield [i, i ** 2, i ** 3]
   ....:         

In [21]: DataFrame(g())
Out[21]: 
   0   1   2
0  0   0   0
1  1   1   1
2  2   4   8
3  3   9  27
4  4  16  64

Comment From: jreback

A generator is fine; what you have is an iterator. You can certainly make any changes you like, but they would need to pass the test suite as is.

Comment From: hexum

Hmmm. I'll see how to overcome it.

Comment From: ns-cweber

@jreback Out of curiosity, why is a generator fine, but not an iterator? It looks like DataFrame's constructor checks to see if the data argument is a GeneratorType and then wraps it (data = list(data)), but if it's an iterator it raises an exception.

Comment From: jreback

See my comment above: an iterator is not sufficient, as it's not required to have a len.

This is an old issue, so I don't really remember. Have a look at the test code for Frame construction.

Comment From: ns-cweber

A generator is also not required to have a len. The solution for generators in the DataFrame constructor is to consume the generator into a list() and from there treat it as though a list had been passed as the data arg. The same should work for iterators. If you're worried about infinite iterators, then you should be equally worried about infinite generators, unless I'm misunderstanding something. I'll take a look at the tests.

Comment From: TomAugspurger

Duplicate of https://github.com/pandas-dev/pandas/issues/2193

I also think it'll be possible to turn the iterable into a list, just like with a generator.
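
The idea of consuming any iterator into a list, the way generators are already handled, could be sketched as below. `materialize` and its `max_rows` parameter are hypothetical names for illustration and are not part of pandas; the optional cap is one possible guard against the infinite-iterator concern raised earlier in the thread.

```python
from collections.abc import Iterator
from itertools import count, islice


def materialize(data, max_rows=None):
    """Hypothetical helper: turn any iterator (generators included) into a list.

    Non-iterator inputs such as lists are returned unchanged.
    """
    if isinstance(data, Iterator):
        if max_rows is not None:
            # Stop after max_rows items so an infinite iterator cannot hang.
            data = islice(data, max_rows)
        return list(data)
    return data


rows_from_iter = materialize(iter([(1, 2), (3, 4)]))
rows_capped = materialize(count(), max_rows=3)
original = [(5, 6)]
passthrough = materialize(original)
```

The resulting list can then go to the DataFrame constructor exactly as if the caller had passed a list in the first place.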