I am so used to manipulating data with pandas that I don't want to fall back to SQL unless absolutely necessary. In fact, I'd rather not use SQL at all. Nonetheless, SQL works better for data that does not fit in memory (obviously, out-of-core data is not the primary focus of pandas, and yes, I am aware of Dask).

Would it be possible to use pandas on data that resides in an SQL database without loading it into memory? I am not asking whether it would be possible to access the data, but whether it would be possible to use pandas as a front end to SQL. For instance: I write pandas code, and SQL is executed in the background. I want to count all the values in a column, but I write .value_counts() instead of a SQL query. Do you see what I mean?

Does that make any sense? Apologies if my question is off-topic or naive.

Comment From: jorisvandenbossche

I think you want to take a look at the ibis or blaze projects, which both have a pandas-like Python API that translates to an underlying storage system (decoupling the expression from the actual computation).
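
To make the idea of decoupling the expression from the computation concrete, here is a minimal sketch using only the standard-library `sqlite3` module. It does not use the ibis or blaze APIs; the `LazyColumn` class, the `trips` table, and the `city` column are all hypothetical names made up for illustration. The point is only that a pandas-style `.value_counts()` call can be turned into a `GROUP BY` query that runs inside the database, so only the counts are returned to Python.

```python
import sqlite3


class LazyColumn:
    """Hypothetical stand-in for a pandas-like column backed by a SQL table."""

    def __init__(self, conn, table, column):
        self.conn = conn
        self.table = table
        self.column = column

    def value_counts(self):
        # Build the SQL that a pandas-style .value_counts() roughly corresponds to.
        # NULLs are excluded, mirroring the pandas default of dropping missing values.
        # Identifiers are only double-quoted; this sketch assumes trusted names.
        sql = (
            f'SELECT "{self.column}", COUNT(*) AS n '
            f'FROM "{self.table}" '
            f'WHERE "{self.column}" IS NOT NULL '
            f'GROUP BY "{self.column}" '
            f'ORDER BY n DESC'
        )
        # The aggregation runs in the database; only the counts come back.
        return self.conn.execute(sql).fetchall()


# Small in-memory database standing in for a table too large to load.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT)")
conn.executemany(
    "INSERT INTO trips VALUES (?)",
    [("Paris",), ("Oslo",), ("Paris",), (None,), ("Paris",), ("Oslo",), ("Rome",)],
)

counts = LazyColumn(conn, "trips", "city").value_counts()
print(counts)  # [('Paris', 3), ('Oslo', 2), ('Rome', 1)]
```

Projects like ibis generalize this pattern to whole expressions (filters, joins, aggregations) and to multiple backends, rather than hand-writing one query per method as this sketch does.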