repro: (1) make a psycopg2 conn (2) make a pandas df (3) call df.read_sql_query("SELECT pg_sleep(1000);", conn) (4) kill the terminal
actual: connection on backend is left hanging as idle in transaction
Comment From: ghost
I have get the same problem when i use pd.read_query() , I not close the connection. in my database the connection state is idle in transaction
. How can I make the connection idle?
Comment From: fukac99
any progress on this? I have the same problem
Comment From: TomAugspurger
Why do we think this is a pandas issue? When psycopg2 is used alone, does it correctly terminate the connection when the process is killed?
Comment From: max-sixty
Or even a python issue? If you 'kill the power', how would the transaction be closed?
(not saying this isn't an issue, but need an example where pandas is failing to close a connection)
Comment From: TomAugspurger
Indeed. In theory, psycopg2 could catch SIGINTs and try to abort the transaction. I'm not sure what's actually done.
Comment From: TomAugspurger
The first item in their FAQ is suggestive: http://initd.org/psycopg/docs/faq.html#problems-with-transactions-handling
Comment From: AlbertRothman
I found that pd.read_sql_query("SELECT * from mytable;", conn) leaves the transaction hanging as as "idle in transaction", but using pd.read_sql_query("SELECT * from mytable;", engine) closes the transaction with a ROLLBACK. I was using SQLAlchemy as my engine.
Comment From: jkdba
one solution would be to turn autocommit on the session
conn = pg.connect("...")
conn.set_session(autocommit=True)
# long running operation
# this looks like query executes and transaction commits changing connection state to idle
# next read_sql_query runs some version cursor.fetch() and parses to dataframe
df = pd.read_sql_query(query, con=conn)
the default is false for autocommit which I find odd considering how many other languages and applications out there default to true,
more on autocommit
Warning By default, any query execution, including a simple SELECT will start a transaction: for long-running programs, if no further action is taken, the session will remain “idle in transaction”, an undesirable condition for several reasons (locks are held by the session, tables bloat…). For long lived scripts, either ensure to terminate a transaction as soon as possible or use an autocommit connection.
Comment From: mroeschke
It's not clear this is a pandas issue specifically so closing.