Pandas version checks
-
[X] I have checked that this issue has not already been reported.
-
[X] I have confirmed this bug exists on the latest version of pandas.
-
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
# with SQL Alchemy 1.4 installed the following example fails:
import pandas as pd
from sqlalchemy import create_engine
con = create_engine('sqlite:///:memory:')
df = pd.DataFrame({'a': [0, 1, 2, 3]})
df.to_sql('test', con)
=> AttributeError: 'Engine' object has no attribute 'cursor'
Issue Description
While Pandas 2.2 Release Notes declare sqlalchemy > 2.0.0 a minimum requirement, sqlalchemy 1.4 is still a widely used and maintained release. This change thus likely breaks a lot of existing code. Upgrading from sqlalchemy 1.4 to 2.0 is not trivial and therefore many projects using both pandas and sqlalchemy will not be able to upgrade to pandas > 2.2 easily.
Expected Behavior
SQLAlchemy 1.4 compatibility should be continued, and 1.4 support should be deprecated before removal.
Installed Versions
Comment From: miraculixx
Further testing indicates the compatibility issue may only apply to SQLite. For example for MySQL + sqlalchemy 1.4 this still works:
import pandas as pd
from sqlalchemy import create_engine
con = create_engine('mysql+pymysql://root:password@localhost:3306/test')
df = pd.DataFrame({'a': [0, 1, 2, 3]})
df.to_sql('test', con, if_exists='replace')
Running mysql using docker docker run --name=mysql -e MYSQL_ROOT_PASSWORD=password -p 3306:3306 mysql
Comment From: lithomas1
While Pandas 2.2 Release Notes declare sqlalchemy > 2.0.0 a minimum requirement, sqlalchemy 1.4 is still a widely used and maintained release. This change thus likely breaks a lot of existing code. Upgrading from sqlalchemy 1.4 to 2.0 is not trivial and therefore many projects using both pandas and sqlalchemy will not be able to upgrade to pandas > 2.2 easily.
This seems reasonable to me, since it looks like sqlalchemy 1.4 is still getting releases, so it'd probably make sense to support it at least until sqlalchemy 2.1 comes out.
Usually, we bump minimum versions of dependencies once they become around 2 years old (and sqlalchemy 1.4 is already 3 years old), but I think was it's probably fine to make an exception for sqlalchemy this time.
cc @mroeschke
Comment From: amotl
Hi there. FWIW, we are hitting the same error on our CI with SQLAlchemy 1.4 on a 3rd-party SQLAlchemy dialect, see run #7705314750.
- https://github.com/crate/crate-python/pull/607
Comment From: amotl
@mroeschke said at https://github.com/pandas-dev/pandas/issues/57137#issuecomment-1915754158:
The minimum sqlalchemy version for pandas 2.2 is 2.0. -- https://pandas.pydata.org/docs/whatsnew/v2.2.0.html#increased-minimum-versions-for-dependencies
So, I think it is clear that SQLAlchemy 1.4 will no longer be supported.
Comment From: mroeschke
Since there have been several reports about Sqlalchemy 1.4, I would be open to supporting that again. Testing and prior compatibility code will needed to be added back in a PR.
Comment From: amotl
Thanks Matthew. For us it will be totally fine to have pandas >= 2.2 with sqlalchemy >= 2 only, just to clarify. When others from the community don't have an easy chance to update, for any reasons, I surely also see that the need for backward-compatibility is being expressed.
Comment From: miried
snowflake-sqlalchemy pins sqlalchemy to <2.0.0 (see https://github.com/snowflakedb/snowflake-sqlalchemy/issues/452) which leads to a version conflict with pandas 2.2.
For reference, https://github.com/pandas-dev/pandas/pull/55524 introduced the version bump. @mroeschke does any pandas code require sqlalchemy features only available in 2.0? According to https://www.sqlalchemy.org/download.html, 1.4 still receives critical fixes, last 1.4 release is from Jan 3, 2024: https://pypi.org/project/SQLAlchemy/1.4.51/
Of course, I'd prefer other projects move their requirements up rather than pandas staying on old versions, but maybe in this case it's the simplest way to postpone the version upgrade?
By the way, I found the error message misleading with regards to the issue.
It looks like in https://github.com/pandas-dev/pandas/blob/2.2.x/pandas/io/sql.py#L900 the errors="ignore"
suppresses the actual error message that indicates the version conflict, so one ends up with a cryptic error message because the code fails later (pip did not raise the version conflict either, I guess because it's an optional dependency)
Comment From: joshuataylor
snowflake-sqlalchemy also seems to throw this error when you use the Snowflake recommended way to create a connection when using private keys.
I suspect it's the way URL works, as it's a snowflake.sqlalcheny import? from snowflake.sqlalchemy import URL
Not sure if this a Snowflake SQLAlchemy issue or here though, apologies for the noise.
Comment From: lgonzalez-silen
Just to add more context here, Airflow is pinning their newest 2.8 images to pandas < 2.2:
https://github.com/apache/airflow/pull/37748
It does not seem that they are likely to drop their SQLAlchemy 1.4 support soon, since they have a few dependencies that are not SQLAlchemy 2.0 ready (as of December 2023):
https://github.com/apache/airflow/issues/28723
Google Composer released earlier this month their 2.6 version (Airflow 2.6-based), and I found out today when I upgraded that their choice to mix pandas 2.2.0 and SQLAlchemy 1.4.51 created issues running dags for me:
https://issuetracker.google.com/issues/327158030
Comment From: potiuk
FYI: We just updated constraints for 2.8.2 and updated images for Airflow 2.8.2 to keep pandas 2.1.4. That will not prevent the users hitting the same problem if they "just install" airfow from scratch with the new pandas, but if they follow the recommended way of installing airflow with constraints, they will be good: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html
Comment From: g24swint
Wanted to vote in favor of maintaining backward compatibility. We are also using a snowflake database behind sqlalchemy and are having this problem with the versioning. It would be nice if pandas could issue a deprecation error with potential to override first before fully abandoning support for SA 1.4
Comment From: rgoubet
Also having this issue with a postgresql+psycopg2
connection string
Comment From: mattholy
And also for pyhive
import pandas as pd
from sqlalchemy import create_engine
DW = create_engine("hive://xxxxxxx")
pd.read_sql('select 1=1',con=DW)
I got AttributeError: 'Engine' object has no attribute 'cursor'
I'm using pandas 2.2.1
and sqlalchemy 1.4.52
. I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x
Comment From: miraculixx
I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x
SQLAlchemy 1.4 is an officially supported release of sqlalchemy, albeit in maintenance mode now
Comment From: ptallada
Hi,
We are also using SQLAlchemy 1.4 and Pandas, and would greatly appreciate not having to migrate to SQLAlchemy v2.0 :)
Comment From: YarShev
Is there a planned resolution on this?
Comment From: lithomas1
PRs to fix this would be welcome.
Comment From: 1Elaigwu
I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x
SQLAlchemy 1.4 is still officially supported
it should support with the upgrade. could not load data into mysql database on airflow. i can load the data when i run the script on standalone using SQLAlchemy 2.0.30. but not SQLAlchemy version 1.4.52. also after installing the newer version i get "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. apache-airflow 2.9.1 requires sqlalchemy<2.0,>=1.4.36, but you have sqlalchemy 2.0.30 which is incompatible. flask-appbuilder 4.4.1 requires SQLAlchemy<1.5, but you have sqlalchemy 2.0.30 which is incompatible. marshmallow-sqlalchemy 0.28.2 requires SQLAlchemy<2.0,>=1.3.0, but you have sqlalchemy 2.0.30 which is incompatible." at this point i have to use another RDBMS
Comment From: stefanks
Yes, please address this, thanks!
Comment From: ansuff
Registering my interest in this one as we had to downgrade to make our project work.
Comment From: JeremyBrent
bumping
Comment From: milannnnn
Any updates on this one?
Comment From: LuchiLucs
Found the same problem on the official AWS documentation page to handle Redshift commercial product with SQLAlchemy on Python: AWS Redshift
Here the constraints are: Python 3.9 (AWS Glue) SQL Alchemy < 2.0.0 (for the AWS Redshift dialect support)
Comment From: ltang
is there an update on this? SqlAlchemy 1.4 is still widely used for many SQL dialects. Now my workaround is passing the conn.connection.
import pandas as pd
from sqlalchemy import create_engine
# Run a simple test query using SQLAlchemy connection
with engine.connect() as conn:
df = pd.read_sql_query("SELECT 1", conn.connection)
display(df)
Comment From: swarajban
+1 on this, this is a big issue affecting us as well
Comment From: geovalexis
+1
Comment From: analopez-livongo
+1 here