Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# with SQL Alchemy 1.4 installed the following example fails:
import pandas as pd
from sqlalchemy import create_engine
con = create_engine('sqlite:///:memory:')
df = pd.DataFrame({'a': [0, 1, 2, 3]})
df.to_sql('test', con)
=> AttributeError: 'Engine' object has no attribute 'cursor'

Issue Description

While Pandas 2.2 Release Notes declare sqlalchemy > 2.0.0 a minimum requirement, sqlalchemy 1.4 is still a widely used and maintained release. This change thus likely breaks a lot of existing code. Upgrading from sqlalchemy 1.4 to 2.0 is not trivial and therefore many projects using both pandas and sqlalchemy will not be able to upgrade to pandas > 2.2 easily.

Expected Behavior

SQLAlchemy 1.4 compatibility should be continued, and 1.4 support should be deprecated before removal.

Installed Versions

sphinx : 7.2.6 blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : None jinja2 : 3.1.3 IPython : 8.20.0 pandas_datareader : None adbc-driver-postgresql: None adbc-driver-sqlite : None bs4 : 4.12.3 bottleneck : None dataframe-api-compat : None fastparquet : None fsspec : None gcsfs : None matplotlib : None numba : None numexpr : None odfpy : None openpyxl : None pandas_gbq : None pyarrow : None pyreadstat : None python-calamine : None pyxlsb : None s3fs : None scipy : 1.11.4 sqlalchemy : None tables : None tabulate : 0.9.0 xarray : None xlrd : None zstandard : None tzdata : 2023.4 qtpy : None pyqt5 : None

Comment From: miraculixx

Further testing indicates the compatibility issue may only apply to SQLite. For example for MySQL + sqlalchemy 1.4 this still works:

import pandas as pd
from sqlalchemy import create_engine
con = create_engine('mysql+pymysql://root:password@localhost:3306/test')
df = pd.DataFrame({'a': [0, 1, 2, 3]})
df.to_sql('test', con, if_exists='replace')

Running mysql using docker docker run --name=mysql -e MYSQL_ROOT_PASSWORD=password -p 3306:3306 mysql

Comment From: lithomas1

While Pandas 2.2 Release Notes declare sqlalchemy > 2.0.0 a minimum requirement, sqlalchemy 1.4 is still a widely used and maintained release. This change thus likely breaks a lot of existing code. Upgrading from sqlalchemy 1.4 to 2.0 is not trivial and therefore many projects using both pandas and sqlalchemy will not be able to upgrade to pandas > 2.2 easily.

This seems reasonable to me, since it looks like sqlalchemy 1.4 is still getting releases, so it'd probably make sense to support it at least until sqlalchemy 2.1 comes out.

Usually, we bump minimum versions of dependencies once they become around 2 years old (and sqlalchemy 1.4 is already 3 years old), but I think was it's probably fine to make an exception for sqlalchemy this time.

cc @mroeschke

Comment From: amotl

Hi there. FWIW, we are hitting the same error on our CI with SQLAlchemy 1.4 on a 3rd-party SQLAlchemy dialect, see run #7705314750.

  • https://github.com/crate/crate-python/pull/607

Comment From: amotl

@mroeschke said at https://github.com/pandas-dev/pandas/issues/57137#issuecomment-1915754158:

The minimum sqlalchemy version for pandas 2.2 is 2.0. -- https://pandas.pydata.org/docs/whatsnew/v2.2.0.html#increased-minimum-versions-for-dependencies

So, I think it is clear that SQLAlchemy 1.4 will no longer be supported.

Comment From: mroeschke

Since there have been several reports about Sqlalchemy 1.4, I would be open to supporting that again. Testing and prior compatibility code will needed to be added back in a PR.

Comment From: amotl

Thanks Matthew. For us it will be totally fine to have pandas >= 2.2 with sqlalchemy >= 2 only, just to clarify. When others from the community don't have an easy chance to update, for any reasons, I surely also see that the need for backward-compatibility is being expressed.

Comment From: miried

snowflake-sqlalchemy pins sqlalchemy to <2.0.0 (see https://github.com/snowflakedb/snowflake-sqlalchemy/issues/452) which leads to a version conflict with pandas 2.2.

For reference, https://github.com/pandas-dev/pandas/pull/55524 introduced the version bump. @mroeschke does any pandas code require sqlalchemy features only available in 2.0? According to https://www.sqlalchemy.org/download.html, 1.4 still receives critical fixes, last 1.4 release is from Jan 3, 2024: https://pypi.org/project/SQLAlchemy/1.4.51/

Of course, I'd prefer other projects move their requirements up rather than pandas staying on old versions, but maybe in this case it's the simplest way to postpone the version upgrade?

By the way, I found the error message misleading with regards to the issue.

It looks like in https://github.com/pandas-dev/pandas/blob/2.2.x/pandas/io/sql.py#L900 the errors="ignore" suppresses the actual error message that indicates the version conflict, so one ends up with a cryptic error message because the code fails later (pip did not raise the version conflict either, I guess because it's an optional dependency)

Comment From: joshuataylor

snowflake-sqlalchemy also seems to throw this error when you use the Snowflake recommended way to create a connection when using private keys.

I suspect it's the way URL works, as it's a snowflake.sqlalcheny import? from snowflake.sqlalchemy import URL

Not sure if this a Snowflake SQLAlchemy issue or here though, apologies for the noise.

Comment From: lgonzalez-silen

Just to add more context here, Airflow is pinning their newest 2.8 images to pandas < 2.2:

https://github.com/apache/airflow/pull/37748

It does not seem that they are likely to drop their SQLAlchemy 1.4 support soon, since they have a few dependencies that are not SQLAlchemy 2.0 ready (as of December 2023):

https://github.com/apache/airflow/issues/28723

Google Composer released earlier this month their 2.6 version (Airflow 2.6-based), and I found out today when I upgraded that their choice to mix pandas 2.2.0 and SQLAlchemy 1.4.51 created issues running dags for me:

https://issuetracker.google.com/issues/327158030

Comment From: potiuk

FYI: We just updated constraints for 2.8.2 and updated images for Airflow 2.8.2 to keep pandas 2.1.4. That will not prevent the users hitting the same problem if they "just install" airfow from scratch with the new pandas, but if they follow the recommended way of installing airflow with constraints, they will be good: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html

Comment From: g24swint

Wanted to vote in favor of maintaining backward compatibility. We are also using a snowflake database behind sqlalchemy and are having this problem with the versioning. It would be nice if pandas could issue a deprecation error with potential to override first before fully abandoning support for SA 1.4

Comment From: rgoubet

Also having this issue with a postgresql+psycopg2 connection string

Comment From: mattholy

And also for pyhive

import pandas as pd
from sqlalchemy import create_engine
DW = create_engine("hive://xxxxxxx")
pd.read_sql('select 1=1',con=DW)

I got AttributeError: 'Engine' object has no attribute 'cursor' I'm using pandas 2.2.1 and sqlalchemy 1.4.52. I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x

Comment From: miraculixx

I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x

SQLAlchemy 1.4 is an officially supported release of sqlalchemy, albeit in maintenance mode now

Comment From: ptallada

Hi,

We are also using SQLAlchemy 1.4 and Pandas, and would greatly appreciate not having to migrate to SQLAlchemy v2.0 :)

Comment From: YarShev

Is there a planned resolution on this?

Comment From: lithomas1

PRs to fix this would be welcome.

Comment From: 1Elaigwu

I know that SQLAlchemy 1.4.x is very old, but sadly is, Apache-Airflow only supports SQLAlchemy 1.x

SQLAlchemy 1.4 is still officially supported

it should support with the upgrade. could not load data into mysql database on airflow. i can load the data when i run the script on standalone using SQLAlchemy 2.0.30. but not SQLAlchemy version 1.4.52. also after installing the newer version i get "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. apache-airflow 2.9.1 requires sqlalchemy<2.0,>=1.4.36, but you have sqlalchemy 2.0.30 which is incompatible. flask-appbuilder 4.4.1 requires SQLAlchemy<1.5, but you have sqlalchemy 2.0.30 which is incompatible. marshmallow-sqlalchemy 0.28.2 requires SQLAlchemy<2.0,>=1.3.0, but you have sqlalchemy 2.0.30 which is incompatible." at this point i have to use another RDBMS

Comment From: stefanks

Yes, please address this, thanks!

Comment From: ansuff

Registering my interest in this one as we had to downgrade to make our project work.

Comment From: JeremyBrent

bumping

Comment From: milannnnn

Any updates on this one?

Comment From: LuchiLucs

Found the same problem on the official AWS documentation page to handle Redshift commercial product with SQLAlchemy on Python: AWS Redshift

Here the constraints are: Python 3.9 (AWS Glue) SQL Alchemy < 2.0.0 (for the AWS Redshift dialect support)

Comment From: ltang

is there an update on this? SqlAlchemy 1.4 is still widely used for many SQL dialects. Now my workaround is passing the conn.connection.

import pandas as pd
from sqlalchemy import create_engine
# Run a simple test query using SQLAlchemy connection

with engine.connect() as conn:
    df = pd.read_sql_query("SELECT 1", conn.connection)
display(df)

Comment From: swarajban

+1 on this, this is a big issue affecting us as well

Comment From: geovalexis

+1

Comment From: analopez-livongo

+1 here