Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
df = pd.DataFrame(...)
df.to_sql(...)

Issue Description

Function pandas.DataFrame.to_sql() writes records stored in a DataFrame to a SQL database. Databases supported by SQLAlchemy are supported. This function will call 'sql.py' in 'pandas/pandas/io/sql.py'.

When applied to a database supported by SQLAlchemy, 'sql.py' will use the class 'SQLAlchemyEngine(BaseEngine)', which has this line causing the reported bug: https://github.com/pandas-dev/pandas/blob/main/pandas/io/sql.py#L1392.

Basically, method 'insert_records' handles 'SQLAlchemyError' exceptions. But not all child class of base class 'SQLAlchemyError' has the attribute 'orig'. Doc: https://docs.sqlalchemy.org/en/14/core/exceptions.html

Therefore, adding a checking attribute step will fix this bug.

Expected Behavior

AttributeError: CompileError object has no attribute orig

Installed Versions

'1.5.3'

Comment From: phofl

Can you provide something reproducible?

Comment From: thanhbui221

My OS: macOS Ventura 13.0 MySQL: Ver 8.0.31 for macos13.0 on arm64 (Homebrew) sqlalchemy: 1.4.44

I created the database named ‘pandas_tosql_test’ in the local instance of mySQL server.

Reproducible Example

import pandas as pd
from sqlalchemy import MetaData, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.sql import text
from sqlalchemy.engine import create_engine
# These two test_frame functions are taken directly from 'pandas/tests/io/test_sql.py'
def test_frame1():
    columns = ["index", "A", "B", "C", "D"]
    data = [
        (
            "2000-01-03 00:00:00",
            0.980268513777,
            3.68573087906,
            -0.364216805298,
            -1.15973806169,
        ),
        (
            "2000-01-04 00:00:00",
            1.04791624281,
            -0.0412318367011,
            -0.16181208307,
            0.212549316967,
        ),
        (
            "2000-01-05 00:00:00",
            0.498580885705,
            0.731167677815,
            -0.537677223318,
            1.34627041952,
        ),
        (
            "2000-01-06 00:00:00",
            1.12020151869,
            1.56762092543,
            0.00364077397681,
            0.67525259227,
        ),
    ]
    return pd.DataFrame(data, columns=columns)

def test_frame2():
    columns = ["index", "A", "B", "C", "D", "E"]
    data = [
        (
            "2000-01-03 00:00:00",
            0.980268513777,
            3.68573087906,
            -0.364216805298,
            -1.15973806169,
            123
        ),
        (
            "2000-01-04 00:00:00",
            1.04791624281,
            -0.0412318367011,
            -0.16181208307,
            0.212549316967,
            123
        ),
        (
            "2000-01-05 00:00:00",
            0.498580885705,
            0.731167677815,
            -0.537677223318,
            1.34627041952,
            123
        ),
        (
            "2000-01-06 00:00:00",
            1.12020151869,
            1.56762092543,
            0.00364077397681,
            0.67525259227,
            123
        ),
    ]
    return pd.DataFrame(data, columns=columns)

def create_upsert_method(meta):
    def upsert_method(table, conn, keys, data_iter):
        sql_table = Table(table.name, meta, autoload=True)
        insert_stmt = mysql.insert(sql_table).values([dict(zip(keys, data)) for data in data_iter])
        upsert_stmt = insert_stmt.on_duplicate_key_update({x.name: x for x in insert_stmt.inserted if x.name in keys})
        conn.execute(upsert_stmt)
    return upsert_method

#change user and password according to yours.
user = 'user'
password = 'password'
host = '127.0.0.1'
port = 3306
schema = 'pandas_tosql_test'

engine = create_engine(
    f"mysql+mysqldb://{user}:{password}@{host}:{port}/{schema}")
conn = engine.connect()

# Firstly, create table 'test_frame' in the database 'pandas_tosql_test'.
with conn.begin():
    df1 = test_frame1()
    df1.to_sql('test_frame', con=engine)

# Secondly, upsert the data of another data frame to table 'test_frame'.
with conn.begin():
    df2 = test_frame2()
    meta = MetaData(conn)
    method = create_upsert_method(meta)
    df2.to_sql(name="test_frame", con=conn, if_exists="append", index=False, chunksize=1000, method=method)

#Error: AttributeError: 'CompileError' object has no attribute 'orig'
# The problem is that the schema of the two data frames is different, but above error did not show exactly it. 

Comment From: jomach

same error here