GORM Playground Link

https://github.com/go-gorm/playground/pull/737

Description

UniqueUser is a model which exists unique constraint on name. And its ID is auto-increment. The Code can be as follows:

type UniqueUser struct {
    gorm.Model
    Name string `gorm:"unique;type:varchar(32)"`
    Age  uint
}

When I perform a batch insertion and want to update all fields if the data to be inserted has a unique index conflict. I used the OnConflict{UpdateAll: true} method to implement the above function, but after execution, if there is an update, the returned ID is incorrect. The test code is as follows:

func TestGORM(t *testing.T) {

    addedUsers := []UniqueUser{
        {
            Name: "Alice",
            Age:  18,
        },
        {
            Name: "Bob",
            Age:  22,
        },
        {
            Name: "Cindy",
            Age:  25,
        },
    }

    // Drop TABLE IF EXISTS
    DB.Migrator().DropTable(&UniqueUser{})
    DB.AutoMigrate(&UniqueUser{})

    // create users
    DB.Create(&addedUsers)

    // In this case, I use a new slice, to simulate another business scenario
    // where new data is obtained and needs to be updated
    newUsers := []UniqueUser{}
    // one year later
    for _, user := range addedUsers {
        newUsers = append(newUsers, UniqueUser{
            Name: user.Name,
            Age:  user.Age + 1,
        })
    }

    // update users, use on conflict to update all fields
    DB.Clauses(clause.OnConflict{
        // add Columns to compatible with sqlite
        Columns:   []clause.Column{{Name: "name"}},
        UpdateAll: true,
    }).Create(&newUsers)

    // The bug here is that when using OnConflict{UpdateAll: true} for batch insertion,
    // if there is an update, the returned ID is incorrect

    // In this test case,
    // the expected result is that the second time the insertion is executed,
    // it should actually be updated,
    // so the ID returned by the corresponding object should be the same as the first time.
    for i := range newUsers {
        firseCreateUser := addedUsers[i]
        secondCreateUser := newUsers[i]
        if firseCreateUser.ID != secondCreateUser.ID {
            t.Errorf("Expected ID %d, got %d", firseCreateUser.ID, secondCreateUser.ID)
        }
    }
}

In this test case, the expected result is that the second time the insertion is executed, it should actually be updated, so the ID returned by the corresponding object should be the same as the first time.

But the result is as follows:

GORM_DIALECT=mysql go test
2024/05/18 14:31:04 testing mysql...

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/pkg/mod/gorm.io/driver/mysql@v1.5.2/migrator.go:200
[0.582ms] [rows:0] SET FOREIGN_KEY_CHECKS = 0;

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/pkg/mod/gorm.io/driver/mysql@v1.5.2/migrator.go:203
[9.195ms] [rows:0] DROP TABLE IF EXISTS `unique_users` CASCADE

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/pkg/mod/gorm.io/driver/mysql@v1.5.2/migrator.go:208
[0.651ms] [rows:0] SET FOREIGN_KEY_CHECKS = 1;

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/pkg/mod/gorm.io/driver/mysql@v1.5.2/migrator.go:326
[0.865ms] [rows:-] SELECT DATABASE()

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/pkg/mod/gorm.io/driver/mysql@v1.5.2/migrator.go:329
[2.028ms] [rows:1] SELECT SCHEMA_NAME from Information_schema.SCHEMATA where SCHEMA_NAME LIKE 'gorm%' ORDER BY SCHEMA_NAME='gorm' DESC,SCHEMA_NAME limit 1

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/src/github.com/jpbirdy/playground/main_test.go:33
[3.375ms] [rows:-] SELECT count(*) FROM information_schema.tables WHERE table_schema = 'gorm' AND table_name = 'unique_users' AND table_type = 'BASE TABLE'

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/src/github.com/jpbirdy/playground/main_test.go:33
[24.974ms] [rows:0] CREATE TABLE `unique_users` (`id` bigint unsigned AUTO_INCREMENT,`created_at` datetime(3) NULL,`updated_at` datetime(3) NULL,`deleted_at` datetime(3) NULL,`name` varchar(32),`age` bigint unsigned,PRIMARY KEY (`id`),INDEX `idx_unique_users_deleted_at` (`deleted_at`),CONSTRAINT `uni_unique_users_name` UNIQUE (`name`))

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/src/github.com/jpbirdy/playground/main_test.go:36
[7.194ms] [rows:3] INSERT INTO `unique_users` (`created_at`,`updated_at`,`deleted_at`,`name`,`age`) VALUES ('2024-05-18 14:31:04.428','2024-05-18 14:31:04.428',NULL,'Alice',18),('2024-05-18 14:31:04.428','2024-05-18 14:31:04.428',NULL,'Bob',22),('2024-05-18 14:31:04.428','2024-05-18 14:31:04.428',NULL,'Cindy',25)

2024/05/18 14:31:04 /Users/jpbirdy/Workspaces/go/src/github.com/jpbirdy/playground/main_test.go:54
[5.269ms] [rows:6] INSERT INTO `unique_users` (`created_at`,`updated_at`,`deleted_at`,`name`,`age`) VALUES ('2024-05-18 14:31:04.436','2024-05-18 14:31:04.436',NULL,'Alice',19),('2024-05-18 14:31:04.436','2024-05-18 14:31:04.436',NULL,'Bob',23),('2024-05-18 14:31:04.436','2024-05-18 14:31:04.436',NULL,'Cindy',26) ON DUPLICATE KEY UPDATE `updated_at`='2024-05-18 14:31:04.436',`deleted_at`=VALUES(`deleted_at`),`name`=VALUES(`name`),`age`=VALUES(`age`)
--- FAIL: TestGORM (0.06s)
    main_test.go:67: Expected ID 1, got 3
    main_test.go:67: Expected ID 2, got 4
    main_test.go:67: Expected ID 3, got 5
FAIL
exit status 1
FAIL    gorm.io/playground      1.004s

But I tested it and it works fine on sqlite and postgres.

However, I tried to debug it. The problem may be that Mysql does not support the RETURNING syntax. When GORM handles this situation when inserting data in Mysql, it will use LastInsertID to get the ID, and handle the auto-increment situation through the code. This situation is fine when all data is newly inserted, but when batch insertion, and there may be conflicting updates, the processing of LastInsertID will be problematic. Causing the returned ID to be incorrect.

Comment From: iv-stdoutlabs

Can confirm the bug. This also leads to failures on the persistance of the associations (foreign keys of associations are set incorrectly).

So far the only workaround I found is to manually request updated records, but it does not look generalisable as schemas can be real different and idk how one can figure out one sane general condition for WHERE clause.