Your Question

I hit database deadlocks when I upsert data in parallel. How can I solve this?

Examples are as follows:

1. Database schema

CREATE TABLE `example_table` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
  `column1` VARCHAR(100) NOT NULL DEFAULT '',
  `column2` VARCHAR(50) NULL DEFAULT '',
  `column3` VARCHAR(150) NULL DEFAULT '',
  `column4` VARCHAR(10) NULL DEFAULT '',
  `column5` TEXT NULL DEFAULT NULL,
  `deleted_at` bigint(20) UNSIGNED NOT NULL DEFAULT '0' COMMENT 'deletion time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `udx_column1_column2_column3_deleted_at` (`column1`,`column2`,`column3`,`deleted_at`),
  KEY `idx_deleted_at` (`deleted_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='example_table';

CREATE TABLE `other_table` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
  `column1` VARCHAR(100) NOT NULL DEFAULT '',
  `column7` VARCHAR(10) NULL DEFAULT '',
  `column8` TEXT NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `udx_column1` (`column1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='other_table';
2. GORM code
// BatchUpsert updates rows that already exist and creates those that do not.
func (e *example) BatchUpsert(ctx context.Context, es []*model.Example) error {
    return e.db.Clauses(clause.OnConflict{
        Columns: []clause.Column{
            {Name: "column1"},
            {Name: "column2"},
            {Name: "column3"},
        },
        UpdateAll: true,
    }).Create(&es).Error
}
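
For reference, the exact statement this clause generates can be inspected without touching the database by rerunning the same query in GORM's DryRun session mode (DryRun is a real gorm.Session option; the sketch below assumes the same db handle and the same es slice as BatchUpsert):

// Sketch: print the SQL produced by the OnConflict clause; with DryRun
// enabled, GORM builds the statement but never executes it against MySQL.
stmt := e.db.Session(&gorm.Session{DryRun: true}).Clauses(clause.OnConflict{
    Columns: []clause.Column{
        {Name: "column1"},
        {Name: "column2"},
        {Name: "column3"},
    },
    UpdateAll: true,
}).Create(&es).Statement

// With the MySQL driver this prints an INSERT ... ON DUPLICATE KEY UPDATE
// statement like the one in the deadlock log below; stmt.Vars holds the values.
fmt.Println(stmt.SQL.String())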

3. Concurrent code. Splitting the data into chunks for parallel processing improves throughput, but it causes database deadlocks. Each task parses the data in column8 of other_table into column2, column3, column4, and column5 of example_table.

const (
    DataSize    = 100
    DataPerTask = 10
)
type Task struct {
    index int
    datas []*model.OtherData
    sum   int
    wg    *sync.WaitGroup
}

// handler parses a raw column8 payload into example_table rows.
func handler(raw string) []*model.Example { /* parsing elided */ return nil }

func (t *Task) Do() {
    defer t.wg.Done()
    for _, d := range t.datas {
        // Parse column8 into example_table rows, then upsert the batch.
        examples := handler(d.Column8)

        if err := example.BatchUpsert(ctx, examples); err != nil {
            log.Errorf("batch upsert failed: %v", err)
        }
    }
}
taskFunc := func(data interface{}) {
    task := data.(*Task)
    task.Do()
    fmt.Printf("task:%d sum:%d\n", task.index, task.sum)
}

// length=100, fetched from other_table
nums := make([]*model.OtherData, DataSize)

p, _ := ants.NewPoolWithFunc(10, taskFunc)
defer p.Release()
var wg sync.WaitGroup
wg.Add(DataSize / DataPerTask)
tasks := make([]*Task, 0, DataSize/DataPerTask)
for i := 0; i < DataSize/DataPerTask; i++ {
    task := &Task{
        index: i + 1,
        datas: nums[i*DataPerTask : (i+1)*DataPerTask],
        wg:    &wg,
    }
    tasks = append(tasks, task)
    if err := p.Invoke(task); err != nil {
        log.Errorf("invoke task %d failed: %v", task.index, err)
        wg.Done() // this task will never run, so release its WaitGroup slot
    }
}
wg.Wait()

The document where you expected this to be explained

It stands to reason that the other_table rows I fetch are unique on column1, and that column1 is also part of the unique index on example_table. The batches upserted by different tasks should therefore not overlap, yet a database deadlock still occurs.

  • error
Deadlock found when trying to get lock; try restarting transaction
[11.305ms] [rows:0] INSERT INTO `example_table` (`column1`,`column2`,`column3`,`deleted_at`,`column4`,`column5`) VALUES (),(),(),(),()... ON DUPLICATE KEY UPDATE `column1`=VALUES(`column1`),`column2`=VALUES(`column2`),`column3`=VALUES(`column3`),`deleted_at`=VALUES(`deleted_at`),`column4`=VALUES(`column4`),`column5`=VALUES(`column5`)
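
When this error occurs, the most useful diagnostic is the LATEST DETECTED DEADLOCK section of SHOW ENGINE INNODB STATUS, which names the two waiting statements and the exact index locks involved. A minimal sketch of capturing it from Go, assuming the same gorm.DB handle as above:

// Sketch: dump InnoDB's latest deadlock report right after the error.
// SHOW ENGINE INNODB STATUS returns three columns: Type, Name, Status.
var typ, name, status string
row := e.db.Raw("SHOW ENGINE INNODB STATUS").Row()
if err := row.Scan(&typ, &name, &status); err == nil {
    log.Errorf("innodb status:\n%s", status)
}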

Expected answer

Is this because I'm using upsert, because of a version problem, or something else? Looking forward to your reply.

go version: 1.16, gorm.io/driver/mysql v1.3.2, gorm.io/gorm v1.23.2, gorm.io/plugin/soft_delete v1.2.0

Comment From: a631807682

GORM itself has nothing to do with database deadlocks; whether deadlocks occur depends only on how you issue your database queries. Deadlocks are just (perhaps annoying) symptoms of a much bigger problem, so you should check your query statements and understand their execution process. Refer to https://bugs.mysql.com/bug.php?id=98324
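
That bug report describes the same pattern: INSERT ... ON DUPLICATE KEY UPDATE takes next-key (gap) locks on the unique index, so two transactions inserting disjoint rows in different key order can still deadlock. A common mitigation, shown below as a sketch rather than an official GORM feature (the Column1, Column2 and Column3 field names on model.Example are assumptions), is to sort each batch by the unique-key columns so concurrent statements acquire index locks in a consistent order, and to retry when MySQL reports deadlock error 1213:

import (
    "context"
    "errors"
    "sort"
    "time"

    "github.com/go-sql-driver/mysql"
)

// UpsertSorted sorts the batch by the unique-key columns before upserting,
// and retries when this transaction is chosen as the deadlock victim
// (MySQL error 1213).
func (e *example) UpsertSorted(ctx context.Context, es []*model.Example) error {
    sort.Slice(es, func(i, j int) bool {
        a, b := es[i], es[j]
        if a.Column1 != b.Column1 {
            return a.Column1 < b.Column1
        }
        if a.Column2 != b.Column2 {
            return a.Column2 < b.Column2
        }
        return a.Column3 < b.Column3
    })

    var err error
    for attempt := 0; attempt < 3; attempt++ {
        if err = e.BatchUpsert(ctx, es); err == nil {
            return nil
        }
        var myErr *mysql.MySQLError
        if !errors.As(err, &myErr) || myErr.Number != 1213 {
            return err // not a deadlock; do not retry
        }
        // Deadlock victim: back off briefly, then retry the whole batch.
        time.Sleep(time.Duration(attempt+1) * 50 * time.Millisecond)
    }
    return err
}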