Describe the feature

Bulk insert inside a transaction that works regardless of max_allowed_packet and without having to specify a batchSize.

Motivation

pymysql's executemany(self, query, args) is easier to use, while GORM's CreateInBatches requires a batchSize argument.

Related Issues

Comment From: jinzhu

You can set the CreateBatchSize in gorm.Config when initializing GORM. This setting will influence the behavior of the Create method during bulk insert operations.

  1. Effect on the Create Method: When performing bulk inserts using the Create method, GORM will utilize the CreateBatchSize value as the default batch size to efficiently insert the data, especially when dealing with large volumes of records.

  2. No Impact on CreateInBatches: It's important to note that CreateBatchSize does not affect the CreateInBatches method. CreateInBatches always requires you to specify the batch size for each operation explicitly.

Here is an example to illustrate this:

package main

import (
    "log"

    "gorm.io/driver/mysql"
    "gorm.io/gorm"
)

// User is a placeholder model used only for this example.
type User struct {
    ID   uint
    Name string
}

func main() {
    // Define your database connection string
    dsn := "your_dsn"

    // Initialize GORM with a default CreateBatchSize
    db, err := gorm.Open(mysql.Open(dsn), &gorm.Config{
        CreateBatchSize: 1000, // Default batch size for bulk inserts using Create
    })
    if err != nil {
        log.Fatalf("failed to open database: %v", err)
    }

    // Placeholder data; in practice this slice may hold many thousands of records
    users := []User{{Name: "alice"}, {Name: "bob"}}

    // Create on a slice uses the default CreateBatchSize, so no batch size is passed here
    if err := db.Create(&users).Error; err != nil {
        log.Fatalf("bulk insert failed: %v", err)
    }
}

In this configuration, the Create method splits bulk inserts into batches of 1000 records, so large inserts can be handled without specifying a batch size on each call.

Comment From: albumcover

However, a batch size argument or config option may not be convenient or efficient enough. The length of the final composed statement varies with the table being used, so a fixed batch size can produce a statement that is either too long and exceeds the max_allowed_packet limit, or shorter than what executemany would send.