Current version: 3.5.6
Environment: Java 8 + MySQL 5.7
Bug description
In a k8s cluster, with TYPE=ASSIGN_ID (snowflake algorithm), primary key ids collide. The error is:
thread t = Thread[adjustAngular-pool-22-thread-21,5,main], Throwable =
java.lang.RuntimeException: org.springframework.dao.DuplicateKeyException:
Error updating database. Cause: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry '1785223470585921538' for key 'PRIMARY'
The error may exist in com/alsc/operation/asset/mapper/AstMediaAngularAdjustMapper.java (best guess)
The error may involve com.alsc.operation.asset.mapper.AstMediaAngularAdjustMapper.insert-Inline
The error occurred while setting parameters
Cause: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry '1785223470585921538' for key 'PRIMARY'
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:242)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:88)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:440)
at com.sun.proxy.$Proxy159.insert(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:271)
at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:60)
at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:96)
at com.sun.proxy.$Proxy193.insert(Unknown Source)
at com.baomidou.mybatisplus.extension.service.IService.save(IService.java:59)
at com.baomidou.mybatisplus.extension.service.IService$$FastClassBySpringCGLIB$$f8525d18.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:684)
at com.alsc.operation.asset.service.impl.AstMediaAngularAdjustServiceImpl$$EnhancerBySpringCGLIB$$e27546bc.save(<generated>)
at com.alsc.operation.asset.provider.impl.AstMediaAngularAdjustProviderImpl.save(AstMediaAngularAdjustProviderImpl.java:28)
at com.alsc.operation.asset.provider.impl.AstMediaAngularAdjustProviderImpl$$FastClassBySpringCGLIB$$11be5b02.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:88)
What causes this? In a k8s cluster, the process ID of every pod is 1, so name.split(StringPool.AT)[0] in com.baomidou.mybatisplus.core.toolkit.Sequence.getMaxWorkerId(long datacenterId, long maxWorkerId) evaluates to "1" on every pod. Under the algorithm of com.baomidou.mybatisplus.core.toolkit.Sequence.getDatacenterId(), values can coincide after taking mod 32, and pods that land on the same datacenterId then also compute the same workerId. So of the two factors (datacenterId and workerId) that were meant to guarantee uniqueness across concurrent nodes, only one factor (datacenterId) is actually left.
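For reference, a condensed sketch of the relevant logic (paraphrased from the mybatis-plus 3.5.x Sequence source; see the actual class for the full version):

```java
// Paraphrased from com.baomidou.mybatisplus.core.toolkit.Sequence (3.5.x).
// getMaxWorkerId hashes the string "<datacenterId><pid>". In a k8s pod the
// JVM is PID 1, so the pid part is always "1" and the result degenerates
// into a pure function of datacenterId.
protected static long getMaxWorkerId(long datacenterId, long maxWorkerId) {
    StringBuilder mpid = new StringBuilder();
    mpid.append(datacenterId);
    String name = ManagementFactory.getRuntimeMXBean().getName(); // "1@<pod-name>" in a pod
    if (StringUtils.isNotBlank(name)) {
        mpid.append(name.split(StringPool.AT)[0]); // everything before '@' -> always "1"
    }
    return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
}
```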
For the error above, the MAC addresses of the 6 pods were:
1. 46:96:98:fd:68:f7
2. a6:59:0d:2b:11:f0
3. 06:d0:6f:1f:96:f8
4. 8e:a5:66:95:e6:c5
5. 42:68:2c:4b:c9:69
6. f6:c4:86:92:55:e6
Per the getDatacenterId() algorithm, the values before and after % 32 are:

| MAC address | id before % 32 | datacenterId | workerId |
|---|---|---|---|
| 46:96:98:fd:68:f7 | 419 | 3 | 30 |
| a6:59:0d:2b:11:f0 | 71 | 7 | 26 |
| 06:d0:6f:1f:96:f8 | 603 | 27 | 12 |
| 8e:a5:66:95:e6:c5 | 923 | 27 | 12 |
| 42:68:2c:4b:c9:69 | 805 | 5 | 28 |
| f6:c4:86:92:55:e6 | 343 | 23 | 16 |
As shown, two pods end up identical: datacenterId=27, workerId=12. When IDs are generated concurrently on those two pods, duplicates are produced; one pod's insert succeeds and the other fails with the error above.
Suggestion: in com.baomidou.mybatisplus.core.toolkit.Sequence.getMaxWorkerId(), change mpid.append(name.split(StringPool.AT)[0]); to mpid.append(name);
Since the result of getMaxWorkerId() is still reduced mod 32 in the end (a value in 0-31), there remains some chance that both datacenterId and workerId coincide, but the probability becomes very low (sketch below).
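A sketch of what the proposed change amounts to, assuming the rest of the method stays as-is: the full runtime name (pid@hostname) goes into the hash, and under k8s the hostname is the pod name, which differs per pod:

```java
// Proposed: hash the full "pid@hostname" instead of only the pid.
protected static long getMaxWorkerId(long datacenterId, long maxWorkerId) {
    StringBuilder mpid = new StringBuilder();
    mpid.append(datacenterId);
    String name = ManagementFactory.getRuntimeMXBean().getName(); // "1@app-0" vs "1@app-1"
    if (StringUtils.isNotBlank(name)) {
        mpid.append(name); // was: mpid.append(name.split(StringPool.AT)[0]);
    }
    return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
}
```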
Comment From: Jick-study
For this kind of unique id, I'd suggest just using the hutool utility class.
Comment From: zzwyad
> For this kind of unique id, I'd suggest just using the hutool utility class.
I checked the source of hutool-core-5.8.26.jar: the getPid() algorithm in cn.hutool.core.lang.Pid is the same and has the same problem:

    private static int getPid() throws UtilException {
        final String processName = ManagementFactory.getRuntimeMXBean().getName();
        if (StrUtil.isBlank(processName)) {
            throw new UtilException("Process name is blank!");
        }
        final int atIndex = processName.indexOf('@');
        if (atIndex > 0) {
            return Integer.parseInt(processName.substring(0, atIndex));
        } else {
            return processName.hashCode();
        }
    }

With atIndex = processName.indexOf('@'), the value parsed before '@' (and thus the return value) is 1 for every k8s pod.
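A quick way to verify this claim inside a pod (output values are illustrative):

```java
import java.lang.management.ManagementFactory;

public class PidCheck {
    public static void main(String[] args) {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        System.out.println(name);               // e.g. "1@app-7f9c4d-xk2lp" when the JVM is PID 1
        System.out.println(name.split("@")[0]); // "1" on every pod
    }
}
```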
Comment From: Honglonglong258
Running into the same problem.
Comment From: life-
I have seen the same problem in my project!
Could I propose a PR for this problem?
Comment From: NotoChen
With the snowflake algorithm, both mp's built-in com.baomidou.mybatisplus.core.toolkit.Sequence and hutool's cn.hutool.core.lang.Snowflake
provide a parameterized constructor, public Xxx(long workerId, long dataCenterId).
In a distributed high-concurrency scenario it is well worth using mp's com.baomidou.mybatisplus.core.incrementer.IdentifierGenerator:
define a custom ID generator and assign each application instance its own workerId and dataCenterId via configuration, so that each instance builds its own snowflake generator (see the first sketch below).
Alternatively, allocate the pair via a Redis counter: workerId is incremented on each acquisition, and once it exceeds 31, dataCenterId is incremented, which yields 32*32 distinct snowflake generators (see the second sketch below).
If the number of machines exceeds that, the snowflake algorithm itself can be adapted.
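A minimal sketch of the configuration-driven variant, assuming mybatis-plus-boot-starter (which picks up an IdentifierGenerator bean; worth verifying for your version); the property names are illustrative:

```java
import com.baomidou.mybatisplus.core.incrementer.DefaultIdentifierGenerator;
import com.baomidou.mybatisplus.core.incrementer.IdentifierGenerator;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SnowflakeIdConfig {

    // snowflake.worker-id / snowflake.datacenter-id are assumed property names,
    // injected per pod (env vars, Helm values, ConfigMap, ...).
    @Bean
    public IdentifierGenerator identifierGenerator(
            @Value("${snowflake.worker-id}") long workerId,
            @Value("${snowflake.datacenter-id}") long dataCenterId) {
        // Delegates to Sequence(workerId, dataCenterId) internally.
        return new DefaultIdentifierGenerator(workerId, dataCenterId);
    }
}
```

And a sketch of the Redis-counter variant, run once at startup; the key name is an assumption:

```java
import com.baomidou.mybatisplus.core.toolkit.Sequence;
import org.springframework.data.redis.core.StringRedisTemplate;

public final class RedisSlotSequence {

    // Slot n maps to workerId = n % 32, dataCenterId = (n / 32) % 32,
    // covering the 32 * 32 combinations mentioned above.
    public static Sequence create(StringRedisTemplate redis) {
        long slot = redis.opsForValue().increment("snowflake:slot"); // hypothetical key
        long workerId = slot % 32;
        long dataCenterId = (slot / 32) % 32;
        return new Sequence(workerId, dataCenterId);
    }
}
```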
Comment From: tanyaofei
Consider using a StatefulSet: the current pod name, which contains an ordinal such as app-0, app-1, can be read from an environment variable, and that value can serve as the workerId when building your own Sequence (sketch below).
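A minimal sketch of this idea, assuming a StatefulSet whose pod name ends in the ordinal and is exposed via the HOSTNAME environment variable:

```java
import com.baomidou.mybatisplus.core.toolkit.Sequence;

public final class PodOrdinalSequence {

    public static Sequence create() {
        String podName = System.getenv("HOSTNAME"); // e.g. "app-3" in a StatefulSet
        long ordinal = Long.parseLong(podName.substring(podName.lastIndexOf('-') + 1));
        long workerId = ordinal % 32;            // 5-bit workerId range
        long dataCenterId = (ordinal / 32) % 32; // spill over past 31 replicas
        return new Sequence(workerId, dataCenterId);
    }
}
```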
Comment From: LusiferCoder
I'd suggest using Meituan Leaf for distributed primary key ids and declaring the primary key as @TableId(value = "id", type = IdType.INPUT) (sketch below).
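A sketch of what that looks like on the entity side; LeafClient is a hypothetical client for the Leaf service, not a real API:

```java
import com.baomidou.mybatisplus.annotation.IdType;
import com.baomidou.mybatisplus.annotation.TableId;

public class AstMediaAngularAdjust {

    // IdType.INPUT tells MyBatis-Plus to persist whatever id the application
    // sets instead of generating one itself.
    @TableId(value = "id", type = IdType.INPUT)
    private Long id;

    // ... other fields, getters/setters
}

// Before saving (hypothetical Leaf client call):
// entity.setId(leafClient.nextId("ast_media_angular_adjust"));
```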
Comment From: uncarbon97
Huh? Don't you all derive the workerId from the NodeIP modulo the range? If there are too many machines, you can also hash the NodeIP with MurmurHash3 first and then take the modulus (sketch below).
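A sketch of the hashed variant, assuming the node IP is injected via the Downward API (the NODE_IP env var name is an assumption) and using Guava's Murmur3:

```java
import java.nio.charset.StandardCharsets;

import com.google.common.hash.Hashing;

public final class NodeIpWorkerId {

    public static long workerId() {
        String nodeIp = System.getenv("NODE_IP"); // e.g. "10.0.3.17"
        int h = Hashing.murmur3_32_fixed().hashString(nodeIp, StandardCharsets.UTF_8).asInt();
        return (h & Integer.MAX_VALUE) % 32; // clamp into the 5-bit workerId range
    }
}
```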
Comment From: Tomatocicier
This isn't an mp problem.
Comment From: qmdx
> (quoting NotoChen's comment above)
Exactly right. When multiple instances run concurrently on the same machine, especially in a docker environment, it is recommended to set the datacenterId and workerId parameters to solve this, or use Meituan Leaf's active allocation scheme.
Comment From: llllllxy
> (quoting NotoChen's comment above)
With this approach, a workerId and dataCenterId are wasted every time a service restarts.