Why does MySQL InnoDB produce so many deadlocks when Hangfire processes multiple jobs in parallel?

Posted by 2lpgd968 on 2021-06-15 in MySQL

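In my ASP.NET Core application I use Hangfire with MySQL database storage. I have an endpoint that, when hit, schedules a Hangfire job in the background. When I load-test this endpoint and send more than 40 concurrent requests, this code BackgroundJob.Schedule<IJobSchedulerCallbacks>(s => s.ScheduleSomeCode(); starts throwing the following exception: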

Hangfire.BackgroundJobClientException: Background job creation failed. See inner exception for details. ---> MySql.Data.MySqlClient.MySqlException: Deadlock found when trying to get lock; try restarting transaction
at MySql.Data.MySqlClient.MySqlStream.ReadPacket()
at MySql.Data.MySqlClient.NativeDriver.GetResult(Int32& affectedRow, Int64& insertedId)
at MySql.Data.MySqlClient.Driver.NextResult(Int32 statementId, Boolean force)
at MySql.Data.MySqlClient.MySqlDataReader.NextResult()
at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior)
at MySql.Data.MySqlClient.MySqlCommand.ExecuteNonQuery()
at Dapper.SqlMapper.ExecuteCommand(IDbConnection cnn, CommandDefinition& command, Action`2 paramReader)
at Dapper.SqlMapper.ExecuteImpl(IDbConnection cnn, CommandDefinition& command)
at Dapper.SqlMapper.Execute(IDbConnection cnn, String sql, Object param, IDbTransaction transaction, Nullable`1 commandTimeout, Nullable`1 commandType)
at Hangfire.MySql.MySqlWriteOnlyTransaction.<>c__DisplayClass14_0.<AddToSet>b__0(MySqlConnection x)
at Hangfire.MySql.MySqlWriteOnlyTransaction.<Commit>b__29_0(MySqlConnection connection)
at Hangfire.MySql.MySqlStorage.<>c__DisplayClass18_0.<UseTransaction>b__0(MySqlConnection connection)
at Hangfire.MySql.MySqlStorage.UseConnection[T](Func`2 func)
at Hangfire.MySql.MySqlStorage.UseTransaction[T](Func`2 func, Nullable`1 isolationLevel)
at Hangfire.MySql.MySqlStorage.UseTransaction(Action`1 action)
at Hangfire.MySql.MySqlWriteOnlyTransaction.Commit()
at Hangfire.Client.CoreBackgroundJobFactory.Create(CreateContext context)
at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass7_0.<CreateWithFilters>b__0()
at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(IClientFilter filter, CreatingContext preContext, Func`1 continuation)
at Hangfire.Client.BackgroundJobFactory.Create(CreateContext context)
at Hangfire.BackgroundJobClient.Create(Job job, IState state)
--- End of inner exception stack trace ---
at Hangfire.BackgroundJobClient.Create(Job job, IState state)
at Hangfire.BackgroundJobClientExtensions.Schedule[T](IBackgroundJobClient client, Expression`1 methodCall, TimeSpan delay)
at Hangfire.BackgroundJob.Schedule[T](Expression`1 methodCall, TimeSpan delay)

When I inspect the InnoDB log with SHOW ENGINE INNODB STATUS, I get the following output:

=====================================
2018-12-19 14:37:29 0x2ab9c5591700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 53 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 2441 srv_active, 0 srv_shutdown, 13392 srv_idle
srv_master_thread log flush and writes: 15830
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 7531
OS WAIT ARRAY INFO: signal count 8029
RW-shared spins 0, rounds 15152, OS waits 6763
RW-excl spins 0, rounds 15133, OS waits 270
RW-sx spins 58, rounds 1734, OS waits 37
Spin rounds per wait: 15152.00 RW-shared, 15133.00 RW-excl, 29.90 RW-sx
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-19 13:41:01 0x2aba11f50700

***(1) TRANSACTION:

TRANSACTION 88410, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 443, OS thread handle 46979012679424, query id 374494 172.31.25.222 cpdbuser update
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''475'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257

***(1) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88410 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

***(2) TRANSACTION:

TRANSACTION 88408, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 457, OS thread handle 46978653554432, query id 374490 172.31.25.222 cpdbuser update
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''474'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257

***(2) HOLDS THE LOCK(S):

RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88408 lock_mode X locks gap before rec
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

***(2) WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88408 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

***WE ROLL BACK TRANSACTION (1)
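
As a side note, SHOW ENGINE INNODB STATUS reports only the latest detected deadlock. The sketch below is not from the original post; it assumes an account with the privileges needed to change global server variables, and shows one way every deadlock could be written to the MySQL error log, which may help when many of them occur under load.

```sql
-- Show the InnoDB monitor output, including only the latest detected deadlock
-- (this is where the output above comes from).
SHOW ENGINE INNODB STATUS;

-- Log every deadlock to the MySQL error log, not just the latest one.
-- Needs a privileged account; the setting is lost on server restart
-- unless it is also placed in the server configuration file.
SET GLOBAL innodb_print_all_deadlocks = ON;

-- Verify the setting.
SHOW VARIABLES LIKE 'innodb_print_all_deadlocks';
```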

Note that the deadlock was created by these two very simple transactions, each consisting of a single INSERT statement:

INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''475'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''474'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257

Here is the Set table schema:

And here is the unique index on the Key and Value columns of the Set table:

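I found this Stack Overflow answer saying that MySQL InnoDB can produce deadlocks even in perfectly normal situations, which I find strange. As a workaround, I tried implementing an exponential backoff retry policy with Polly, which is a great library. But this only postponed the error: the job-scheduling code is now retried, and after the third retry the client connection is simply dropped because of the 30-second nginx response timeout.
First question: why does MySQL start deadlocking when this simple job-scheduling command is executed concurrently?
Second question: if InnoDB really does create deadlocks under normal conditions, how can MySQL be used in any production database that has to serve more concurrent users? Am I missing something?
(From the comments)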

CREATE TABLE `Set` (
    `Id` int(11) NOT NULL AUTO_INCREMENT, 
    `Key` varchar(100) NOT NULL, 
    `Value` varchar(256) NOT NULL, 
    `Score` double DEFAULT NULL, 
    `ExpireAt` datetime DEFAULT NULL, 
    PRIMARY KEY (`Id`), 
    UNIQUE KEY `IX_Set_Key_Value` (`Key`,`Value`)
) ENGINE=InnoDB AUTO_INCREMENT=143 DEFAULT CHARSET=latin1

Answer 1 (by wribegjk)

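First question: I don't know Hangfire, but it is unlikely that CoreBackgroundJobFactory.Create runs only a single INSERT query. At the very least it may perform a SELECT on another table, which can take locks of its own, and the combination of the two operations can end in a deadlock.
Second question: InnoDB's locking strategy depends on the transaction isolation level. If you run a high-concurrency environment, you can lower the isolation level, which reduces the probability of deadlocks. Some ACID side effects may appear as a result, although in my personal experience I have not run into any, even with READ UNCOMMITTED. You could try setting it in the Hangfire data source configuration and see what happens.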
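
To make the isolation-level suggestion concrete, here is a minimal sketch in plain SQL, not taken from the answer itself. It assumes you can run statements against the server with sufficient privileges; how Hangfire's own connections pick up the level depends on the storage provider's configuration, which is not shown here.

```sql
-- Check the current isolation level (InnoDB's default is REPEATABLE READ).
-- On MySQL 5.7.20+ and 8.0 the variable is transaction_isolation;
-- older servers expose it as tx_isolation instead.
SELECT @@global.transaction_isolation, @@session.transaction_isolation;

-- Lower the level for the current session only.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Or lower it for all connections opened from now on (privileged account).
SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

One caveat: the MySQL documentation notes that under READ COMMITTED gap locks are still used for duplicate-key checking, so on a unique index such as IX_Set_Key_Value this may reduce deadlocks rather than eliminate them.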
