Seata cluster, AT mode: RPC timeout when a service fails and rolls back, wait response error: cost 30002 ms

z0qdvdin · posted 2 months ago in: Other
Follow (0) | Answers (8) | Views (42)

Ⅰ. Issue Description

Service A calls Service B, and Service A fails.
The rollback then starts, at which point either Service A or Service B hits an RPC timeout.

This happens when Seata runs in cluster mode. The probability is fairly high, and it usually occurs right after the project starts.

Ⅱ. Describe what happened

Service A reports the following error:

2022-01-15 17:54:00.518 ERROR 23828 --- [io-10001-exec-4] i.s.c.r.n.AbstractNettyRemotingClient    : wait response error:cost 30002 ms,ip:10.180.41.23:8091,request:xid=10.180.41.23:8091:8169750104099594487,extraData=null
2022-01-15 17:54:00.521 ERROR 23828 --- [io-10001-exec-4] i.seata.tm.api.DefaultGlobalTransaction  : Failed to report global rollback [10.180.41.23:8091:8169750104099594487],Retry Countdown: 5, reason: RPC timeout

2022-01-15 17:55:40.556 ERROR 27940 --- [h_RMROLE_1_2_12] i.s.c.e.AbstractExceptionHandler         : Catch RuntimeException while do RPC, request: xid=10.180.41.23:8091:8169750104099594487,branchId=531645134559981799,branchType=AT,resourceId=jdbc:mysql://10.180.8.109:3306/club,applicationData=null

Service B reports the following error:

io.seata.common.exception.ShouldNeverHappenException: null
	at io.seata.rm.datasource.DataSourceManager.branchRollback(DataSourceManager.java:149) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.AbstractRMHandler.doBranchRollback(AbstractRMHandler.java:125) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:67) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.AbstractRMHandler$2.execute(AbstractRMHandler.java:63) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.exception.AbstractExceptionHandler.exceptionHandleTemplate(AbstractExceptionHandler.java:116) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.AbstractRMHandler.handle(AbstractRMHandler.java:63) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.DefaultRMHandler.handle(DefaultRMHandler.java:62) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.protocol.transaction.BranchRollbackRequest.handle(BranchRollbackRequest.java:35) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.rm.AbstractRMHandler.onRequest(AbstractRMHandler.java:150) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.handleBranchRollback(RmBranchRollbackProcessor.java:63) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.rpc.processor.client.RmBranchRollbackProcessor.process(RmBranchRollbackProcessor.java:58) [seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.rpc.netty.AbstractNettyRemoting.lambda$processMessage$2(AbstractNettyRemoting.java:278) [seata-all-1.4.0.jar:1.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_201]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.67.Final.jar:4.1.67.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_201]

2022-01-15 17:55:40.557 ERROR 27940 --- [ctor_RMROLE_1_1] i.s.core.rpc.netty.v1.ProtocolV1Encoder  : Encode request error!

java.lang.NullPointerException: null
	at io.seata.serializer.seata.protocol.transaction.AbstractBranchEndResponseCodec.encode(AbstractBranchEndResponseCodec.java:55) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.serializer.seata.SeataSerializer.serialize(SeataSerializer.java:47) ~[seata-all-1.4.0.jar:1.4.0]
	at io.seata.core.rpc.netty.v1.ProtocolV1Encoder.encode(ProtocolV1Encoder.java:96) ~[seata-all-1.4.0.jar:1.4.0]
	at io.netty.handler.codec.MessageToByteEncoder.write(MessageToByteEncoder.java:107) [netty-codec-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717) [netty-transport-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:764) [netty-transport-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1071) [netty-transport-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:164) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-transport-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.67.Final.jar:4.1.67.Final]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201]

The Seata server reports the following error:

17:55:40.551  INFO --- [     RetryRollbacking_1_1] io.seata.core.rpc.netty.ChannelManager   : Choose [id: 0xd9b16e43, L:/10.180.41.23:8091 - R:/10.180.41.23:57860] on the same IP[10.180.41.23] as alternative of seata-server:10.180.41.23:58135
17:55:40.557 ERROR --- [ttyServerNIOWorker_1_8_12] i.s.c.rpc.netty.AbstractNettyRemoting    : This rpcMessage body[null] is not MessageTypeAware type.
17:56:10.556 ERROR --- [     RetryRollbacking_1_1] i.s.c.rpc.netty.AbstractNettyRemoting    : wait response error:cost 30004 ms,ip:/10.180.41.23:57860,request:xid=10.180.41.23:8091:8169750104099594487,branchId=531645134559981799,branchType=AT,resourceId=jdbc:mysql://10.180.8.109:3306/club,applicationData=null
17:56:10.558 ERROR --- [     RetryRollbacking_1_1] io.seata.server.session.SessionHelper    : handle global session failed: 10.180.41.23:8091:8169750104099594487
==>
java.lang.ArrayStoreException: io.seata.core.exception.BranchTransactionException
        at io.seata.core.logger.StackTraceLogger.buildNewArgs(StackTraceLogger.java:82) ~[seata-core-1.4.2.jar:na]
        at io.seata.core.logger.StackTraceLogger.error(StackTraceLogger.java:61) ~[seata-core-1.4.2.jar:na]
        at io.seata.server.coordinator.DefaultCore.lambda$doGlobalRollback$3(DefaultCore.java:332) ~[seata-server-1.4.2.jar:na]
        at io.seata.server.session.SessionHelper.forEach(SessionHelper.java:182) ~[seata-server-1.4.2.jar:na]
        at io.seata.server.coordinator.DefaultCore.doGlobalRollback(DefaultCore.java:307) ~[seata-server-1.4.2.jar:na]
        at io.seata.server.coordinator.DefaultCoordinator.lambda$handleRetryRollbacking$2(DefaultCoordinator.java:301) [seata-server-1.4.2.jar:na]
        at io.seata.server.session.SessionHelper.forEach(SessionHelper.java:161) ~[seata-server-1.4.2.jar:na]
        at io.seata.server.coordinator.DefaultCoordinator.handleRetryRollbacking(DefaultCoordinator.java:281) [seata-server-1.4.2.jar:na]
        at io.seata.server.coordinator.DefaultCoordinator.lambda$init$5(DefaultCoordinator.java:403) [seata-server-1.4.2.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.8.0_201]
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source) ~[na:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source) ~[na:1.8.0_201]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[na:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:1.8.0_201]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-all-4.1.30.Final.jar:4.1.30.Final]
        at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_201]
<==

Ⅲ. Environment:

  • JDK version : JDK 8
  • Seata version: 1.4.2
  • OS : CentOS / Windows (same behavior on both)

1zmg4dgp1#

First, align the versions in use; your client is on 1.4.0.


8xiog9wr2#

Are you using Hikari? Upgrading the client to the latest version should fix this; alternatively, switch to Druid or proxy the datasource manually.
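For reference, "manual proxying" here generally means turning off Seata's automatic datasource proxy and wrapping the connection pool in io.seata.rm.datasource.DataSourceProxy yourself. A minimal configuration sketch, assuming seata-spring-boot-starter and Druid are on the classpath (the bean names and the spring.datasource.druid property prefix are illustrative, not from the thread):

```java
// application.yml (disable Seata's auto proxy first):
//   seata:
//     enable-auto-data-source-proxy: false
import javax.sql.DataSource;
import com.alibaba.druid.pool.DruidDataSource;
import io.seata.rm.datasource.DataSourceProxy;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourceConfig {

    @Bean
    @ConfigurationProperties(prefix = "spring.datasource.druid")
    public DruidDataSource druidDataSource() {
        return new DruidDataSource();
    }

    // Expose the Seata proxy as the primary DataSource so branch registration
    // and undo-log handling go through it instead of the raw pool.
    @Bean
    @Primary
    public DataSource dataSource(DruidDataSource druidDataSource) {
        return new DataSourceProxy(druidDataSource);
    }
}
```

Wiring the proxy explicitly like this also makes the resource registration deterministic at startup, rather than depending on when the auto-proxy post-processor runs.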


jm2pwxwz3#

First, align the versions in use; your client is on 1.4.0.

I tried 1.4.2, 1.4.1, and 1.4.0; the result is the same with all of them.


w80xi6nr4#

Are you using Hikari? Upgrading the client to the latest version should fix this; alternatively, switch to Druid or proxy the datasource manually.

Yes, we are using the Hikari pool bundled with Spring Boot 2.5.4. When we first integrated, we used the latest nacos-client 2.x, but it kept failing to find the Seata service, so we downgraded to version 1.4.2.


0x6upsns5#

it looks like a datasource proxy problem at startup.


lnlaulya6#

it looks like a datasource proxy problem at startup.

Yes. Whenever dataSourceProxy turns out to be null, the RPC times out after 30 s.

DataSourceManager.class

@Override
public BranchStatus branchRollback(BranchType branchType, String xid, long branchId, String resourceId,
                                   String applicationData) throws TransactionException {
    DataSourceProxy dataSourceProxy = get(resourceId);
    if (dataSourceProxy == null) {
        // The exception is thrown here: during rollback, Service A looks up Service B's MySQL
        // connection. The resourceId at this point is Service B's database URL, but
        // dataSourceCache has no entry for Service B's database.
        throw new ShouldNeverHappenException();
    }
    // ... omitted
}

t9aqgxwy7#

it looks like a datasource proxy problem at startup.

Yes. Whenever dataSourceProxy turns out to be null, the RPC times out after 30 s.

Then I must have misremembered; the develop branch has a further fix for this case. Switching to Druid should resolve it, or try proxying the datasource manually and disabling the auto proxy.


iugsix8n8#

I have run into this problem too.
Cause: with a very large transaction in local development, Seata writes a large number of rows to lock_table; our seata-server's backing store was a high-latency remote database, so the write took more than 30 seconds and triggered the error.
Fix: move the seata-server store to a low-latency local database. On 1.5.x you can also raise the timeout via the transport.rpcTcRequestTimeout parameter.
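For the 1.5.x timeout knob mentioned above, the client-side setting would look roughly like this in a Spring Boot application.yml (a sketch: the property name follows the seata-spring-boot-starter relaxed-binding convention for transport.rpcTcRequestTimeout, the 60000 value is just an example, and the option only exists from 1.5.x onward):

```yaml
seata:
  transport:
    # raise the client -> TC request timeout (milliseconds); the 30002 ms
    # "wait response error" in the logs matches the 30 s default
    rpc-tc-request-timeout: 60000
```

Raising the timeout only hides slow TC storage, though; keeping the seata-server database local and low-latency, as suggested above, addresses the root cause.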
