[Bug] The rpcClientTableLock write lock in ClientManagerImpl hangs under abnormal network conditions, blocking all interaction with the broker #857
Open
Labels
type/bug
Before Creating the Bug Report
I found a bug, not just asking a question, which should be created in GitHub Discussions.
I have searched the GitHub Issues and GitHub Discussions of this repository and believe that this is not a duplicate.
I have confirmed that this bug belongs to the current repository, not other repositories of RocketMQ.
Programming Language of the Client
Java
Runtime Platform Environment
CentOS Linux 7 (Core)
RocketMQ Version of the Client/Server
server: rocketmq-all-5.1.4
client: 5.0.7
Run or Compiler Version
java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)
Describe the Bug
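In production, while the application was otherwise running normally, every module stopped interacting with MQ and the MQ client stopped writing any logs. Restarting the application modules restored service. Thread dumps showed that the threads were all stuck on the write lock of rpcClientTableLock in ClientManagerImpl.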
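The blocking pattern described above can be illustrated with a minimal, self-contained sketch. This is not the client's actual code: `WriteLockHangDemo`, `tableLock`, and the latch that stands in for a call stuck on a broken connection are all hypothetical. It only shows that while one thread holds the write lock of a `ReentrantReadWriteLock`, no other thread can take the read lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class WriteLockHangDemo {
    // Hypothetical stand-in for the lock that guards the RPC client table.
    static final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();

    /**
     * Returns true if a reader manages to take the read lock within waitMillis
     * while another thread holds the write lock and is stuck in a blocking call.
     */
    public static boolean readerGetsLockWhileWriterStuck(long waitMillis) throws InterruptedException {
        CountDownLatch writerHoldsLock = new CountDownLatch(1);
        CountDownLatch releaseWriter = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            tableLock.writeLock().lock();
            try {
                writerHoldsLock.countDown();
                // Assumption: simulates a blocking call on an abnormal connection.
                releaseWriter.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                tableLock.writeLock().unlock();
            }
        });
        writer.start();
        writerHoldsLock.await();

        // Any send/receive path that needs the table would wait here.
        boolean acquired = tableLock.readLock().tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            tableLock.readLock().unlock();
        }

        // Let the simulated blocking call finish so the demo terminates.
        releaseWriter.countDown();
        writer.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader acquired lock: " + readerGetsLockWhileWriterStuck(200));
    }
}
```

In the real client the reader would call `lock()` without a timeout, so it would wait for as long as the writer stays stuck, which matches the thread dumps.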
Steps to Reproduce
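What is known so far: RocketMQ 5.x has a memory leak triggered by aggressive liveness probing (https://github.com/apache/rocketmq/issues/8875). The leak in turn makes the affected network connections abnormal, which can leave all of the client's connections stuck.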
What Did You Expect to See?
Interaction between ClientManagerImpl and the MQ broker should not be blocked by a few abnormal network connections.
What Did You See Instead?
Currently, cleaning up a few abnormal connections to the MQ broker on the application side blocks all interaction with the MQ broker.
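One possible direction, sketched under assumptions rather than taken from the actual ClientManagerImpl source: hold the write lock only long enough to detach idle clients from the table, then run the potentially blocking shutdown outside the lock. `IdleClientSweeper`, the `Client` interface, and all method names below are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class IdleClientSweeper {
    /** Hypothetical minimal client interface; not the real RPC client type. */
    public interface Client {
        boolean isIdle();
        void shutdown() throws Exception; // may block on an abnormal connection
    }

    private final Map<String, Client> table = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public void put(String endpoint, Client client) {
        lock.writeLock().lock();
        try {
            table.put(endpoint, client);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public Client get(String endpoint) {
        lock.readLock().lock();
        try {
            return table.get(endpoint);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Removes idle clients under the write lock, then shuts them down without holding it. */
    public int clearIdleClients() {
        List<Client> detached = new ArrayList<>();
        lock.writeLock().lock();
        try {
            Iterator<Map.Entry<String, Client>> it = table.entrySet().iterator();
            while (it.hasNext()) {
                Client client = it.next().getValue();
                if (client.isIdle()) {
                    it.remove();          // cheap, in-memory only
                    detached.add(client);
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
        // Slow or blocking work happens here, so readers are never held up by it.
        for (Client client : detached) {
            try {
                client.shutdown();
            } catch (Exception e) {
                // A failed shutdown of one client should not stop the others.
            }
        }
        return detached.size();
    }
}
```

With this shape, a shutdown that hangs on a broken connection delays only the sweeper thread; threads that look up clients for healthy endpoints keep making progress. Bounding the shutdown itself with a timeout would be a complementary measure.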
Additional Context
No response