Broker show alive=false in controller getSyncStateSet command #213

Open
drivebyer opened this issue Feb 29, 2024 · 2 comments

Comments

@drivebyer
Contributor

drivebyer commented Feb 29, 2024

BUG REPORT

  1. Please describe the issue you observed:
    I deployed three controllers, two brokers, and one nameserver using an operator. After ensuring all pods were ready, I executed commands on the nameserver and the controllers.

On the nameserver, I ran the following command:

[root@master0 ~]# kubectl -n mcamel-system exec -it name-service-0 -- ./mqadmin clusterList -n 127.0.0.1:9876
#Cluster Name           #Broker Name            #BID  #Addr                  #Version              #InTPS(LOAD)     #OutTPS(LOAD)  #Timer(Progress)        #PCWait(ms)  #Hour         #SPACE    #ACTIVATED
broker                  broker-0                0     192.168.137.126:10911  V5_1_4                 0.00(0,0ms)       0.00(0,0ms)  0-0(0.0w, 0.0, 0.0)               0  474775.65     0.6800          true
broker                  broker-0                2     192.168.84.199:10911   V5_1_4                 0.00(0,0ms)       0.00(0,0ms)  2-0(0.0w, 0.0, 0.0)               0  474775.65     0.6500         false

The output looked as expected.

On the controller, I executed:

[root@master0 ~]# kubectl -n mcamel-system exec -it controller-1 -- ./mqadmin getSyncStateSet -a 127.0.0.1:9878 -c broker -b broker-0

#brokerName	broker-0
#MasterBrokerId	1
#MasterAddr	192.168.137.126:10911
#MasterEpoch	1
#SyncStateSetEpoch	1
#SyncStateSetNums	1

InSyncReplica:	ReplicaIdentity{brokerName='broker-0', brokerId=1, brokerAddress='192.168.137.126:10911', alive=true}

NotInSyncReplica:	ReplicaIdentity{brokerName='broker-0', brokerId=2, brokerAddress='192.168.84.199:10911', alive=false}

The controller reports the replica at 192.168.84.199:10911 (brokerId=2) as not alive.
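For anyone who wants to check the `alive` flags across several brokers without reading the output by eye, here is a minimal sketch that parses the `ReplicaIdentity` lines. It assumes the output format is exactly as pasted above; `find_dead_replicas` is a hypothetical helper name, not part of mqadmin.

```python
import re

# Matches lines in the getSyncStateSet output pasted above, e.g.
# NotInSyncReplica:	ReplicaIdentity{brokerName='broker-0', brokerId=2, ...}
# The format is taken from this issue only and may differ in other versions.
REPLICA_RE = re.compile(
    r"^(?P<state>InSyncReplica|NotInSyncReplica):\s*ReplicaIdentity\{"
    r"brokerName='(?P<name>[^']*)', brokerId=(?P<id>\d+), "
    r"brokerAddress='(?P<addr>[^']*)', alive=(?P<alive>true|false)\}"
)


def find_dead_replicas(output):
    """Return (brokerName, brokerId, brokerAddress) for replicas with alive=false."""
    dead = []
    for line in output.splitlines():
        match = REPLICA_RE.match(line.strip())
        if match and match.group("alive") == "false":
            dead.append(
                (match.group("name"), int(match.group("id")), match.group("addr"))
            )
    return dead


# Sample copied from the output in this issue.
SAMPLE = (
    "InSyncReplica:\tReplicaIdentity{brokerName='broker-0', brokerId=1, "
    "brokerAddress='192.168.137.126:10911', alive=true}\n"
    "\n"
    "NotInSyncReplica:\tReplicaIdentity{brokerName='broker-0', brokerId=2, "
    "brokerAddress='192.168.84.199:10911', alive=false}\n"
)

if __name__ == "__main__":
    print(find_dead_replicas(SAMPLE))
```

The command output can be piped into this script instead of using the embedded sample.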

I also found the following errors in the log of the master broker at 192.168.137.126:10911:

2024-02-29 15:50:26 ERROR AutoSwitchHAService_Executor_1 - Error happen when change SyncStateSet, broker:broker-0, masterAddress:192.168.137.126:10911, masterEpoch:1, oldSyncStateSet:[1], newSyncStateSet:[1, 2], syncStateSetEpoch:1
org.apache.rocketmq.client.exception.MQBrokerException: CODE: 2006  DESC: Rejecting alter syncStateSet request because the replicas {2} don't alive
For more information, please visit the url, https://rocketmq.apache.org/docs/bestPractice/06FAQ
	at org.apache.rocketmq.broker.out.BrokerOuterAPI.alterSyncStateSet(BrokerOuterAPI.java:1215)
	at org.apache.rocketmq.broker.controller.ReplicasManager.doReportSyncStateSetChanged(ReplicasManager.java:761)
	at org.apache.rocketmq.store.ha.autoswitch.AutoSwitchHAService.lambda$null$0(AutoSwitchHAService.java:263)
	at java.util.ArrayList.forEach(ArrayList.java:1257)
	at org.apache.rocketmq.store.ha.autoswitch.AutoSwitchHAService.lambda$notifySyncStateSetChanged$1(AutoSwitchHAService.java:263)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2024-02-29 15:50:30 INFO ReplicasManager_ScheduledService_1 - Update controller leader address to controller-1.controller-svc-headless:9878
2024-02-29 15:50:31 ERROR AutoSwitchHAService_Executor_1 - Error happen when change SyncStateSet, broker:broker-0, masterAddress:192.168.137.126:10911, masterEpoch:1, oldSyncStateSet:[1], newSyncStateSet:[1, 2], syncStateSetEpoch:1
org.apache.rocketmq.client.exception.MQBrokerException: CODE: 2006  DESC: Rejecting alter syncStateSet request because the replicas {2} don't alive
For more information, please visit the url, https://rocketmq.apache.org/docs/bestPractice/06FAQ
	at org.apache.rocketmq.broker.out.BrokerOuterAPI.alterSyncStateSet(BrokerOuterAPI.java:1215)
	at org.apache.rocketmq.broker.controller.ReplicasManager.doReportSyncStateSetChanged(ReplicasManager.java:761)
	at org.apache.rocketmq.store.ha.autoswitch.AutoSwitchHAService.lambda$null$0(AutoSwitchHAService.java:263)
	at java.util.ArrayList.forEach(ArrayList.java:1257)
	at org.apache.rocketmq.store.ha.autoswitch.AutoSwitchHAService.lambda$notifySyncStateSetChanged$1(AutoSwitchHAService.java:263)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
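The same rejection repeats every few seconds in the log above. As a rough sketch for counting which replica IDs the controller keeps rejecting, the snippet below scans a broker log for the `CODE: 2006` message. The message wording is copied from this issue; `rejected_replicas` is a hypothetical helper, and the assumption that several IDs would appear comma-separated inside the braces is mine, not confirmed by the log.

```python
import re
from collections import Counter

# Message text copied from the broker log in this issue.
REJECT_RE = re.compile(
    r"CODE: 2006\s+DESC: Rejecting alter syncStateSet request "
    r"because the replicas \{([^}]*)\} don't alive"
)


def rejected_replicas(log_text):
    """Count how often each replica ID appears in a rejection message."""
    counts = Counter()
    for match in REJECT_RE.finditer(log_text):
        # Assumption: multiple IDs, if any, are comma-separated.
        for part in match.group(1).split(","):
            part = part.strip()
            if part:
                counts[int(part)] += 1
    return counts


# Two rejection lines as they appear in the log above.
SAMPLE_LOG = (
    "org.apache.rocketmq.client.exception.MQBrokerException: CODE: 2006  DESC: "
    "Rejecting alter syncStateSet request because the replicas {2} don't alive\n"
    "org.apache.rocketmq.client.exception.MQBrokerException: CODE: 2006  DESC: "
    "Rejecting alter syncStateSet request because the replicas {2} don't alive\n"
)

if __name__ == "__main__":
    print(rejected_replicas(SAMPLE_LOG))
```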
  • What did you expect to see?
    All brokers should show alive=true.

  • What did you see instead?

  1. Please tell us about your environment:
    RocketMQ 5.1.4

  2. Other information (e.g. detailed explanation, logs, related issues, suggestions how to fix, etc):
    When I deploy a single-replica controller, this issue does not occur.

@drivebyer
Contributor Author

@caigy PTAL

@drivebyer
Contributor Author

See the discussion in the main repository: apache/rocketmq#7877
