Around the time you restarted, there will most likely be a stack trace with the full error and the reason why some processes did not start. Those are the logs I need to investigate further.
@dormanze inject a random delay in the 1-15s range into your clients so that they do not all run their declarations at once. There were several efficiency improvements around Khepri in the upcoming
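A minimal sketch of the random delay suggested above, assuming a Python client. The names `jittered_delay` and `declare_with_jitter` are hypothetical helpers, not part of any RabbitMQ client library, and `declare` stands for whatever callable your application uses to declare its queues and bindings.

```python
import random
import time


def jittered_delay(min_s=1.0, max_s=15.0, rng=random):
    """Pick a delay in seconds, uniformly distributed in [min_s, max_s]."""
    if min_s < 0 or max_s < min_s:
        raise ValueError("expected 0 <= min_s <= max_s")
    return rng.uniform(min_s, max_s)


def declare_with_jitter(declare, min_s=1.0, max_s=15.0,
                        sleep=time.sleep, rng=random):
    """Wait a random amount of time, then run the declarations.

    `declare` is a hypothetical zero-argument callable supplied by the
    application (for example, a function that declares its queues and
    bindings on an already open channel).
    """
    delay = jittered_delay(min_s, max_s, rng)
    sleep(delay)   # spread client start-up across the delay window
    declare()      # run the actual declarations after the wait
    return delay
```

Each client would call `declare_with_jitter(my_declare_function)` once at start-up, so that a batch of restarting clients spreads its declarations over roughly 15 seconds instead of issuing them all at the same moment.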
My logs show that not only does creating queues sometimes time out, but adding bindings times out as well.
Community Support Policy
RabbitMQ version used
4.1.2
Erlang version used
26.2.x
Operating system (distribution) used
linux
How is RabbitMQ deployed?
Generic binary package
rabbitmq-diagnostics status output
See https://www.rabbitmq.com/docs/cli to learn how to use rabbitmq-diagnostics
Logs from node 1 (with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 2 (if applicable, with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 3 (if applicable, with sensitive values edited out)
See https://www.rabbitmq.com/docs/logging to learn how to collect logs
rabbitmq.conf
See https://www.rabbitmq.com/docs/configure#config-location to learn how to find rabbitmq.conf file location
Steps to deploy RabbitMQ cluster
Configure the cluster and start it.
Steps to reproduce the behavior in question
Create a large number of quorum queues.
advanced.config
See https://www.rabbitmq.com/docs/configure#config-location to learn how to find advanced.config file location
Application code
# PASTE CODE HERE, BETWEEN BACKTICKS
Kubernetes deployment file
What problem are you trying to solve?
When my clients restart in batches (rolling upgrade), they generate a large number of queue declaration requests (10,000+ quorum queues and 50,000+ exclusive queues), and I occasionally receive error messages. We also periodically call the API /api/health/checks/port-listener/5673 to check whether the server is healthy. While the clients are starting, this API also frequently reports errors.
It seems that some I/O timeout errors occurred when creating the queues, and registering consumers then produced a large number of noproc exceptions.


After the client restart completed, some of my queues were left in an abnormal state, and the management UI indicated that one of my instances was experiencing an issue.
I did not encounter this issue in the same environment when using 4.0.x.
I have seen the related PR #15003. Could it explain why my consumer creation failed? And is the abnormal queue state caused by slow local I/O? Are there any parameters I can adjust to change this timeout?