Comment 4 for bug 1487517

Bogdan Dobrelya (bogdando) wrote :

Indeed. Good catch, thank you.
We discussed this with Dmitry Mescheryakov, and he suggested not fixing this directly, but instead introducing a rabbitmqctl failure counter, kept in the CIB alongside the rabbit uptime.

The issue is that, it seems, there are too many restart actions induced by hung rabbitmqctl * checks under load. Fixing this bug directly would only have increased the number of restarts under high load even more. So the idea is to increment that counter on each failed rabbitmqctl check and to report the resource as failed only when the counter has exceeded a threshold.
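
To illustrate the idea, here is a rough Python sketch of the counter logic only. It is not the resource agent code: the class name, the in-memory counter (standing in for the value kept in the CIB) and the threshold value are all hypothetical. Resetting the counter after a successful check is also my assumption; it was not part of the discussion.

```python
class FailureCounter:
    """Hypothetical stand-in for a rabbitmqctl failure counter kept in the CIB."""

    def __init__(self, threshold):
        # Number of failed checks tolerated before the resource is reported failed.
        self.threshold = threshold
        self.failures = 0

    def record(self, check_ok):
        """Record one check result; return True if the resource should be reported failed."""
        if check_ok:
            # Assumption: a successful check clears the counter.
            self.failures = 0
            return False
        self.failures += 1
        # Report failure only once the threshold has been exceeded.
        return self.failures > self.threshold


counter = FailureCounter(threshold=2)
results = [counter.record(ok) for ok in (False, False, False)]
```

With a threshold of 2, the first two failed checks are tolerated and only the third one marks the resource as failed, so a couple of hung checks under load would not trigger a restart.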