When using RabbitMQ in a multi-controller (HA) environment, if a controller node running the rabbitmq service goes down, it can take a long time (up to 15 minutes) for compute nodes to automatically recover and begin using an alternate rabbitmq service. The oslo.messaging rabbit driver already provides heartbeat configuration options to better handle these situations:
cfg.IntOpt('heartbeat_timeout_threshold',
           default=0,
           help="Number of seconds after which the Rabbit broker is "
                "considered down if heartbeat's keep-alive fails "
                "(0 disable the heartbeat). EXPERIMENTAL"),
cfg.IntOpt('heartbeat_rate',
           default=2,
           help='How often times during the heartbeat_timeout_threshold '
                'we check the heartbeat.'),
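As a sketch of the intended effect, a service configuration file rendered with these options enabled might contain something like the following (the values shown are illustrative, not the defaults):

```ini
[oslo_messaging_rabbit]
# Consider the broker down if heartbeat keep-alives fail for 60 seconds.
heartbeat_timeout_threshold = 60
# Check the heartbeat this many times per heartbeat_timeout_threshold window.
heartbeat_rate = 2
```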
These options will be added to the common messaging attributes, and then to every cookbook that uses rabbitmq.
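In a cookbook, exposing these as tunable attributes might look roughly like the following (the attribute paths here are hypothetical; the actual names depend on each cookbook's conventions):

```ruby
# Hypothetical attribute names -- adjust to each cookbook's namespace.
default['openstack']['mq']['rabbitmq']['heartbeat_timeout_threshold'] = 60
default['openstack']['mq']['rabbitmq']['heartbeat_rate'] = 2
```

Templates for each service's configuration file would then render these attributes into the [oslo_messaging_rabbit] section.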
https://github.com/openstack/oslo.messaging/blob/d685e6f80a5dfc5fba638beacde762c8ccf9a89d/oslo_messaging/_drivers/impl_rabbit.py#L138