Comment 0 for bug 901449

Francis J. Lacoste (flacoste) wrote : Rabbit failure when sending an OOPS seems to hang the producer

On 2011-12-07, we had a RabbitMQ outage which degraded into cascading failures in the Launchpad app servers.

The working theory is that a really big OOPS triggered the VM memory high watermark alarm in Rabbit:

=INFO REPORT==== 7-Dec-2011::21:06:44 === alarm_handler: {set,{vm_memory_high_watermark,[]}}

Shortly after that, app servers stopped responding to Nagios checks (10s socket timeout).

We probably aren't handling that Rabbit error very well.
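For discussion, here is a minimal sketch of one way to keep a stalled broker from hanging the producer. RabbitMQ blocks publishing connections while a memory alarm is set, so the idea is to bound how long a request thread waits on the OOPS publish and to drop the report if the deadline passes. This is not Launchpad's actual publishing code: `publish_with_deadline`, its `publish` callable and the 2 second default are all hypothetical names and values chosen for illustration.

```python
import threading


def publish_with_deadline(publish, payload, timeout=2.0):
    """Call publish(payload) in a helper thread, waiting at most `timeout` seconds.

    Returns True if the publish completed in time, False if it raised
    or was still blocked when the deadline passed. (Hypothetical helper.)
    """
    done = threading.Event()
    outcome = {"ok": False}

    def worker():
        try:
            publish(payload)
            outcome["ok"] = True
        except Exception:
            # A failed OOPS publish should not take the request down with it.
            outcome["ok"] = False
        finally:
            done.set()

    # Daemon thread: a publish stuck on a blocked socket will not
    # prevent the process from exiting.
    threading.Thread(target=worker, daemon=True).start()

    if not done.wait(timeout):
        # Deadline passed; give up on this report and let the caller continue.
        return False
    return outcome["ok"]
```

The caller would treat a False result as "OOPS not delivered" and carry on serving the request. Note that the abandoned helper thread stays alive for as long as the broker keeps the connection blocked, so a real fix would probably also want a socket-level timeout or a local fallback store rather than this alone.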