Blocked publishers due to rabbitmq disk/memory alarm causes indefinitely frozen RPC calls with zero errors

Bug #1454449 reported by Assaf Muller
This bug affects 1 person
Affects: oslo.messaging
Status: Fix Released
Importance: Medium
Assigned to: Mehdi Abaakouk

Bug Description

From the rabbit server log:
=INFO REPORT==== 12-May-2015::18:21:04 ===
Disk free space insufficient. Free bytes:974610432 Limit:1000000000

=WARNING REPORT==== 12-May-2015::18:21:04 ===
disk resource limit alarm set on node rabbit@localhost.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
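
For monitoring purposes, the disk alarm line above can be parsed to see how far free space has dropped below the limit. This is only a minimal sketch: the helper name `disk_shortfall` is hypothetical, and it assumes the log line keeps the exact `Free bytes:<n> Limit:<n>` format shown in this report.

```python
import re

# Matches the numeric fields of the "Disk free space insufficient" line.
_DISK_LINE = re.compile(r"Free bytes:(\d+)\s+Limit:(\d+)")


def disk_shortfall(log_line):
    """Return the number of bytes below the limit, or None if the line does not match."""
    match = _DISK_LINE.search(log_line)
    if match is None:
        return None
    free, limit = (int(group) for group in match.groups())
    return max(limit - free, 0)


line = "Disk free space insufficient. Free bytes:974610432 Limit:1000000000"
print(disk_shortfall(line))  # 25389568
```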

If a disk or RAM rabbitmq alarm goes off:
https://www.rabbitmq.com/alarms.html

All connections will be blocked:
[stack@localhost ~]$ sudo rabbitmqctl list_connections
Listing connections ...
stackrabbit 10.0.0.200 42587 blocking
stackrabbit 10.0.0.200 42588 blocking
...
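
Since the services themselves log nothing, an operator could detect this state from the broker side by scanning the listing above. The following is a rough sketch, not part of oslo.messaging: the function name `blocked_connections` is hypothetical, and it assumes the four whitespace-separated columns (user, peer host, peer port, state) seen in the output above.

```python
def blocked_connections(listing):
    """Return (user, host, port) for each connection in state 'blocking' or 'blocked'."""
    found = []
    for raw in listing.splitlines():
        fields = raw.split()
        # Skip the banner, the "..." marker, and any line that is not four columns wide.
        if len(fields) != 4 or fields[3] not in ("blocking", "blocked"):
            continue
        user, host, port, _state = fields
        found.append((user, host, int(port)))
    return found


sample = """Listing connections ...
stackrabbit 10.0.0.200 42587 blocking
stackrabbit 10.0.0.200 42588 blocking
...
"""
for user, host, port in blocked_connections(sample):
    print(f"{user}@{host}:{port} is blocked by a broker alarm")
```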

The user experience in oslo.messaging here is very poor: there is no indication whatsoever that anything is wrong. There are no timeouts on RPC calls; they just freeze indefinitely. No error appears in any log in any OpenStack service. I'm not familiar with the rabbit driver backend, but I'm hoping it is possible to get an error out of it somehow.

Assaf Muller (amuller)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (master)

Fix proposed to branch: master
Review: https://review.openstack.org/185851

Changed in oslo.messaging:
assignee: nobody → Mehdi Abaakouk (sileht)
status: New → In Progress
Mehdi Abaakouk (sileht)
Changed in oslo.messaging:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.openstack.org/185851
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=1f8ccd3ac50aa941f042b02069b7185c3d5b5ae7
Submitter: Jenkins
Branch: master

commit 1f8ccd3ac50aa941f042b02069b7185c3d5b5ae7
Author: Mehdi Abaakouk <email address hidden>
Date: Wed May 27 08:33:25 2015 +0200

    rabbit: Add logging on blocked connection

    When the broker will block the connection for a server-side issue
    like disk full, it notifies the client.

    This change adds the callback methods when this occurs to inform
    the deployer about the reason of this blocking.

    Change-Id: I5164b9e1b720f022b45a5718258df036ba8808ed
    Closes-bug: #1454449
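
The commit message above describes registering callbacks that log when the broker blocks or unblocks a connection. The sketch below illustrates that pattern only; it is not the merged patch. `StubConnection`, `simulate_blocked`, and `simulate_unblocked` are hypothetical stand-ins for a real AMQP client connection, which would invoke the hooks when the broker sends its blocked/unblocked notifications.

```python
import logging

LOG = logging.getLogger("rabbit_sketch")


class StubConnection:
    """Hypothetical stand-in for an AMQP client connection with block hooks."""

    def __init__(self, on_blocked=None, on_unblocked=None):
        self.on_blocked = on_blocked
        self.on_unblocked = on_unblocked
        self.blocked = False

    def simulate_blocked(self, reason):
        # A real client would call the hook when the broker reports a block.
        self.blocked = True
        if self.on_blocked is not None:
            self.on_blocked(reason)

    def simulate_unblocked(self):
        # A real client would call the hook when the alarm clears.
        self.blocked = False
        if self.on_unblocked is not None:
            self.on_unblocked()


def log_blocked(reason):
    """Tell the deployer why publishing has stalled."""
    LOG.warning("The broker has blocked the connection: %s", reason)


def log_unblocked():
    LOG.info("The broker has unblocked the connection")


logging.basicConfig(level=logging.INFO)
conn = StubConnection(on_blocked=log_blocked, on_unblocked=log_unblocked)
conn.simulate_blocked("low on disk")
```

Logging at warning level when blocked means the reason shows up in service logs even at default verbosity, which addresses the "zero errors" complaint in the description.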

Changed in oslo.messaging:
status: In Progress → Fix Committed
Bogdan Dobrelya (bogdando) wrote :

Will this patch let developers know whether publishing was blocked? Normally, only publishing should be blocked, while consuming should remain available (though I'm not sure whether on the same connection or only on a new one). Could you please make the logged information 100% clear about whether a) publishing is blocked on the connection, or b) consuming is blocked on the connection?

If we ever saw case b), it would be a bug in the underlying AMQP layer.

Changed in oslo.messaging:
milestone: none → 1.15.0
status: Fix Committed → Fix Released