[SRU] rabbit queues should expire when unused

Bug #1515278 reported by John Eckersberg
This bug affects 2 people
Affects                         Status        Importance  Assigned to          Milestone
Ubuntu Cloud Archive            Invalid       Undecided   Unassigned
  Liberty                       Fix Released  Undecided   Unassigned
oslo.messaging                  Fix Released  Undecided   Dmitry Mescheryakov
oslo.messaging (Ubuntu)
  Trusty                        Confirmed     Medium      Jorge Niedbalski
python-oslo.messaging (Ubuntu)  Fix Released  Medium      Unassigned
  Wily                          Fix Released  Medium      Jorge Niedbalski

Bug Description

[Description]

RabbitMQ supports queue-level TTLs as described here:

https://www.rabbitmq.com/ttl.html#queue-ttl

This should be used when declaring queues in order to clean up queues that are orphaned for various reasons.

This is an important complement to auto_delete queues. Queues marked auto_delete are only deleted if a consumer existed at some point, and then disconnected, resulting in zero consumers.
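
As a minimal sketch of what such a declaration carries: RabbitMQ reads the queue TTL from the `x-expires` queue argument, expressed in milliseconds. The helper name below is hypothetical and is not part of oslo.messaging; it only builds the arguments dictionary a client would pass when declaring the queue.

```python
def queue_expiry_arguments(ttl_seconds):
    """Build AMQP queue arguments that let RabbitMQ expire an unused queue.

    Hypothetical helper for illustration only.
    """
    if not isinstance(ttl_seconds, int) or ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be a positive integer")
    # RabbitMQ expects x-expires in milliseconds.
    return {"x-expires": ttl_seconds * 1000}


# A 10-minute TTL, expressed as queue arguments.
arguments = queue_expiry_arguments(600)
```

A client library would typically accept a dictionary like this alongside the queue name at declaration time; how it is passed depends on the library in use.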

[Impact]

Consider the following scenario: a client declares an auto_delete queue and binds it to a fanout exchange, but dies before it can start consuming from the queue.

Because there was never (and will never be) a consumer of this queue, the auto_delete logic does not fire. The queue will live forever, and will collect a copy of every message that is sent to the bound fanout exchange. Given enough published messages to the exchange, the queue will eventually consume all available memory on the broker.

This is bad and we should avoid it by setting a reasonable TTL.
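
The difference between the two deletion rules can be sketched with a deliberately simplified model. The class and function names are hypothetical, and the model ignores details of the broker (for example, other operations that count as "use" of a queue); it only illustrates why the orphaned queue in the scenario above survives auto_delete but not a TTL.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    ever_had_consumer: bool  # did any consumer ever attach?
    consumers: int           # consumers attached right now
    idle_seconds: int        # time since the queue was last used


def deleted_by_auto_delete(queue):
    # auto_delete only fires after the last consumer goes away,
    # so a queue that never had a consumer is never deleted.
    return queue.ever_had_consumer and queue.consumers == 0


def deleted_by_ttl(queue, ttl_seconds):
    # A queue TTL fires for any queue left unused for the whole period,
    # whether or not it ever had a consumer.
    return queue.consumers == 0 and queue.idle_seconds >= ttl_seconds


# The client died before consuming: no consumer ever existed.
orphan = QueueState(ever_had_consumer=False, consumers=0, idle_seconds=3600)
```

In this model `deleted_by_auto_delete(orphan)` is false while `deleted_by_ttl(orphan, 600)` is true, which is the gap the TTL is meant to close.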

[Test Case]

* Deploy a new cinder service unit
* Create a volume
* Destroy the new cinder service unit.
* List the current active queues:

  rabbitmqctl -p openstack list_queues messages consumers name

  811 0 q-agent-notifier-security_group-update_fanout_c3c4f4d4fc774322938357952ab6d252
  3352 0 cinder-volume_fanout_49a14d5511dc4c8c935904eb299a06a8
  3352 0 cinder-volume_fanout_634a3e584c064a788676fb29fbf7db23
  3352 0 cinder-volume_fanout_ada4f2368cb74d7389998d0adbd15f88
  3352 0 cinder-volume_fanout_e896e25b029c4cbd97037909c34803db

Those queues with 0 consumers will grow and remain there forever.
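
To make this check repeatable, the listing can be filtered with a small script. This is a sketch that assumes the three-column order used in the command above (messages, consumers, name); the function name is hypothetical, and the `cinder-volume` line with one consumer is an illustrative addition, not output from this report.

```python
def find_orphaned_queues(list_queues_output):
    """Return (name, messages) for every queue that has zero consumers."""
    orphans = []
    for line in list_queues_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        messages, consumers, name = parts
        # Skip banner or header lines such as "Listing queues ...".
        if not (messages.isdigit() and consumers.isdigit()):
            continue
        if int(consumers) == 0:
            orphans.append((name, int(messages)))
    return orphans


sample = """\
Listing queues ...
3352 0 cinder-volume_fanout_49a14d5511dc4c8c935904eb299a06a8
0 1 cinder-volume
"""

for name, messages in find_orphaned_queues(sample):
    print(name, messages)
```

Running the same filter before and after the TTL period gives a simple way to confirm that the zero-consumer queues have gone.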

After applying the patch, those queues expire and are deleted by the broker
once they have been unused for 10 minutes (the default TTL).

[Regression Potential]

* Not identified.

Changed in oslo.messaging:
assignee: nobody → John Eckersberg (jeckersb)
Changed in oslo.messaging:
status: New → In Progress
Changed in oslo.messaging:
assignee: John Eckersberg (jeckersb) → Dmitry Mescheryakov (dmitrymex)
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.openstack.org/243845
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=10625eed87b4c7f980bd5cd7cacbc4caa2dec197
Submitter: Jenkins
Branch: master

commit 10625eed87b4c7f980bd5cd7cacbc4caa2dec197
Author: John Eckersberg <email address hidden>
Date: Fri Nov 20 17:25:58 2015 -0500

    Kombu: make reply and fanout queues expire instead of auto-delete

    Right now fanout and reply queues are unconditionally created with
    auto-delete flag which causes a number of problems listed in bug
    1495568. Replacing auto-delete with queue expiration with some sane
    timeout should fix all these issues at once.

    Another problem being fixed is that auto-delete flag does not cause
    the queue to be deleted if it never had consumers. An orphaned fanout
    queue might appear that way and it will grow indefinitely until
    somebody manually removes it. See bug 1515278 for details.

    A new rabbit_transient_queues_ttl config parameter is introduced which
    configures the TTL for reply and fanout queues. It is a positive
    integer representing timeout in seconds. By default it is set to 10
    minutes. That should be enough for application to reconnect or
    for server to send reply to client which already died. At the same
    time, it seems that not so many messages could be accumulated in
    fanout queues during that time.

    DocImpact
    With this change RabbitMQ driver defines reply and fanout queues
    differently comparing with the previous release: now they are defined
    with queue TTL (https://www.rabbitmq.com/ttl.html#queue-ttl) instead
    of auto-delete flag. That helps avoid a number of issues, see commit
    description for details. A new rabbit_transient_queues_ttl parameter
    is defined which controls the TTL value. It is set to 10 minutes by
    default. The change does not affect upgrade in any way.

    Closes-bug: #1495568
    Closes-bug: #1515278

    Co-Authored-by: Dmitry Mescheryakov <email address hidden>
    Change-Id: I83a8d09dc0cdae24c12d7043ec810529a9ce57ab
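
As a rough illustration of the new parameter, the sketch below reads `rabbit_transient_queues_ttl` from an INI-style configuration and falls back to the 10-minute default stated in the commit message. The `[oslo_messaging_rabbit]` section name is an assumption about where the option lives, the function name is hypothetical, and the parsing uses Python's standard `configparser` rather than the library's own configuration machinery.

```python
import configparser

DEFAULT_TTL_SECONDS = 600  # 10 minutes, the default stated in the commit message


def read_transient_queues_ttl(config_text):
    """Return the TTL in seconds for reply and fanout queues."""
    # Interpolation is disabled so that '%' in other values is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    ttl = parser.getint(
        "oslo_messaging_rabbit",        # assumed section name
        "rabbit_transient_queues_ttl",
        fallback=DEFAULT_TTL_SECONDS,
    )
    # The commit message describes the option as a positive integer.
    if ttl <= 0:
        raise ValueError("rabbit_transient_queues_ttl must be positive")
    return ttl


example = """\
[oslo_messaging_rabbit]
rabbit_transient_queues_ttl = 120
"""
ttl_seconds = read_transient_queues_ttl(example)
```

A shorter TTL reclaims orphaned queues sooner, at the cost of giving a disconnected service less time to reconnect before its queue disappears.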

Changed in oslo.messaging:
status: In Progress → Fix Released
Changed in oslo.messaging (Ubuntu):
status: New → Fix Committed
Changed in oslo.messaging (Ubuntu Trusty):
status: New → In Progress
Changed in oslo.messaging (Ubuntu Wily):
status: New → In Progress
Changed in oslo.messaging (Ubuntu Trusty):
assignee: nobody → Jorge Niedbalski (niedbalski)
Changed in oslo.messaging (Ubuntu Wily):
assignee: nobody → Jorge Niedbalski (niedbalski)
tags: added: kilo-backport-potential liberty-backport-potential sts
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Trusty-Kilo Patch" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/279561

Jorge Niedbalski (niedbalski) wrote : Re: rabbit queues should expire when unused
Changed in oslo.messaging (Ubuntu Trusty):
importance: Undecided → Medium
Changed in oslo.messaging (Ubuntu Wily):
importance: Undecided → Medium
Jorge Niedbalski (niedbalski) wrote :

Dear Ubuntu Maintainers,

I attached a patch for backporting this fix in Ubuntu Trusty and Wily (Kilo, Liberty).

- https://bugs.launchpad.net/oslo.messaging/+bug/1515278/+attachment/4570118/+files/fix-lp-1515278-liberty.patch
- https://bugs.launchpad.net/oslo.messaging/+bug/1515278/+attachment/4569507/+files/fix-lp-1515278-kilo.patch

Also a proposal for backporting this into OpenStack Stable Liberty:

- https://review.openstack.org/279561

Thanks for your consideration.

Mathew Hodson (mhodson)
Changed in oslo.messaging (Ubuntu):
importance: Undecided → Medium
Iain Lane (laney) wrote :

Jorge,

For an update to an Ubuntu stable release, please provide the required information as outlined in <https://wiki.ubuntu.com/StableReleaseUpdates#Procedure>.

Thanks!

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/liberty)

Reviewed: https://review.openstack.org/279561
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=393370072e9f8122967c4864c9dc8a7bda731f3a
Submitter: Jenkins
Branch: stable/liberty

commit 393370072e9f8122967c4864c9dc8a7bda731f3a
Author: Jorge Niedbalski <email address hidden>
Date: Fri Feb 12 11:20:46 2016 -0300

    Kombu: make reply and fanout queues expire instead of auto-delete

    Backport of fix made for bug #1515278.

    Right now fanout and reply queues are unconditionally created with
    auto-delete flag which causes a number of problems listed in bug
    1495568. Replacing auto-delete with queue expiration with some sane
    timeout should fix all these issues at once.

    Original: https://review.openstack.org/#/c/243845/

    Change-Id: Ib69fb7a159eab722fe399f64caa742b77ef45bec
    Closes-bug: #1515278
    Signed-off-by: Jorge Niedbalski <email address hidden>

tags: added: in-stable-liberty
Corey Bryant (corey.bryant) wrote : Re: rabbit queues should expire when unused

Jorge,

I've uploaded this to wily and it will require SRU team review before it gets into wily-proposed. Can you update the bug description with the standard SRU sections? [Impact], [Test Case], and [Regression Potential].

Thanks,
Corey

summary: - rabbit queues should expire when unused
+ [SRU] rabbit queues should expire when unused
Louis Bouchard (louis)
tags: added: sts-sru
description: updated
Chris J Arges (arges) wrote : Please test proposed package

Hello John, or anyone else affected,

Accepted python-oslo.messaging into wily-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-oslo.messaging/2.5.0-1ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed
James Page (james-page) wrote :

Hello John, or anyone else affected,

Accepted python-oslo.messaging into liberty-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:liberty-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-liberty-needed to verification-liberty-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-liberty-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-archive:
status: New → Invalid
tags: added: verification-liberty-needed
Mathew Hodson (mhodson)
no longer affects: oslo.messaging (Ubuntu Wily)
no longer affects: oslo.messaging (Ubuntu)
no longer affects: python-oslo.messaging (Ubuntu Trusty)
Changed in python-oslo.messaging (Ubuntu):
status: New → Fix Committed
Changed in python-oslo.messaging (Ubuntu Wily):
status: New → Fix Committed
Changed in python-oslo.messaging (Ubuntu):
importance: Undecided → Medium
Changed in python-oslo.messaging (Ubuntu Wily):
importance: Undecided → Medium
Changed in python-oslo.messaging (Ubuntu Wily):
status: Fix Committed → In Progress
assignee: nobody → Jorge Niedbalski (niedbalski)
Changed in python-oslo.messaging (Ubuntu):
assignee: nobody → Jorge Niedbalski (niedbalski)
no longer affects: oslo.messaging (Ubuntu Trusty)
Ryan Beisner (1chb1n) wrote :

@niedbalski FYI, tempest results from wily-liberty-proposed and trusty-liberty staging are consistent with wily-liberty (distro) and trusty-liberty (UCA updates), i.e. no new failures. Now that it has been promoted to trusty-liberty-proposed, we will need to re-test there. There is typically a one-week bake period for proposed cloud archive pockets.

Changed in python-oslo.messaging (Ubuntu Trusty):
status: New → In Progress
assignee: nobody → Jorge Niedbalski (niedbalski)
Changed in python-oslo.messaging (Ubuntu):
assignee: Jorge Niedbalski (niedbalski) → nobody
Changed in python-oslo.messaging (Ubuntu Trusty):
importance: Undecided → Medium
Corey Bryant (corey.bryant) wrote :

Hi Chris and James,

python-oslo.messaging 2.5.0-1ubuntu2 has been verified on wily-proposed and trusty-liberty-proposed.

Thanks,
Corey

tags: added: verification-done verification-liberty-done
removed: verification-liberty-needed verification-needed
Changed in python-oslo.messaging (Ubuntu Wily):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-oslo.messaging - 2.5.0-1ubuntu2

---------------
python-oslo.messaging (2.5.0-1ubuntu2) wily; urgency=medium

  [ Jorge Niedbalski ]
  * d/p/make-reply-and-fanout-queues-expire-instead-of-auto-delete.patch:
    Make reply and fanout queues expire instead of auto-delete (LP: #1515278).

  [ Corey Bryant ]
  * d/p/dont-hold-connection-when-reply-fail.patch: Cherry-picked
    patch from upstream VCS to fix the amqp reply logic when
    connections are lost (LP: #1521958).

 -- Corey Bryant <email address hidden> Mon, 21 Mar 2016 08:08:47 -0400

Changed in python-oslo.messaging (Ubuntu Wily):
status: Fix Committed → Fix Released
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

James Page (james-page) wrote :

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

James Page (james-page) wrote :

This bug was fixed in the package python-oslo.messaging - 2.5.0-1ubuntu2~cloud0
---------------

 python-oslo.messaging (2.5.0-1ubuntu2~cloud0) trusty-liberty; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 python-oslo.messaging (2.5.0-1ubuntu2) wily; urgency=medium
 .
   [ Jorge Niedbalski ]
   * d/p/make-reply-and-fanout-queues-expire-instead-of-auto-delete.patch:
     Make reply and fanout queues expire instead of auto-delete (LP: #1515278).
 .
   [ Corey Bryant ]
   * d/p/dont-hold-connection-when-reply-fail.patch: Cherry-picked
     patch from upstream VCS to fix the amqp reply logic when
     connections are lost (LP: #1521958).

Mathew Hodson (mhodson) wrote :

The package is called oslo.messaging in the Ubuntu Trusty archive, so I added that task.

https://launchpad.net/ubuntu/trusty/+source/oslo.messaging

no longer affects: oslo.messaging (Ubuntu)
Changed in oslo.messaging (Ubuntu Trusty):
importance: Undecided → Medium
Changed in python-oslo.messaging (Ubuntu):
status: Fix Committed → Fix Released
Mathew Hodson (mhodson)
no longer affects: python-oslo.messaging (Ubuntu Trusty)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in oslo.messaging (Ubuntu Trusty):
status: New → Confirmed
Changed in oslo.messaging (Ubuntu Trusty):
assignee: nobody → Jorge Niedbalski (niedbalski)
tags: added: sts-sru-done
removed: sts-sru