RabbitMQ fails to synchronize exchanges under high load (Note for Ubuntu: the Stein, Rocky, and Queens (Bionic) changes only fix compatibility with fully patched releases)

Bug #1789177 reported by Oleg Bondarev
This bug affects 3 people
Affects                         Importance  Assigned to
Ubuntu Cloud Archive            Undecided   Unassigned
 - Mitaka                       Medium      Seyeong Kim
 - Queens                       Medium      Seyeong Kim
 - Rocky                        Medium      Chris MacNaughton
 - Stein                        Medium      Unassigned
 - Train                        Undecided   Unassigned
oslo.messaging                  Undecided   Oleg Bondarev
python-oslo.messaging (Ubuntu)  Medium      Unassigned
 - Xenial                       Medium      Seyeong Kim
 - Bionic                       Medium      Seyeong Kim

Bug Description

[Impact]

If there are many exchanges and queues, then after a failover rabbitmq-server logs errors saying that exchanges cannot be found.

Affected: Bionic (Queens)
Not affected: Focal

[Test Case]

1. Deploy a simple rabbitmq cluster
- https://pastebin.ubuntu.com/p/MR76VbMwY5/
2. juju ssh neutron-gateway/0
- for i in {1..1000}; do systemctl restart neutron-metering-agent; sleep 2; done
3. Reproduction is more reliable with more exchanges, queues, and bindings:
- rabbitmq-plugins enable rabbitmq_management
- rabbitmqctl add_user test password
- rabbitmqctl set_user_tags test administrator
- rabbitmqctl set_permissions -p openstack test ".*" ".*" ".*"
- https://pastebin.ubuntu.com/p/brw7rSXD7q/ (save this as create.sh) [1]
- for i in {1..2000}; do ./create.sh test_$i; done

4. Restart the rabbitmq-server service, or shut the machine down and power it back on, several times.
5. The "exchange not found" error appears in the logs.

[1] create.sh (pasting here because pastebins don't last forever)
#!/bin/bash

rabbitmqadmin declare exchange -V openstack name=$1 type=direct -u test -p password
rabbitmqadmin declare queue -V openstack name=$1 durable=false -u test -p password 'arguments={"x-expires":1800000}'
rabbitmqadmin -V openstack declare binding source=$1 destination_type="queue" destination=$1 routing_key="" -u test -p password

[Where problems could occur]
1. Every service that uses oslo.messaging needs to be restarted.
2. Message delivery could be affected.

[Others]

Possible Workaround

1. For the "exchange not found" issue:
- create the exchange, queue, and binding for the problematic name reported in the log
- then restart rabbitmq-server nodes one by one

2. For a queue that crashed and failed to restart:
- delete the specific queue named in the log
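The first workaround can be scripted with rabbitmqadmin, mirroring create.sh from the [Test Case] section. This is only a sketch under the test-case assumptions (vhost "openstack", user "test"/"password"); the name argument comes from the not_found line in the rabbitmq log:

```shell
#!/bin/bash
# Sketch: recreate the exchange, queue, and binding for a name reported
# as not_found in the rabbitmq log. Assumes the vhost and credentials
# used in the [Test Case] section above; adjust for your deployment.
set -e

NAME="$1"          # the problematic name taken from the log
VHOST="openstack"
AUTH="-u test -p password"

rabbitmqadmin -V "$VHOST" $AUTH declare exchange name="$NAME" type=direct
rabbitmqadmin -V "$VHOST" $AUTH declare queue name="$NAME" durable=false \
    'arguments={"x-expires":1800000}'
# Binding mirrors create.sh below (empty routing key).
rabbitmqadmin -V "$VHOST" $AUTH declare binding source="$NAME" \
    destination_type=queue destination="$NAME" routing_key=""
```

After recreating the objects, restart the rabbitmq-server nodes one by one as described above.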

// original description

Input:
 - OpenStack Pike cluster with ~500 nodes
 - DVR enabled in neutron
 - Lots of messages

Scenario: failover of one rabbit node in a cluster

Issue: after failed rabbit node gets back online some rpc communications appear broken
Logs from rabbit:

=ERROR REPORT==== 10-Aug-2018::17:24:37 ===
Channel error on connection <0.14839.1> (10.200.0.24:55834 -> 10.200.0.31:5672, vhost: '/openstack', user: 'openstack'), channel 1:
operation basic.publish caused a channel exception not_found: no exchange 'reply_5675d7991b4a4fb7af5d239f4decb19f' in vhost '/openstack'

Investigation:
After the failed rabbit node comes back online it immediately receives many new connections and, for some reason, fails to synchronize exchanges. The cluster had ~1600 exchanges, but the count on the recovered node stays low and does not increase.

Workaround: let the recovered node synchronize all exchanges - forbid new connections with iptables rules for some time after failed node gets online (30 sec)
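The workaround above can be sketched with iptables; this assumes the default AMQP port 5672 and a 30-second catch-up window, both of which may differ in your deployment:

```shell
# On the recovering rabbit node: temporarily reject new AMQP connection
# attempts so the node can synchronize exchanges before clients reconnect.
# (Sketch; assumes AMQP on port 5672 and that 30 seconds is enough.)
iptables -I INPUT -p tcp --dport 5672 --syn -j REJECT
sleep 30    # let the node catch up with the rest of the cluster
iptables -D INPUT -p tcp --dport 5672 --syn -j REJECT
```

Only new connections (TCP SYN) are rejected; established connections to other nodes are untouched.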

Proposal: do not create new exchanges (use default) for all direct messages - this also fixes the issue.

Is there a good reason for creating new exchanges for direct messages?
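For context on the proposal: the AMQP default (nameless) exchange implicitly routes to every queue by its queue name, so a direct message can be published without declaring any per-reply exchange. A sketch with rabbitmqadmin, reusing the test user and vhost from the [Test Case] section ("reply_example" is an illustrative queue name):

```shell
# Publish straight to a queue through the default exchange: no exchange
# or binding declaration is needed, the routing key is the queue name.
# (Sketch; assumes the test user/vhost from the [Test Case] section.)
rabbitmqadmin -V openstack -u test -p password declare queue \
    name=reply_example durable=false
rabbitmqadmin -V openstack -u test -p password publish \
    exchange=amq.default routing_key=reply_example payload='ping'
```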

Changed in oslo.messaging:
assignee: nobody → Oleg Bondarev (obondarev)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (master)

Fix proposed to branch: master
Review: https://review.openstack.org/596661

Changed in oslo.messaging:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.openstack.org/596661
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=3a5de89dd686dbd9660f140fdddd9c78b20e1632
Submitter: Zuul
Branch: master

commit 3a5de89dd686dbd9660f140fdddd9c78b20e1632
Author: Oleg Bondarev <email address hidden>
Date: Mon Aug 27 12:18:58 2018 +0400

    Use default exchange for direct messaging

    Lots of exchanges create problems during failover under high
    load. Please see bug report for details.

    This is step 1 in the process: only using default exchange
    when publishing. Consumers will still consume on separate
    exchanges (and on default exchange by default) so this
    should be (and tested to be) a non-breaking and
    upgrade-friendly change.

    Step 2 is to update consumers to only listen on default exchange,
    to happen in T release.

    Change-Id: Id3603f4b7e1274b616d76e1c0c009d2ab7f6efb6
    Closes-Bug: #1789177

Changed in oslo.messaging:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 9.2.0

This issue was fixed in the openstack/oslo.messaging 9.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (master)

Fix proposed to branch: master
Review: https://review.opendev.org/669158

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.opendev.org/669158
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=6fe1aec1c74f112db297cd727d2ea400a292b038
Submitter: Zuul
Branch: master

commit 6fe1aec1c74f112db297cd727d2ea400a292b038
Author: Oleg Bondarev <email address hidden>
Date: Thu Jul 4 16:08:45 2019 +0400

    Use default exchange for direct messaging

    Lots of exchanges create problems during failover under high
    load. Please see bug report for details.

    This is a step 2 patch.

    Step 1 was: only using default exchange
    when publishing.
    Step 2 is to update consumers to only listen on default exchange,
    happening now in T release.

    Change-Id: Ib2ba62a642e6ce45c23568daeef9703a647707f3
    Closes-Bug: #1789177

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 10.0.0

This issue was fixed in the openstack/oslo.messaging 10.0.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/713153

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/rocky)

Reviewed: https://review.opendev.org/713153
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=b67a457e4fa71e9220c149087dce013c7f81144f
Submitter: Zuul
Branch: stable/rocky

commit b67a457e4fa71e9220c149087dce013c7f81144f
Author: Oleg Bondarev <email address hidden>
Date: Mon Aug 27 12:18:58 2018 +0400

    Use default exchange for direct messaging

    Lots of exchanges create problems during failover under high
    load. Please see bug report for details.

    This is step 1 in the process: only using default exchange
    when publishing. Consumers will still consume on separate
    exchanges (and on default exchange by default) so this
    should be (and tested to be) a non-breaking and
    upgrade-friendly change.

    Step 2 is to update consumers to only listen on default exchange,
    to happen in T release.

    Change-Id: Id3603f4b7e1274b616d76e1c0c009d2ab7f6efb6
    Closes-Bug: #1789177
    (cherry picked from commit 3a5de89dd686dbd9660f140fdddd9c78b20e1632)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/728396

Revision history for this message
norman shen (jshen28) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

Hello, I am also hit by this error in the logs. But I see no exchange inconsistency between the rabbit nodes, and I cannot find the reply_* exchange on any rabbit node.

If I read the code correctly, every time this piece of code is called (https://github.com/openstack/oslo.messaging/blob/e44c9883066d9b2d081a594b97aac3d598d491c9/oslo_messaging/_drivers/amqpdriver.py#L323), it ensures the queue/exchange/bindings are redeclared if they are gone, yet I still cannot find the reply_* queue.

I am now wondering whether the polling thread stops working for some reason, so that the exchange is never redeclared even after rabbit comes back online.

Revision history for this message
norman shen (jshen28) wrote :

the collected metrics are attached

Revision history for this message
norman shen (jshen28) wrote :

Besides, I wonder whether this patchset can really solve the problem, because I notice that neither the exchange nor the queue can be found in rabbit.

Revision history for this message
zgjun (zgjun) wrote :

I hit the same issue. I have merged this patchset; in the same scenario, when the queue can be found in rabbit, the consumer really receives the call message and replies to it, but the publisher waits for the call reply until it times out, without any log output.

So I think we should make sure the reply queue actually exists (redeclaring it if necessary) before sending, or after the wait times out.

Revision history for this message
norman shen (jshen28) wrote :

Yes, this is also what I think will happen. I don't believe the current patchset addresses the root cause.

I did a little debugging and found that when the problem happens, the connection and channels seem fine, but the consumers are gone, and on rabbitmq both the exchanges and the queues are gone. Since the channel does not change, the queues and exchanges will never be redeclared, which is the problem.

I am using rabbitmq 3.7.4 with min-masters enabled, and I can trigger the problem quite reliably by using tc qdisc to add some delay (100ms, 10ms jitter) on the rabbitmq cluster network interface.
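The delay injection described above can be expressed with netem; a sketch assuming the cluster traffic goes over eth0 (substitute the real interface):

```shell
# Add 100ms of latency with 10ms jitter on the rabbitmq cluster
# interface to provoke the failure mode, then remove it afterwards.
# (Sketch; assumes the cluster interface is eth0.)
tc qdisc add dev eth0 root netem delay 100ms 10ms
# ... reproduce the problem, then restore normal latency:
tc qdisc del dev eth0 root netem
```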

Revision history for this message
norman shen (jshen28) wrote :

Here are some debug logs. Note that I modified the code slightly to print the connection name and socket info for the consume loop used by the reply thread.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/739175

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo.messaging (master)

Reviewed: https://review.opendev.org/739175
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=196fa877a90d7eb0f82ec9e1c194eef3f98fc0b1
Submitter: Zuul
Branch: master

commit 196fa877a90d7eb0f82ec9e1c194eef3f98fc0b1
Author: shenjiatong <email address hidden>
Date: Fri Jul 3 15:51:21 2020 +0800

    Cancel consumer if queue down

    Previously, we have switched to use default exchanges
    to avoid excessive amounts of exchange not found messages.
    But it does not actually solve the problem because
    reply_* queue is already gone and agent will not receive callbacks.

    after some debugging, I found under some circumstances
    seems rabbitmq consumer does not receive basic cancel
    signal when queue is already gone. This might due to
    rabbitmq try to restart consumer when queue is down
    (for example when split brain). In such cases,
    it might be better to fail early.

    by reading the code, seems like x-cancel-on-ha-failover
    is not dedicated to mirror queues only, https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_channel.erl#L1894,
    https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_channel.erl#L1926.

    By failing early, in my own test setup,
    I could solve a certain case of exchange not found problem.

    Change-Id: I2ae53340783e4044dab58035bc0992dc08145b53
    Related-bug: #1789177

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/ussuri)

Related fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/747366

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/747892

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo.messaging (stable/ussuri)

Reviewed: https://review.opendev.org/747366
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=0a432c7fb107d04f7a41199fe9a8c4fbd344d009
Submitter: Zuul
Branch: stable/ussuri

commit 0a432c7fb107d04f7a41199fe9a8c4fbd344d009
Author: shenjiatong <email address hidden>
Date: Fri Jul 3 15:51:21 2020 +0800

    Cancel consumer if queue down

    Previously, we have switched to use default exchanges
    to avoid excessive amounts of exchange not found messages.
    But it does not actually solve the problem because
    reply_* queue is already gone and agent will not receive callbacks.

    after some debugging, I found under some circumstances
    seems rabbitmq consumer does not receive basic cancel
    signal when queue is already gone. This might due to
    rabbitmq try to restart consumer when queue is down
    (for example when split brain). In such cases,
    it might be better to fail early.

    by reading the code, seems like x-cancel-on-ha-failover
    is not dedicated to mirror queues only, https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_channel.erl#L1894,
    https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_channel.erl#L1926.

    By failing early, in my own test setup,
    I could solve a certain case of exchange not found problem.

    Change-Id: I2ae53340783e4044dab58035bc0992dc08145b53
    Related-bug: #1789177
    (cherry picked from commit 196fa877a90d7eb0f82ec9e1c194eef3f98fc0b1)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/749193

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/749194

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo.messaging (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/749196

Seyeong Kim (seyeongkim)
Changed in python-oslo.messaging (Ubuntu):
assignee: nobody → Seyeong Kim (seyeongkim)
Seyeong Kim (seyeongkim)
tags: added: sts
description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp1789177_bionic.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Seyeong Kim (seyeongkim)
description: updated
Revision history for this message
Mathew Hodson (mhodson) wrote :

This was fixed in 12.3.0
---

python-oslo.messaging (12.3.0-0ubuntu1) groovy; urgency=medium

  [ Chris MacNaughton ]
  * New upstream release for OpenStack Victoria.
  * d/control: Align (Build-)Depends with upstream.
  * d/p/no-functional-test.patch: Refreshed.

  [ Corey Bryant ]
  * d/control: Restore min versions of python3-eventlet and python3-tenacity.

 -- Corey Bryant <email address hidden> Thu, 03 Sep 2020 08:59:27 -0400

Changed in python-oslo.messaging (Ubuntu):
status: New → Fix Released
Mathew Hodson (mhodson)
Changed in python-oslo.messaging (Ubuntu):
importance: Undecided → Medium
Changed in python-oslo.messaging (Ubuntu Bionic):
importance: Undecided → Medium
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

I can't reproduce this symptom on Focal even though it ships 12.1.0, which doesn't have commit 0a432c7fb107d04f7a41199fe9a8c4fbd344d009.

I think Xenial needs the fix as well; I can reproduce the issue there, and I'm preparing a debdiff for Xenial too.

Mathew Hodson (mhodson)
Changed in python-oslo.messaging (Ubuntu Xenial):
importance: Undecided → Medium
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

For Stein and Train, commit 3a5de89dd686dbd9660f140fdddd9c78b20e1632 is already present, but 6fe1aec1c74f112db297cd727d2ea400a292b038 is not.

I think we need to fix both releases as well; one of the fixes alone cannot solve this issue.

Also, Train's functional test has already been removed, but Stein's hasn't.

Seyeong Kim (seyeongkim)
Changed in python-oslo.messaging (Ubuntu Xenial):
status: New → In Progress
assignee: nobody → Seyeong Kim (seyeongkim)
Changed in python-oslo.messaging (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Seyeong Kim (seyeongkim)
Changed in python-oslo.messaging (Ubuntu):
assignee: Seyeong Kim (seyeongkim) → nobody
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Please test proposed package

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into train-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:train-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-train-needed to verification-train-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-train-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-train-needed
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into stein-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:stein-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-stein-needed to verification-stein-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-stein-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-stein-needed
Revision history for this message
Seyeong Kim (seyeongkim) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

Verification for Stein is done.

ii python3-oslo.messaging 9.5.0-0ubuntu1~cloud1

verification steps
1. reproduce this issue
2. update all python3-oslo.messaging in test env
3. restart rabbitmq-server

All the channel issues are gone.

tags: added: verification-stein-done
removed: verification-stein-needed
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Verification for Train is done.

ii python3-oslo.messaging 9.7.1-0ubuntu3~cloud1 all oslo messaging library - Python 3.x

verification steps
1. reproduce this issue
2. update all python3-oslo.messaging in test env
3. restart rabbitmq-server

All the channel issues are gone.

tags: added: verification-train-done
removed: verification-train-needed
Revision history for this message
Robie Basak (racb) wrote : Please test proposed package

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-oslo.messaging/5.35.0-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in python-oslo.messaging (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Revision history for this message
Robie Basak (racb) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

I don't see anything in the queue for Xenial but I do see a debdiff above, so I'm leaving ~ubuntu-sponsors subscribed (sorry I'm not supposed to both sponsor and SRU review).

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Thanks Robbie for the acceptance on Bionic. Per some offline discussion, we're planning to hold off on the xenial backport as it will result in additional backports to the cloud archive as well for a bug that we haven't had users hit on Xenial.

To clarify the last statement, the bug has been intentionally reproduced on Xenial, but we haven't heard a report about anybody experiencing it in the wild.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

s/Robbie/Robie

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Please test proposed package

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into queens-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:queens-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-queens-needed to verification-queens-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-queens-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-queens-needed
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (python-oslo.messaging/5.35.0-0ubuntu2)

All autopkgtests for the newly accepted python-oslo.messaging (5.35.0-0ubuntu2) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

python-oslo.versionedobjects/1.31.2-0ubuntu3 (i386, ppc64el, armhf, arm64, amd64, s390x)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#python-oslo.messaging

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Seyeong Kim (seyeongkim) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

Verification done for Bionic

ii python-oslo.messaging 5.35.0-0ubuntu2 all oslo messaging library - Python 2.x

verification steps
1. reproduce this issue
2. update all python3-oslo.messaging in test env
3. restart rabbitmq-server

All the channel issues are gone.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Verification done for Queens

ii python-oslo.messaging 5.35.0-0ubuntu2~cloud0 all oslo messaging library - Python 2.x

verification steps ( the same as above )
1. reproduce this issue
2. update all python3-oslo.messaging in test env
3. restart rabbitmq-server

All the channel issues are gone.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Actually, for Bionic and Queens the correct package name is python-oslo.messaging, not python3-oslo.messaging.

tags: added: verification-queens-done
removed: verification-queens-needed
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

The autopkgtest failures [1] on this seem unrelated: they fail to import oslo.versionedobjects because no "oslo" module is found. The same failure occurs on python-oslo.versionedobjects without the new python-oslo.messaging from proposed, and a later (Groovy) version of python-oslo.versionedobjects updated debian/tests/python-import to handle the import differently. Specifically, the Bionic version is:

set -e

for py in $(py3versions -r 2>/dev/null) ; do
    cd "$AUTOPKGTEST_TMP"
    echo "Testing with $py:"
    $py -c "import oslo.versionedobjects; print(oslo.versionedobjects)"
done

and the Groovy version is:

#!/bin/sh

set -e

MODULE_NAME=$(python3 setup.py --name | sed 's/\./_/g')

for py in $(py3versions -r 2>/dev/null) ; do
    cd "$AUTOPKGTEST_TMP"
    echo "Testing with $py:"
    $py -c "import $MODULE_NAME; print($MODULE_NAME)"
done

The big difference to call out is that the Groovy version replaces the dot ('.') after oslo with an underscore for the module import.

[1]: https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#python-oslo.messaging

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Update Released

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

This bug was fixed in the package python-oslo.messaging - 9.7.1-0ubuntu3~cloud1
---------------

 python-oslo.messaging (9.7.1-0ubuntu3~cloud1) bionic-train; urgency=medium
 .
   [ Corey Bryant ]
   * d/gbp.conf: Create stable/train branch.
 .
   [ Chris MacNaughton ]
   * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.
 .
   [Seyeong Kim]
   * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
     - d/p/0001-Use-default-exchange-for-direct-messaging.patch
     - d/p/0002-Cancel-consumer-if-queue-down.patch

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Update Released

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

This bug was fixed in the package python-oslo.messaging - 9.5.0-0ubuntu1~cloud1
---------------

 python-oslo.messaging (9.5.0-0ubuntu1~cloud1) bionic-train; urgency=medium
 .
   [ Corey Bryant ]
   * d/gbp.conf: Create stable/stein branch.
 .
   [ Chris MacNaughton ]
   * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.
 .
    [Seyeong Kim]
    * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
      - d/p/0001-Use-default-exchange-for-direct-messaging.patch
      - d/p/0002-Cancel-consumer-if-queue-down.patch

Mathew Hodson (mhodson)
tags: removed: verification-needed
Revision history for this message
Dan Streetman (ddstreet) wrote :

autopkgtest failure analysis:

the only package failing is python-oslo.versionedobjects, and its autopkgtest failures are due to bug 1912792 and can be ignored.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-oslo.messaging - 5.35.0-0ubuntu2

---------------
python-oslo.messaging (5.35.0-0ubuntu2) bionic; urgency=medium

  [ Corey Bryant ]
  * d/gbp.conf: Create stable/queens branch.

  [ Chris MacNaughton ]
  * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.

  [Seyeong Kim]
  * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
    - d/p/0001-Use-default-exchange-for-direct-messaging.patch
    - d/p/0002-Cancel-consumer-if-queue-down.patch

 -- Chris MacNaughton <email address hidden> Tue, 05 Jan 2021 10:46:08 +0000

Changed in python-oslo.messaging (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Paul Goins (vultaire) wrote :

I think UCA/Queens should be changed from "fix committed" to "fix released"; we applied this to a customer cloud today, but at first we mistakenly thought it hadn't been released yet.

(By the way, it resolved the customer's issue, so there's another "works on Bionic/Queens" datapoint.)

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Upgrading to python-oslo.messaging 5.35.0-0ubuntu2~cloud0 from xenial-queens/proposed has just broken my neutron-openvswitch-agent. After the upgrade I see a load of MessagingTimeout and DuplicateMessageError errors in the logs. Downgrading back to 5.35.0-0ubuntu1~cloud0 fixed the problem.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

e.g. 2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to process message ... skipping it.: DuplicateMessageError: Found duplicate message(fc9335298407444ab0e7000d3fe2f4b7). Skipping it.
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py", line 368, in _callback
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit self.callback(RabbitMessage(message))
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 244, in __call__
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit unique_id = self.msg_id_cache.check_duplicate_message(message)
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqp.py", line 121, in check_duplicate_message
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit raise rpc_common.DuplicateMessageError(msg_id=msg_id)
2021-02-02 12:07:53.930 27349 ERROR oslo.messaging._drivers.impl_rabbit DuplicateMessageError: Found duplicate message(fc9335298407444ab0e7000d3fe2f4b7). Skipping it.

and

2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-e53cf710-52f8-4790-bb7a-9968807f842f - - - - -] Error while processing VIF ports: MessagingTimeout: Timed out waiting for a reply to message ID 06bc2386bc6b42f2ad48ebc615
7b3ec6
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2163, in rpc_loop
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent port_info, provisioning_needed)
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 158, in wrapper
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent result = f(*args, **kwargs)
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1740, in process_network_ports
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent failed_devices['added'] |= self._bind_devices(need_binding_devices)
2021-02-02 12:05:54.869 27349 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 892, in _bind_devices
2021-02-02 12:05:5...


tags: added: verification-queens-failed
removed: verification-queens-done
description: updated
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Ed, I'm able to recreate the same issue, but need to narrow it down more as we have new neutron and openvswitch packages in proposed as well. Steps so far:

1) deploy xenial with cloud-archive:queens enabled
2) create instance successfully
3) enable cloud-archive:queens-proposed on all neutron-openvswitch units
4) upgrade python-oslo.messaging on all n-ovs units
5) create instance successfully
6) upgrade openvswitch-switch on all n-ovs units
7) create instance successfully
8) upgrade neutron-common python-neutron neutron-openvswitch-agent on all n-ovs units
9) create instance fails

I'll continue to narrow down on it.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

To recreate, just upgrade to the new python-oslo.messaging on all n-ovs units, restart neutron-openvswitch-agent, and instance creation will fail. The problem is that the other neutron services on neutron-api and neutron-gateway are using the old oslo.messaging code. Once neutron-api and neutron-gateway are upgraded and services restarted, instances can be created again.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This really isn't a nice situation to introduce into a stable release. I think we should have handled this the way upstream did: only introduce default-exchange publishing in the N-1 release, and then introduce default-exchange consumers in the N release. That way, if upgraded in order, the entire publishing side would already be publishing to the default exchange before the OpenStack upgrade, allowing a seamless upgrade of the consumer side to default-only.
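The upgrade-ordering problem can be sketched with a toy model of AMQP routing (pure Python with hypothetical names; this is not oslo.messaging code). The default exchange routes a message to the queue whose name matches the routing key, while publishing to a named exchange that nobody has declared raises a not_found channel error — which is what happens when an unpatched publisher targets the per-reply exchange that a patched consumer no longer declares:

```python
class NotFound(Exception):
    """Stands in for RabbitMQ's not_found channel error."""

class ToyBroker:
    def __init__(self):
        self.queues = {}       # queue name -> list of delivered messages
        self.exchanges = {""}  # the default exchange "" always exists
        self.bindings = {}     # (exchange, routing_key) -> queue name

    def declare_queue(self, name):
        self.queues.setdefault(name, [])

    def declare_exchange(self, name):
        self.exchanges.add(name)

    def bind(self, exchange, routing_key, queue):
        self.bindings[(exchange, routing_key)] = queue

    def publish(self, exchange, routing_key, msg):
        if exchange not in self.exchanges:
            raise NotFound("no exchange %r" % exchange)
        if exchange == "":
            # Default exchange: route directly to the queue named by the key.
            self.queues[routing_key].append(msg)
        else:
            self.queues[self.bindings[(exchange, routing_key)]].append(msg)

broker = ToyBroker()

# Patched consumer (default-exchange only): declares just the reply queue,
# no per-reply exchange.
broker.declare_queue("reply_abc")

# Patched publisher -> patched consumer: works via the default exchange.
broker.publish("", "reply_abc", "pong")
assert broker.queues["reply_abc"] == ["pong"]

# Unpatched publisher still targets the per-reply exchange, which the
# patched consumer never declared -> the not_found error seen in the logs.
try:
    broker.publish("reply_abc", "reply_abc", "pong")
    raise AssertionError("expected NotFound")
except NotFound:
    pass
```

If instead the publishers are switched to the default exchange one release before the consumers stop declaring the reply exchange, no mixed-version pair ever publishes to a missing exchange.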

Revision history for this message
Edward Hope-Morley (hopem) wrote :

@corey.bryant I agree, we shouldn't introduce that kind of behaviour in a stable release.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

The same issue exists in bionic-updates. Since it's already in updates you have to downgrade some neutron units (neutron-api and neutron-gateway in my case) to python-oslo.messaging=5.35.0-0ubuntu1 and restart neutron services in order to have the mismatched versions to recreate.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

I confirmed that upgrading oslo.messaging on n-ovs causes the rabbitmq issue.

Right after restarting the n-ovs-agent, I can see a lot of errors in the rabbitmq log [1],
the same as the errors from the rabbitmq failover issue (the original issue of this LP).

Then after I upgraded oslo.messaging on the neutron-api unit and restarted neutron-server, the errors below were gone and I was able to create instances again.

After upgrading oslo.messaging on n-ovs only, the exchanges the two sides communicate over didn't match,
since which exchange is used depends on the publisher-consumer relation.

So I think there are two ways:
1. revert this patch for Q (the original failover problem will remain)
2. upgrade everything within a maintenance window

Thanks a lot

[1]
################################################################################
=ERROR REPORT==== 3-Feb-2021::03:25:26 ===
Channel error on connection <0.2379.1> (10.0.0.32:60430 -> 10.0.0.34:5672, vhost: 'openstack', user: 'neutron'), channel 1:
{amqp_error,not_found,
            "no exchange 'reply_7da3cecc31b34bdeb96c866dc84e3044' in vhost 'openstack'",
            'basic.publish'}

10.0.0.32 is neutron-api unit

Revision history for this message
Edward Hope-Morley (hopem) wrote :

@seyeongkim the problem here is that we can't make a change in a stable release that requires a maintenance window to upgrade, since there will be environments that are not aware of this (e.g. using unattended-upgrades) and will break when they upgrade. I think the safest action we can take here is to cancel the xenial-proposed SRU and revert the bionic-updates patch.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This fix is being reverted fully in queens, rocky, and stein [1]. In order to fast-track the revert it was decided with the SRU team to be best to revert all patches. We can then consider adding patch 1 back as a new SRU following the revert and give that more time for testing.

[1] https://bugs.launchpad.net/oslo.messaging/+bug/1914437

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

I tested the scenario below (the same one I tested before):

0. deploy test env
- 5.35.0-0ubuntu1~cloud0
1. upgrade oslo.messaging on n-ovs
- 5.35.0-0ubuntu2~cloud0 (from the queens-staging launchpad)
2. I got errors
3. upgrade it to the new one
- 5.35.0-0ubuntu3~cloud0

It worked fine for me.

I'm trying to reproduce the original issue, as I want to test the 3rd commit only (reproduction takes time...).

I remember that the 1st commit alone didn't solve the original issue in my test.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

I've updated the triage status back to New for releases that were reverted.

Changed in python-oslo.messaging (Ubuntu Bionic):
status: Fix Released → New
Changed in cloud-archive:
status: New → Invalid
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Seyeong, if the first patch helps fix this bug, we can still add it for queens->stein. We were considering keeping that patch but had to revert all of the patches in order to expedite the release with minimum regression potential and testing needed.

Changed in python-oslo.messaging (Ubuntu Bionic):
status: New → Triaged
Revision history for this message
Corey Bryant (corey.bryant) wrote :

I've uploaded new package versions for this bug to the bionic unapproved queue and the rocky-staging ppa. These package versions only include the single patch to switch to the default exchange when publishing.

We have a scenario where queens and train need to communicate but can't because train only consumes from the default exchange. These package updates will fix that scenario. This should also be useful in reducing the impact of the current bug reported here, where RabbitMQ fails to synchronize exchanges under high load.

Note: The single patch to switch to the default exchange when publishing is already in stein so I'm going to mark stein as Fix Released.

Revision history for this message
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into rocky-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:rocky-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-rocky-needed to verification-rocky-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-rocky-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-rocky-needed
Revision history for this message
Seyeong Kim (seyeongkim) wrote : Re: RabbitMQ fails to synchronize exchanges under high load

I've confirmed that the 1st patch alone is OK, with the steps below:

1. deploy queens
2. patch the neutron nodes' oslo.messaging (1st patch only), but not the nova-compute node's oslo.messaging
3. try to create and delete an instance

And I kept restarting cinder-scheduler while I blocked one rabbitmq-server with iptables -A INPUT -p tcp --dport 5672 -j DROP

I was eventually able to see the 'no exchange' error for cinder.

I'm going to prepare a debdiff with the 1st and 3rd commits for this patch today.

Thanks.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Ah sorry Corey, you already uploaded it to bionic as well. Thanks.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

1. deploy rocky
2. install the updated oslo.messaging pkg on the nodes below:
- neutron-api
- neutron-gateway
- nova-compute
  - restarted openvswitch-agent only
3. try to reproduce with the config below:
- created 3000 test queues, exchanges, and bindings
- juju config rabbitmq-server min-cluster-size=1
- juju config rabbitmq-server connection-backlog=200 (to make all rabbitmq-servers restart)
- shut down a node (one of the rabbitmq-servers) with the maas controller
- power it on with the maas controller

I'm able to see the 'channel not found' error for nova, and for the neutron-openvswitch-agent on the nova-compute node.
The neutron-openvswitch-agent on the nova-compute node has the fix, but rabbitmq-server still shows me the 'channel not found' error.

However, I can't launch or delete instances in this environment.

I'm not sure what to say about this result.
Also, the reproduction itself is quite hard. It took a lot of time to look for consistent behavior, and I'm not sure there is any.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

After restarting all rabbitmq-servers, the status is stable.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

On the 2nd try, I also faced the same error with patched components, not only the openvswitch-agent.

I'm going to try to reproduce with the 1st and 3rd commits plus manual configuration (enable_cancel_on_failover).

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Testing the 1st and 3rd commits plus the manual configuration enable_cancel_on_failover = True:

I did similar steps to the above with Queens (as I had already made a PPA for this),

and in this case I see different errors. Restarting rabbitmq-server cleared the error messages.

=ERROR REPORT==== 24-Feb-2021::08:07:46 ===
Channel error on connection <0.23680.14> (10.0.0.36:50874 -> 10.0.0.22:5672, vhost: 'openstack', user: 'neutron'), channel 1:
{amqp_error,not_found,
            "queue 'q-l3-plugin_fanout_81f1be30ba514e1189e4c08e1d99a7d0' in vhost 'openstack' has crashed and failed to restart",
            'queue.declare'}

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Testing with only the 1st patch didn't work; I was able to see the same error as in the description of this LP.
Testing with the 1st and 3rd plus the manual configuration (enable_cancel_on_failover = True) showed me a different error
(mentioned above).

The different error happens less often than I assumed.

So I think this can be the next action, though it is not perfect:
1. patch the 1st and 3rd commits,
2. and patch the charms (to set enable_cancel_on_failover),
3. then handle the different error in a different LP bug (if there is one).
(The queens and bionic patches above include commits #1 and #3.)

Please give some advice if you have any ideas.

Thanks.
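For reference, the manual setting mentioned in step 2 is a standard oslo.messaging option, set in each service's configuration file (the comment in this sketch is my paraphrase of its purpose, not the official option description):

```ini
[oslo_messaging_rabbit]
# Ask RabbitMQ to cancel consumers when a mirrored queue fails over,
# so oslo.messaging can notice and redeclare the queue instead of
# consuming from a dead channel.
enable_cancel_on_failover = true
```

The charm change proposed above would render this into neutron.conf, nova.conf, etc., so operators would not need to set it by hand.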

Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hi Seyeong,

Thanks for testing. I was under the impression that the 3rd patch was dependent on the 2nd patch since they both deal with the consumer side.

What do you think about moving forward with just patch 1? Unfortunately it doesn't reduce the impact of the original bug reported here (based on your testing), but it does fix the incompatibility between queens and train, which otherwise can't communicate because train only consumes from the default exchange.

Thanks,
Corey

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Hello Corey

That makes sense to me as well.

Thanks

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Ok, reviewing the change. But question to be answered before this is accepted - seeing comments #80 and #81, does this single patch actually fix the failure to synchronize exchanges under high load? If not, then we need to adjust the description. What will the current patch address in its current state?

Revision history for this message
Liam Young (gnuoy) wrote :

I have tested the rocky scenario that was failing for me: Trilio on Train + OpenStack on Rocky. The Trilio functional test to snapshot a server failed without the fix and passed once python3-oslo.messaging 8.1.0-0ubuntu1~cloud2.2 was installed and services restarted.

tags: added: verification-rocky-done
removed: verification-rocky-needed
Seyeong Kim (seyeongkim)
description: updated
Revision history for this message
Corey Bryant (corey.bryant) wrote :

@Łukasz, it's a little awkward. The single patch does not fix the failure to synchronize exchanges under high load (based on Seyeong's testing) however it does fix compatibility with releases that have been fully patched. I've updated the description, hopefully that helps a bit to clear this up.

summary: - RabbitMQ fails to synchronize exchanges under high load
+ RabbitMQ fails to synchronize exchanges under high load (Note for
+ ubuntu: stein, rocky, queens(bionic) changes only fix compatibility with
+ fully patched releases)
Revision history for this message
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package python-oslo.messaging - 8.1.0-0ubuntu1~cloud2.2
---------------

 python-oslo.messaging (8.1.0-0ubuntu1~cloud2.2) bionic-rocky; urgency=medium
 .
   [Seyeong Kim]
   * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
     - d/p/0001-Use-default-exchange-for-direct-messaging.patch

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/787384

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-oslo.messaging/5.35.0-0ubuntu4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in python-oslo.messaging (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done-bionic
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hello Oleg, or anyone else affected,

Accepted python-oslo.messaging into queens-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:queens-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-queens-needed to verification-queens-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-queens-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-queens-needed
removed: verification-queens-failed
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Sorry for being late; I'll verify this soon.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Tested the pkg in queens.

Test steps are below:

1. deploy a queens env
2. upgrade python-oslo.messaging on nova-compute/0
3. restart neutron-openvswitch-agent (only)
4. check logs: no error
5. launch an instance: it works, no error

ii python-oslo.messaging 5.35.0-0ubuntu4~cloud0 all oslo messaging library - Python 2.x

tags: added: verification-queens-done
removed: verification-queens-needed
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Tested the pkg in bionic.

Steps are below (the same as for queens above):

1. deploy a bionic env
2. upgrade python-oslo.messaging on nova-compute/0
3. restart neutron-openvswitch-agent (only)
4. check logs: no error
5. launch an instance: it works, no error

ii python-oslo.messaging 5.35.0-0ubuntu4 all oslo messaging library - Python 2.x

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-oslo.messaging - 5.35.0-0ubuntu4

---------------
python-oslo.messaging (5.35.0-0ubuntu4) bionic; urgency=medium

  [Seyeong Kim]
  * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
    - d/p/0001-Use-default-exchange-for-direct-messaging.patch

 -- Corey Bryant <email address hidden> Tue, 23 Feb 2021 09:53:33 -0500

Changed in python-oslo.messaging (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
James Page (james-page) wrote : Update Released

The verification of the Stable Release Update for python-oslo.messaging has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

tags: added: verification-done
removed: verification-needed
Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package python-oslo.messaging - 5.35.0-0ubuntu4~cloud0
---------------

 python-oslo.messaging (5.35.0-0ubuntu4~cloud0) xenial-queens; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 python-oslo.messaging (5.35.0-0ubuntu4) bionic; urgency=medium
 .
   [Seyeong Kim]
   * Fix RabbitMQ fails to syncronize exchanges under high load (LP: #1789177)
     - d/p/0001-Use-default-exchange-for-direct-messaging.patch

Changed in python-oslo.messaging (Ubuntu Xenial):
status: In Progress → Invalid
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.messaging (stable/rocky)

Change abandoned by "Hervé Beraud <email address hidden>" on branch: stable/rocky
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/749194
Reason: rocky is now unmaintained

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.messaging (stable/pike)

Change abandoned by "Hervé Beraud <email address hidden>" on branch: stable/pike
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/728396
Reason: stable/pike is no longer maintained (https://releases.openstack.org/). Thanks for your understanding

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on oslo.messaging (stable/queens)

Change abandoned by "Hervé Beraud <email address hidden>" on branch: stable/queens
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/749196
Reason: Queens is no longer maintained (https://releases.openstack.org/).
