rpc.server does not consume messages after message acknowledge failure
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| oslo.messaging | | Medium | Mehdi Abaakouk | |
| oslo.messaging (Ubuntu) | | High | Unassigned | |
| Trusty | | High | Unassigned | |
| Utopic | | High | Unassigned | |
| Vivid | | High | Unassigned | |
Bug Description
def start(self):
    @excutils.
    def _executor_thread():
        try:
            while self._running:
                incoming = self.listener.
                if incoming is not None:
                    self.
        except greenlet.
            return
The Connection class does a lot of work to ensure that an operation on a connection can be recovered after a reconnection. However, once the incoming message has been received, a connection error raised during message acknowledgement is caught by the excutils.
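The failure mode described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not oslo.messaging or kombu code: `FakeBroker` and `consume` are hypothetical names, and the sketch assumes that the reconnection triggered by a failed acknowledgement leaves the new connection without any declared consumers.

```python
class FakeBroker:
    """Hypothetical stand-in for a broker connection (not kombu's API)."""

    def __init__(self):
        self.consumers_declared = False
        self.fail_next_ack = False

    def declare_consumers(self):
        self.consumers_declared = True

    def poll(self):
        # Messages are only delivered while consumers are declared.
        return "message" if self.consumers_declared else None

    def ack(self):
        if self.fail_next_ack:
            self.fail_next_ack = False
            # Assumption: the connection is silently re-established and the
            # fresh connection has no consumers declared on it.
            self.consumers_declared = False
            raise ConnectionError("acknowledgement failed")


def consume(broker, iterations):
    """Poll `iterations` times; return how many messages were acknowledged."""
    handled = 0
    for _ in range(iterations):
        try:
            if broker.poll() is not None:
                broker.ack()
                handled += 1
        except ConnectionError:
            # Mimics a catch-all retry wrapper: the error is swallowed and
            # the loop continues, but nothing redeclares the consumers.
            continue
    return handled


broker = FakeBroker()
broker.declare_consumers()
print(consume(broker, 3))  # 3
broker.fail_next_ack = True
print(consume(broker, 5))  # 0
```

After the single failed acknowledgement the loop keeps running without errors, yet no further message is ever delivered, which matches the symptom in the bug title.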
Kombu related code is listed below.
def drain_events(self, **kwargs):
    return self.transport.

@property
def connection(self):
    if not self._closed:
        if not self.connected:
        return self._connection
-------
[Impact]
This patch addresses an issue where the underlying kombu library disconnects from the rabbitmq-servers, which prevents oslo.messaging from properly going through the reconnect sequence, including the recreation of expected queues. As a result, messages are lost and the cloud remains generally dysfunctional until services are restarted.
[Test Case]
Note: these steps are for trusty-icehouse, including the latest oslo.messaging library (1.3.0-0ubuntu1.1 at the time of this writing).
Deploy an OpenStack cloud with multiple rabbit nodes and then abruptly kill one of the rabbit nodes (e.g. force a panic). Observe that the nova services detect that the node went down and report that they have reconnected, but messages are still reported as timed out, nova service-list still reports compute nodes as down, and so on.
[Regression Potential]
There is the possibility that there will be more reconnect attempts from the oslo.messaging library if there is a false positive in the underlying kombu connection reported as disconnected. This should be unlikely since this is bringing the oslo.messaging code into sync with the underlying library, but it is a possibility.
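The trade-off described above can be sketched as follows. This is an illustration only, not the patch itself: `GuardedConnection`, `is_connected`, and `reconnect` are hypothetical names. Under these assumptions, a false "disconnected" report costs one extra reconnect, while the requested operation still completes.

```python
class GuardedConnection:
    """Hypothetical guard that reconnects when the link reports disconnected."""

    def __init__(self, is_connected, reconnect):
        self._is_connected = is_connected  # callable returning bool
        self._reconnect = reconnect        # callable that re-establishes state
        self.reconnect_count = 0

    def run(self, operation):
        # Trust the reported state: reconnect first if it says "disconnected".
        if not self._is_connected():
            self.reconnect_count += 1
            self._reconnect()
        return operation()


# A status check that wrongly reports "disconnected" once (a false positive).
reports = iter([True, False, True])
guard = GuardedConnection(lambda: next(reports), lambda: None)
results = [guard.run(lambda: "ok") for _ in range(3)]
print(results, guard.reconnect_count)  # ['ok', 'ok', 'ok'] 1
```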
[Other Info]
The attempt to drive reconnection logic was fixed in a recent SRU of oslo.messaging (version 1.3.0-0ubuntu1.1). This is an additional fix that is required to keep the oslo.messaging library from ending up in a zombie connection state.
| Changed in oslo.messaging: | |
| assignee: | nobody → QingchuanHao (haoqingchuan-28) |
| Changed in oslo.messaging: | |
| importance: | Undecided → Medium |
| status: | New → Confirmed |
| Changed in oslo.messaging: | |
| assignee: | QingchuanHao (haoqingchuan-28) → Mehdi Abaakouk (sileht) |
| status: | Confirmed → In Progress |
| Changed in oslo.messaging: | |
| status: | In Progress → Fix Committed |
| Changed in oslo.messaging: | |
| milestone: | none → 1.11.0 |
| status: | Fix Committed → Fix Released |
| description: | updated |
| no longer affects: | python-oslo.messaging (Ubuntu) |
| Billy Olsen (billy-olsen) wrote : | #2 |
| Billy Olsen (billy-olsen) wrote : | #3 |
| Changed in oslo.messaging (Ubuntu Wily): | |
| status: | New → Fix Released |
| Changed in oslo.messaging (Ubuntu Vivid): | |
| importance: | Undecided → High |
| Changed in oslo.messaging (Ubuntu Trusty): | |
| importance: | Undecided → High |
| Changed in oslo.messaging (Ubuntu Wily): | |
| importance: | Undecided → High |
| James Page (james-page) wrote : | #4 |
trusty and vivid patches reviewed and uploaded for SRU team review.
We're also going to need a fix for utopic, otherwise upgrades to Utopic or Juno from the CA will regress.
| Changed in oslo.messaging (Ubuntu Utopic): | |
| importance: | Undecided → High |
Fix proposed to branch: stable/juno
Review: https:/
| Billy Olsen (billy-olsen) wrote : | #6 |
Here's the debdiff for utopic/juno. This one also includes the pre-requisite patches in LP #1338732.
Hello QingchuanHao, or anyone else affected,
Accepted oslo.messaging into trusty-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-
Further information regarding the verification process can be found at https:/
| Changed in oslo.messaging (Ubuntu Trusty): | |
| status: | New → Fix Committed |
| tags: | added: verification-needed |
| Billy Olsen (billy-olsen) wrote : | #8 |
Verification is done for trusty, but I haven't seen the packages for utopic or vivid yet.
| tags: | added: verification-done-trusty |
| tags: | removed: verification-needed |
| Launchpad Janitor (janitor) wrote : | #9 |
This bug was fixed in the package oslo.messaging - 1.3.0-0ubuntu1.2
---------------
oslo.messaging (1.3.0-0ubuntu1.2) trusty; urgency=medium

  * Detect when underlying kombu connection to rabbitmq server has been
    disconnected and allow oslo.messaging to go through the reconnect
    logic (LP: #1448650):
    - d/p/redeclare-
      consumers when ack/requeue fails.

 -- Billy Olsen <email address hidden> Thu, 25 Jun 2015 09:59:42 +0100
| Changed in oslo.messaging (Ubuntu Trusty): | |
| status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for oslo.messaging has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/juno
commit b6b6edca4672cd5
Author: Mehdi Abaakouk <email address hidden>
Date: Tue May 5 10:29:22 2015 +0200
rabbit: redeclare consumers when ack/requeue fail
If the acknowledgement or requeue of a message fails,
the kombu transport can be disconnected.
In this case, we must redeclare our consumers.
This change fixes that.
There are no tests for this because the kombu memory transport we use in our tests
cannot be in a disconnected state.
Closes-bug: #1448650
(cherry picked from commit 415db68b67368d7
Conflicts are due to the refactoring to oslo_messaging namespace.
Conflicts:
oslo_
oslo_
Change-Id: I5991a4cf827411
| tags: | added: in-stable-juno |
Hello QingchuanHao, or anyone else affected,
Accepted oslo.messaging into utopic-proposed. The package will build now and be available at https:/
| Changed in oslo.messaging (Ubuntu Utopic): | |
| status: | New → Fix Committed |
| tags: | added: verification-needed |
| Chris J Arges (arges) wrote : | #13 |
Hello QingchuanHao, or anyone else affected,
Accepted oslo.messaging into vivid-proposed. The package will build now and be available at https:/
| Changed in oslo.messaging (Ubuntu Vivid): | |
| status: | New → Fix Committed |
Fix proposed to branch: stable/kilo
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/kilo
commit d1e3c38dac2b558
Author: Mehdi Abaakouk <email address hidden>
Date: Tue May 5 10:29:22 2015 +0200
rabbit: redeclare consumers when ack/requeue fail
If the acknowledgement or requeue of a message fails,
the kombu transport can be disconnected.
In this case, we must redeclare our consumers.
This change fixes that.
There are no tests for this because the kombu memory transport we use in our tests
cannot be in a disconnected state.
Closes-bug: #1448650
(cherry picked from commit 415db68b67368d7
Conflict is due to refactoring on master branch.
Conflicts:
Change-Id: I5991a4cf827411
| tags: | added: in-stable-kilo |
| Changed in oslo.messaging (Ubuntu Utopic): | |
| status: | Fix Committed → Won't Fix |
| Billy Olsen (billy-olsen) wrote : | #16 |
Completed the verification for the vivid-proposed package.
Deployed a vivid-kilo cloud. Booted several instances and restarted rabbitmq-server partway through. Verified that the AMQP messaging layer is re-established and that instances continue to be created.
| tags: | added: verification-done-vivid |
| tags: | removed: verification-needed |
| Launchpad Janitor (janitor) wrote : | #17 |
This bug was fixed in the package oslo.messaging - 1.8.3-0ubuntu0.
---------------
oslo.messaging (1.8.3-

  * Detect when underlying kombu connection to rabbitmq server has been
    disconnected and allow oslo.messaging to go through the reconnect
    logic (LP: #1448650):
    - d/p/redeclare-
      consumers when ack/requeue fails.

oslo.messaging (1.8.3-

  * New upstream point release (LP: #1467959):
    - RabbitMQ driver:
      + Adding publisher acknowledgement
        of messages during broker shutdown/network failure.
      + Ensure consumer connections closed properly (LP: #1458917).
      + Set timeout on the underlying socket (LP: #1436788).
      + Disable and mark heartbeat as experimental (LP: #1436769).
      + Fix ipv6 support.
    - ZeroMQ driver:
      + Don't raise Timeout on no-matchmaker results (LP: #1186310).
      + Fix issue with Redis not deleting expired keys (LP: #1417464).
      + d/p/Fix-
        upstream.

 -- Billy Olsen <email address hidden> Thu, 25 Jun 2015 09:54:13 +0100
| Changed in oslo.messaging (Ubuntu Vivid): | |
| status: | Fix Committed → Fix Released |
| no longer affects: | oslo.messaging (Ubuntu Wily) |

Reviewed: https://review.openstack.org/180059
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=415db68b67368d7c8aa550e7108122200816e665
Submitter: Jenkins
Branch: master
commit 415db68b67368d7c8aa550e7108122200816e665
Author: Mehdi Abaakouk <email address hidden>
Date: Tue May 5 10:29:22 2015 +0200
rabbit: redeclare consumers when ack/requeue fail
If the acknowledgement or requeue of a message fails,
the kombu transport can be disconnected.
In this case, we must redeclare our consumers.
This change fixes that.
There are no tests for this because the kombu memory transport we use in our tests
cannot be in a disconnected state.
Closes-bug: #1448650
Change-Id: I5991a4cf827411bc27c857561d97461212a17f40
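The idea in the commit message, redeclaring consumers when an acknowledgement fails, can be sketched as below. This is a simplified model under stated assumptions and not the actual change: `FakeBroker`, `ack_or_redeclare`, and `consume` are hypothetical names, and the fake broker assumes that a failed acknowledgement leaves the connection without declared consumers.

```python
class FakeBroker:
    """Hypothetical stand-in for a broker connection (not kombu's API)."""

    def __init__(self):
        self.consumers_declared = False
        self.fail_next_ack = False

    def declare_consumers(self):
        self.consumers_declared = True

    def poll(self):
        # Messages are only delivered while consumers are declared.
        return "message" if self.consumers_declared else None

    def ack(self):
        if self.fail_next_ack:
            self.fail_next_ack = False
            # Assumption: the reconnected link has no consumers declared.
            self.consumers_declared = False
            raise ConnectionError("acknowledgement failed")


def ack_or_redeclare(broker):
    """Acknowledge a message; on a connection error, redeclare consumers."""
    try:
        broker.ack()
        return True
    except ConnectionError:
        broker.declare_consumers()
        return False


def consume(broker, iterations):
    """Poll `iterations` times; return how many messages were acknowledged."""
    handled = 0
    for _ in range(iterations):
        if broker.poll() is not None and ack_or_redeclare(broker):
            handled += 1
    return handled


broker = FakeBroker()
broker.declare_consumers()
broker.fail_next_ack = True
print(consume(broker, 5))  # 4
```

Only the message whose acknowledgement failed goes uncounted; consumption resumes on the next poll because the consumers were redeclared. The sketch does not model redelivery of the unacknowledged message.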