oslo.messaging

Bug #1905965
Comment #6

John Eckersberg (jeckersb) wrote on 2021-01-05:

re: Ken's comment:

"""
However from the user's point of view (in the case nova) it's a breaking API change. Mea culpa - turning on the mandatory feature by default should not have been done without making a whole lot more public noise than I did.
"""

I agree with this. The main issue is that oslo.messaging will now raise a MessageUndeliverable exception when the mandatory flag is set. This exception was added here:

https://github.com/openstack/oslo.messaging/commit/c50076b4efb79cef46d618d6d80eecbcebb72898

It then started being raised once the mandatory flag was added to direct_send here:

https://github.com/openstack/oslo.messaging/commit/b7e9faf6590086b79f9696183b5c296ecc12b7b6

This is the detail that is API-breaking; nova and the like are not prepared to catch this exception and handle it.

However, we already do almost the exact same thing presently, albeit in a slightly different manner.

In the base amqpdriver here:

https://github.com/openstack/oslo.messaging/blob/e44c988/oslo_messaging/_drivers/amqpdriver.py#L151-L169

We catch AMQPDestinationNotFound and loop for missing_destination_retry_timeout seconds. This is accomplished by setting the passive=True flag on the exchange, and then probing to see if the exchange exists by issuing an exchange.declare. If the exchange is already present, exchange.declare-ok is returned. If the exchange is not present, a channel error is raised and the exchange is *not* declared. This is so the replying end can probe to see when the original requester re-establishes itself (it is on the sender to declare the exchange).
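To make the shape of that loop concrete, here is a rough standalone sketch. It is not the amqpdriver code: DestinationNotFound is a local stand-in for AMQPDestinationNotFound, and reply_with_destination_retry, send_reply, retry_timeout, interval, clock and sleep are all names made up for illustration. The assumption is that send_reply does the passive probe and publish, and raises when the exchange is missing.

```python
import time


class DestinationNotFound(Exception):
    """Hypothetical stand-in for AMQPDestinationNotFound."""


def reply_with_destination_retry(send_reply, retry_timeout, interval=0.25,
                                 clock=time.monotonic, sleep=time.sleep):
    """Call send_reply() until it succeeds or retry_timeout seconds elapse.

    Returns True if the reply went out, False if it was abandoned.
    The exception is never propagated to the caller.
    """
    deadline = clock() + retry_timeout
    while True:
        try:
            # Assumed to probe the exchange passively, then publish.
            send_reply()
            return True
        except DestinationNotFound:
            if clock() >= deadline:
                # Requester never came back; give up quietly.
                return False
            sleep(interval)
```

The clock and sleep parameters are only there so the loop can be exercised without real waiting.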

I think for now we could get away with simply catching the MessageUndeliverable exception in the same place, and we could loop for the same timeout while it remains undeliverable. The only problem I see is that the blueprint spec for the transport options is there explicitly so that the rpc client itself can declare at_least_once=True and expect to get back MessageUndeliverable if that fails. So we could keep track of the transport_options originally provided by the client. If the client explicitly set at_least_once=True, then when MessageUndeliverable is raised, we should re-raise it back to the client, with the assumption that it knows how to handle it (possibly after retrying some number of times?). However, if the client did not explicitly ask for it (the current case where nova and such do not use this API), then we can still use the functionality within oslo.messaging, but we need to catch it and not re-raise it, the same way AMQPDestinationNotFound works currently.
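Roughly what I have in mind, as a standalone sketch rather than a patch. MessageUndeliverable here is a local stand-in class, not the real oslo.messaging exception, and send_with_undeliverable_retry, send, client_requested_at_least_once, retry_timeout, interval, clock and sleep are hypothetical names. How the driver would actually learn whether the client set at_least_once=True is left open; the sketch just takes it as a boolean.

```python
import time


class MessageUndeliverable(Exception):
    """Hypothetical stand-in for oslo.messaging's MessageUndeliverable."""


def send_with_undeliverable_retry(send, client_requested_at_least_once,
                                  retry_timeout, interval=0.25,
                                  clock=time.monotonic, sleep=time.sleep):
    """Retry send() while it stays undeliverable, up to retry_timeout seconds.

    Returns True if the message was sent. On timeout, re-raises only when
    the client explicitly asked for at_least_once; otherwise returns False.
    """
    deadline = clock() + retry_timeout
    while True:
        try:
            send()
            return True
        except MessageUndeliverable:
            if clock() < deadline:
                sleep(interval)
                continue
            if client_requested_at_least_once:
                # The client opted in, so it is expected to handle this.
                raise
            # Client never opted in (e.g. nova today): swallow the error.
            return False
```

Whether the explicit case should retry at all before re-raising is the open question from above; the sketch retries for the full timeout in both cases.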
