topic_send may lose messages if the queue does not exist

Bug #1661510 reported by JiaJunsu on 2017-02-03
This bug affects 2 people
Affects: oslo.messaging
Status: In Progress
Importance: Medium
Assigned to: Gabriele Santomaggio

Bug Description

If neutron agents are started before the server, the agents' messages (sent to the server) may be discarded by RabbitMQ.

Oslo.messaging only declares the exchange, not the queue, when sending a 'topic' message[1]. The RabbitMQ tutorial says: 'If we send a message to non-existing location, RabbitMQ will just trash the message'[2].

We've found that this can cause agents' messages to be lost when the server is not started. Worse, the agents will wait for replies to those messages until they time out, which means they cannot provide service until the wait times out and they resend the messages to the server. We expect agents to receive reply messages and be ready to work as soon as the server starts.

There are three possible ways to solve this:
1. Do not declare the exchange when sending 'topic' messages; we will then get an exception if a message is sent to a non-existing exchange.
2. Make sure the queue exists before sending messages, and raise a QueueNotFound exception if it does not.
3. Declare the queue before sending messages, just like notify_send does.

[1] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L1276
[2] https://www.rabbitmq.com/tutorials/tutorial-one-python.html
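
A rough sketch of the behaviour described above (exchange and routing key names are made up for this example, not oslo.messaging's actual topology): a kombu publisher that declares only the exchange has its message silently dropped when no queue is bound to the routing key.

from kombu import Connection, Exchange, Producer

exchange = Exchange('neutron', type='topic', durable=False)

with Connection('amqp://guest:guest@localhost:5672//') as conn:
    channel = conn.channel()
    # The exchange exists after this, but no queue is bound to it.
    exchange(channel).declare()
    producer = Producer(channel, exchange=exchange)
    # Nothing is listening on 'q-plugin', so RabbitMQ discards the message
    # and the publisher is never told about it.
    producer.publish({'method': 'report_state'}, routing_key='q-plugin')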

JiaJunsu (jiajunsu) on 2017-02-08
description: updated
shen.zhixing (fooy5460) wrote :

Can the neutron agent wait for the server to start up? For example, "nova compute" calls wait_until_ready until the "conductor service" is started.

self.conductor_api.wait_until_ready(context.get_admin_context())

Ken Giusti (kgiusti) wrote :

One possible approach would be to set the 'mandatory' flag when sending the message. This should cause the broker to send back the message with a "NO ROUTE" error. Not sure exactly how well this is supported (or if it would even work).

Option 1 isn't guaranteed to work since different servers can use the same exchange. Another server may have already created it for its own queue.

Not sure exactly if option 2 is possible with the kombu library, and even if it were, there'd be no guarantee that the queue won't be deleted before the client sends the message.

Option 3 will lead to stale RPC requests building up in rabbit's queues.
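
For reference, this is roughly how the 'mandatory' flag behaves with a plain AMQP client (pika here, only to show the broker-side behaviour; whether kombu exposes it is exactly the open question, and the exchange/routing key names are illustrative). With publisher confirms enabled, an unroutable message is returned by the broker instead of being silently dropped:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.confirm_delivery()   # needed so the returned message surfaces as an error

try:
    channel.basic_publish(
        exchange='neutron',
        routing_key='q-plugin',
        body=b'{"method": "report_state"}',
        mandatory=True,   # broker replies with basic.return (NO_ROUTE) if unroutable
    )
except pika.exceptions.UnroutableError:
    print('no queue is bound to this routing key; the message was returned')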

Changed in oslo.messaging:
status: New → Confirmed
importance: Undecided → Medium
JiaJunsu (jiajunsu) wrote :

For option 2, py-amqp supports a `passive` argument in queue_declare[1].
We could rewrite kombu's `Queue.declare` to accept a `passive` argument and pass it through to `queue_declare`[2].

[1] https://github.com/celery/py-amqp/blob/v2.3.2/amqp/channel.py#L1028
[2] https://github.com/celery/kombu/blob/v4.2.1/kombu/entity.py#L624
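
A minimal sketch of that check using py-amqp directly (broker address and queue name are assumptions for the example): passive=True only verifies that the queue exists and never creates it, and the broker answers 404 NOT_FOUND when it is missing.

import amqp
from amqp.exceptions import NotFound

conn = amqp.Connection('localhost:5672', userid='guest', password='guest')
conn.connect()
channel = conn.channel()
try:
    channel.queue_declare(queue='q-plugin', passive=True)
    print('queue exists, safe to publish')
except NotFound:
    # The broker closes the channel after this error, so a fresh channel
    # would be needed before publishing.
    print('queue does not exist yet')
finally:
    conn.close()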

The mandatory flag is intended for exactly this kind of situation, but it is not supported in Kombu [1].
Checking whether a queue exists before every single publish drops performance drastically.

I will work on implementing the mandatory flag and handling the important messages.

[1] https://github.com/celery/kombu/blob/master/kombu/messaging.py#L129

Ken Giusti (kgiusti) on 2019-04-29
Changed in oslo.messaging:
assignee: nobody → Gabriele Santomaggio (gsantomaggio)

Fix proposed to branch: master
Review: https://review.opendev.org/659078

Changed in oslo.messaging:
status: Confirmed → In Progress

Fix proposed to branch: master
Review: https://review.opendev.org/660373

Change abandoned by Gabriele Santomaggio (<email address hidden>) on branch: master
Review: https://review.opendev.org/659078
Reason: closed in favor of https://review.opendev.org/#/c/660373/

We are working on it; it seems that Kombu's in-memory transport does not support the "on_return" function, see the issue [1].

I proposed this PR [2]; it does not implement "on_return", but at least it does not crash.

1- https://github.com/celery/kombu/issues/1050
2- https://github.com/celery/kombu/pull/1053
