Comment 9 for bug 1463802

Roman Podoliaka (rpodolyaka) wrote : Re: RPC clients do not recreate a reply queue after restart of the last RabbitMQ server in the cluster

@Igor

Short answer: no, it won't. The root cause of the problem is that oslo.messaging does not recreate the reply queue, so we will keep getting a Timeout error for every subsequent RPC call.

Long answer: handling Timeout errors *might* help us mitigate short RabbitMQ server disruptions in which some requests or replies were lost. But that is a much bigger question. In oslo.messaging we look at the MQ as just a layer on which to implement a simple RPC protocol, treating all local and remote process calls in the very same way.

The problem with that approach is that, to handle Timeout errors gracefully, you would end up retrying all possible RPC calls in your code, and not all of those calls are idempotent (i.e. safe to retry). We could do the retries in oslo.messaging itself, but the consequences might be even worse than without retries.
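To make the idempotency point concrete, here is a rough sketch of what a "safe" retry wrapper would have to look like. This is not oslo.messaging code: `RpcTimeout`, `IDEMPOTENT_METHODS`, `call_with_retry` and the method names are all hypothetical, made up for illustration. The point is only that somebody has to maintain the list of calls that are safe to repeat.

```python
class RpcTimeout(Exception):
    """Hypothetical stand-in for the timeout error raised by an RPC client."""


# Hypothetical whitelist: only these methods are assumed safe to repeat.
IDEMPOTENT_METHODS = {"get_instance", "ping"}


def call_with_retry(call, method, retries=2, **kwargs):
    """Invoke call(method, **kwargs), retrying on timeout only if idempotent.

    `call` is any callable that performs the RPC and may raise RpcTimeout.
    Non-idempotent methods get exactly one attempt, because the first
    request may have been processed even though the reply was lost.
    """
    attempts = retries + 1 if method in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return call(method, **kwargs)
        except RpcTimeout:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the timeout
```

With this, a timed-out `get_instance` is simply sent again, while a timed-out `create_volume` (not in the whitelist) is raised to the caller after the first attempt, since repeating it could create a second volume. And note that none of this helps with the bug at hand: as long as the reply queue is missing, every attempt times out.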

There is nothing Nova-specific here; this is how all OpenStack projects work with RPC right now.