Comment 2 for bug 1993149

Przemyslaw Hausman (phausman) wrote :

I can confirm I was able to reproduce the issue in a separate Focal/Yoga environment.

I noticed that even without shutting down the rabbitmq-server leader unit, clients (e.g. nova-cloud-controller) keep disconnecting from the non-leader rabbitmq-server units. See /var/log/nova/nova-api-wsgi.log:

```
2022-10-18 11:33:58.928 207484 ERROR oslo.messaging._drivers.impl_rabbit [-] [e4a6f33c-f700-4aa0-b84a-4bc045ead67b] AMQP server on 192.168.30.233:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-10-18 11:36:50.937 207485 ERROR oslo.messaging._drivers.impl_rabbit [-] [dc9b1800-ed78-4276-bcec-c059a29c8f54] AMQP server on 192.168.30.255:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-10-18 11:36:51.024 207486 ERROR oslo.messaging._drivers.impl_rabbit [-] [1f708e80-fb6b-439c-914b-a8ab20c52f19] AMQP server on 192.168.30.255:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-10-18 11:38:23.586 207485 INFO oslo.messaging._drivers.impl_rabbit [-] [dc9b1800-ed78-4276-bcec-c059a29c8f54] Reconnected to AMQP server on 192.168.30.255:5672 via [amqp] client with port 58520.
2022-10-18 11:38:23.696 207484 ERROR oslo.messaging._drivers.impl_rabbit [-] [e4a6f33c-f700-4aa0-b84a-4bc045ead67b] AMQP server on 192.168.30.233:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-10-18 11:38:23.741 207487 ERROR oslo.messaging._drivers.impl_rabbit [-] [7e1c0354-dd18-4953-a70a-1bce9cfdc31e] AMQP server on 192.168.30.233:5672 is unreachable: <RecoverableConnectionError: unknown error>. Trying again in 1 seconds.: amqp.exceptions.RecoverableConnectionError: <RecoverableConnectionError: unknown error>
2022-10-18 11:38:23.819 207486 INFO oslo.messaging._drivers.impl_rabbit [-] [1f708e80-fb6b-439c-914b-a8ab20c52f19] Reconnected to AMQP server on 192.168.30.255:5672 via [amqp] client with port 58522.
```

In the snippet above, the client loses its connection only to the two non-leader rabbitmq-server units (192.168.30.233 and 192.168.30.255), and this recurs every few minutes. I did not see any disconnections from the third (leader) unit.
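
To check this pattern over a longer log rather than by eye, something like the following could tally the "unreachable" errors per AMQP server. This is only a rough sketch: the function name `count_unreachable` and the regular expression are my own and assume the oslo.messaging log format shown in the log excerpt above.

```python
import re
from collections import Counter

# Assumption: error lines look like the oslo.messaging ones quoted in this comment,
# i.e. "AMQP server on <ip>:<port> is unreachable".
UNREACHABLE = re.compile(r"AMQP server on (\d+\.\d+\.\d+\.\d+):(\d+) is unreachable")


def count_unreachable(lines):
    """Return a Counter mapping 'ip:port' to the number of 'unreachable' errors."""
    counts = Counter()
    for line in lines:
        match = UNREACHABLE.search(line)
        if match:
            counts[f"{match.group(1)}:{match.group(2)}"] += 1
    return counts
```

For example, `count_unreachable(open("/var/log/nova/nova-api-wsgi.log", errors="replace"))` would give the per-server totals. For the seven lines quoted above it counts three errors for 192.168.30.233:5672 and two for 192.168.30.255:5672; a leader unit that never drops connections would simply not appear in the result.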

Adding ~field-critical as this is blocking the customer deployment.