Comment 34 for bug 1800957

Antonio Ojea (aojea) wrote :

We have the same issue with nova-compute: some messages get stuck in the queue when using SSL.
We've identified that this problem happens periodically, always in the same queue and with large messages (> 100k characters).

We can observe that the queue suddenly starts to grow:

> Every 2.0s: sudo rabbitmqctl list_queues messages consumers name message_bytes messages_unacknowledged messages_ready head_message_timestamp consumer_utilisation memory state | grep reply

> 4 1 reply_7271709f3e8b4a51a6e63a647ffd6698 625435 4 0 1.0 36984 running
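For anyone who wants to script this check instead of watching it by hand, here is a minimal sketch. It assumes a reduced column set, requested as `rabbitmqctl list_queues name messages messages_unacknowledged message_bytes`, so that no column is ever empty and whitespace splitting is safe. The names `QueueStat`, `parse_list_queues` and `stuck_reply_queues` are made up for illustration and are not part of any RabbitMQ or oslo.messaging tooling.

```python
from typing import List, NamedTuple


class QueueStat(NamedTuple):
    name: str
    messages: int
    unacked: int
    message_bytes: int


def parse_list_queues(output: str) -> List[QueueStat]:
    """Parse output assumed to come from:
    rabbitmqctl list_queues name messages messages_unacknowledged message_bytes
    """
    stats = []
    for line in output.splitlines():
        fields = line.split()
        # Skip banner/header lines and anything that is not a 4-column data row.
        if len(fields) != 4:
            continue
        name, messages, unacked, message_bytes = fields
        if not (messages.isdigit() and unacked.isdigit() and message_bytes.isdigit()):
            continue
        stats.append(QueueStat(name, int(messages), int(unacked), int(message_bytes)))
    return stats


def stuck_reply_queues(stats: List[QueueStat], min_unacked: int = 1) -> List[QueueStat]:
    """Return reply_* queues holding at least min_unacked unacknowledged messages."""
    return [s for s in stats if s.name.startswith("reply_") and s.unacked >= min_unacked]


if __name__ == "__main__":
    # Sample row built from the values observed above, in the reduced column order.
    sample = (
        "Listing queues\n"
        "reply_7271709f3e8b4a51a6e63a647ffd6698\t4\t4\t625435\n"
    )
    for queue in stuck_reply_queues(parse_list_queues(sample)):
        print(queue.name, queue.unacked, queue.message_bytes)
```

Running this periodically and alerting when the same reply queue keeps a non-zero unacknowledged count would catch the growth before the RPC timeouts show up.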

then nova-compute starts to log RPC timeout errors like this:

> 2019-01-09 16:56:02.934 2456 ERROR oslo_service.periodic_task MessagingTimeout: Timed out waiting for a reply to message ID e6b115486c244118b1ce4a3438c8fe84

and some time later the connection is re-created because of missed heartbeats:

> 2019-01-09 16:56:22.882 2456 ERROR oslo.messaging._drivers.impl_rabbit [-] [71579db1-c4d8-4e39-9cf1-e547d74fc350] AMQP server on ardana-cp1-c1-m1-mgmt:5671 is unreachable: Too many heartbeats missed. Trying again in 5 seconds. Client port: 50634: ConnectionForced: Too many heartbeats missed

Then the same cycle starts over: messages piling up -> RPC timeout -> missed heartbeats.

We tried different version combinations, but no luck:

Also not working:
oslo.messaging 5.30.2
amqp 2.1.3
kombu 4.0.1

Also not working:
oslo.messaging 5.30.2
amqp 2.2.1
kombu 4.1.0

Also not working:
oslo.messaging 5.30.6
amqp 2.2.1
kombu 4.1.0
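
To make it easier for others to report which combination they are running, here is a small sketch that prints the installed versions of the three packages listed above. It assumes Python 3.8 or newer (for `importlib.metadata`), which may not match the interpreter of the affected deployment. The helper name `installed_versions` is made up for illustration.

```python
from importlib import metadata

# Distribution names as they appear in the combinations listed above.
PACKAGES = ("oslo.messaging", "amqp", "kombu")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```

It has to be run with the same interpreter or virtualenv that nova-compute uses, otherwise the reported versions may differ from the ones actually loaded by the service.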