ironic neutron agent leaks quorum queues on restart

Bug #2086303 reported by Will Szumski
This bug affects 1 person
Affects: networking-baremetal
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Version: 2024.1

I've noticed that each restart of the ironic neutron agent leaves behind a number of queues without consumers:

```
(rabbitmq)[rabbitmq@svn1-dr07-u2 /]$ rabbitmqctl list_queues name consumers messages | grep ironic-neutron-agent-member-manager-pool
ironic-neutron-agent-member-manager-pool-305f5cb1-b8e5-4bf6-85e9-96382b27f9bb 0 34
ironic-neutron-agent-member-manager-pool-380e1601-c3fb-4274-a9cc-8292435c03cc 0 76881
ironic-neutron-agent-member-manager-pool-7138ae47-3ba1-426e-859c-a23c1fe8b745 0 97569
ironic-neutron-agent-member-manager-pool-f2109476-a966-417e-83f0-5c350dc64306 0 97188
ironic-neutron-agent-member-manager-pool-f6301ebc-cae4-45ee-bf8c-7b3d6e4ece21 0 1
ironic-neutron-agent-member-manager-pool-39350f77-6df9-4032-8334-63370a2a89df 0 76881
ironic-neutron-agent-member-manager-pool-4baf2136-4717-40f6-802c-7a4b44c06997 0 16792
ironic-neutron-agent-member-manager-pool-bb20f3a6-3d0d-4c4b-8f20-162491492c79 1 0
ironic-neutron-agent-member-manager-pool-60076f83-7765-4743-9916-4f1abec5edb5 0 76881
ironic-neutron-agent-member-manager-pool-e5723e94-a7ed-4689-a183-3aaab2f428d1 0 97903
ironic-neutron-agent-member-manager-pool-507a4e62-d502-4a00-978b-f6d55b23fc9c 0 34
ironic-neutron-agent-member-manager-pool-b480b4dd-7677-47bf-a082-38b10cc7e3b9 0 0
ironic-neutron-agent-member-manager-pool-61f20424-af62-4408-9551-11ee8ba4b8fc 0 3
ironic-neutron-agent-member-manager-pool-95bdf16d-5a95-4e62-bd8a-96b886c6e091 0 34
ironic-neutron-agent-member-manager-pool-d0055061-7867-464a-9852-c1a82a104f81 0 0
ironic-neutron-agent-member-manager-pool-b1d28efe-34a8-4bbc-b313-5fc5279ee702 0 1
ironic-neutron-agent-member-manager-pool-8deca94c-7bb9-42bf-9458-530625151aef 0 16792
ironic-neutron-agent-member-manager-pool-646e0dbb-4028-4f6a-8b9a-3a00b23b480d 0 16792
```

Workaround is to do:

```
for queue in $(rabbitmqctl list_queues name consumers | grep ironic-neutron-agent-member-manager-pool | awk '$NF == 0 {print $1}'); do
    rabbitmqctl delete_queue "$queue"
done
```

post restart.
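
A slightly more defensive variant of the workaround above, as a sketch only: it matches the queue name by prefix rather than by substring and defaults to a dry run. The function names `orphaned_queues` and `prune_orphaned_queues` and the `DELETE` environment variable are hypothetical, introduced here for illustration; only `rabbitmqctl list_queues` and `rabbitmqctl delete_queue` come from the report.

```shell
# Print names of queues that start with the given prefix and have zero
# consumers. Reads "name consumers" rows on stdin, in the format produced
# by `rabbitmqctl list_queues name consumers`.
orphaned_queues() {
    awk -v prefix="$1" 'index($1, prefix) == 1 && $2 == 0 { print $1 }'
}

# Dry run by default: prints what would be deleted.
# Set DELETE=1 (hypothetical switch) to actually delete the queues.
prune_orphaned_queues() {
    prefix="${1:-ironic-neutron-agent-member-manager-pool}"
    rabbitmqctl list_queues name consumers | orphaned_queues "$prefix" |
    while read -r queue; do
        if [ "${DELETE:-0}" = "1" ]; then
            rabbitmqctl delete_queue "$queue"
        else
            echo "would delete: $queue"
        fi
    done
}
```

Run `prune_orphaned_queues` once to review the list, then `DELETE=1 prune_orphaned_queues` to remove the queues. Like the original loop, this also deletes consumerless queues that still hold messages.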

Related to: https://bugs.launchpad.net/networking-baremetal/+bug/2046962

Will Szumski (willjs)
summary: - ironic neutron agent leaks quorom queues on restart
+ ironic neutron agent leaks quorum queues on restart
Sven Kieske (s-kieske) wrote :

this should be fixed by using the queue manager, see: https://review.opendev.org/c/openstack/kolla-ansible/+/924623

Will Szumski (willjs) wrote :
Sven Kieske (s-kieske) wrote :

This is the related oslo option, but I'm not sure we want to enforce it in networking-baremetal directly. IMHO this setting should be applied by the deployment project, as in the kolla link above:

https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_rabbit.use_queue_manager

Will Szumski (willjs) wrote :

> the related oslo option, but I'm not sure if we want to enforce this in networking-baremetal directly, imho this setting should be done via deployment project, like the kolla link above:
>
> https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_rabbit.use_queue_manager

Cheers Sven, I just tried toggling that option to true. It doesn't help for me in 2024.1.

Sven Kieske (s-kieske) wrote :

well you also need a working queue manager implementation. which release of oslo did you use? did you use devstack or kolla deployment? in kolla this is not yet implemented at all (see the above review by kevko which is still WIP).

I don't know if it works in devstack at all.

there is also this related fix, which is also needed in containerized environments, which was only recently merged: https://review.opendev.org/c/openstack/oslo.messaging/+/928034

Also, what does "It doesn't help" mean, exactly? Are the queue names deterministic, as they should be when the queue manager is used? Did you verify that the queue manager works?
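
One way to check whether queue names stay stable across a restart, sketched under assumptions: take a snapshot of the queue names before and after restarting the agent, then list the names that appear only in the second snapshot. The helper `new_queue_names` and the snapshot file names are hypothetical.

```shell
# Print queue names that appear in the second snapshot but not in the first.
# Each snapshot file holds one queue name per line.
new_queue_names() {
    grep -Fxv -f "$1" "$2"
}

# Example usage (snapshot file names are placeholders):
#   rabbitmqctl list_queues name | grep ironic-neutron-agent > before.txt
#   ... restart the ironic neutron agent ...
#   rabbitmqctl list_queues name | grep ironic-neutron-agent > after.txt
#   new_queue_names before.txt after.txt
```

If the names are deterministic, the command should print nothing. Any name it prints is a queue that was newly created by the restart.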

Will Szumski (willjs) wrote :

I just toggled that config option (which is what the patch you linked to seemed to be doing). I didn't realize there was an external component. So needs more testing... TBH I was just going to disable ironic neutron agent as I don't use the baremetal mechanism driver. This is kolla. Here are my versions:

(ironic-neutron-agent)[neutron@svn1-dr07-u2 /]$ /var/lib/kolla/venv/bin/pip freeze | grep oslo
oslo.cache==3.7.0
oslo.concurrency==6.0.0
oslo.config==9.4.0
oslo.context==5.5.0
oslo.db==15.0.0
oslo.i18n==6.3.0
oslo.log==5.5.1
oslo.messaging==14.7.1
oslo.metrics==0.8.0
oslo.middleware==6.1.0
oslo.policy==4.3.0
oslo.privsep==3.3.0
oslo.reports==3.3.0
oslo.rootwrap==7.2.0
oslo.serialization==5.4.1
oslo.service==3.4.1
oslo.upgradecheck==2.3.0
oslo.utils==7.1.0
oslo.versionedobjects==3.3.0
oslo.vmware==4.4.0

By "didn't work" I mean the queue names are still randomised.

Will Szumski (willjs) wrote :

Also, does the queue manager even apply to these queues? The docs say: "Queue Manager to build queue name for reply (and fanout) type."

Afonne-CID (cidelight)
Changed in networking-baremetal:
status: New → Triaged
importance: Undecided → Medium
Pierre Riteau (priteau) wrote :

Is there any configuration option we can change to work around this issue in Caracal/Dalmatian until kolla-ansible uses queue manager?

Will Szumski (willjs)
description: updated