Restarting the neutron OVS agent leaves fanout queues behind

Bug #1586731 reported by yong sheng gong
This bug affects 3 people
Affects: neutron
Status: Fix Released
Importance: Low
Assigned to: Unassigned

Bug Description

To reproduce:
sudo rabbitmqctl list_queues
restart neutron-openvswitch-agent
sudo rabbitmqctl list_queues

q-agent-notifier-dvr-update 0
q-agent-notifier-dvr-update.ubuntu64 0
q-agent-notifier-dvr-update_fanout_714f4e99b33a4a41863406fcc26b9162 0
q-agent-notifier-dvr-update_fanout_a2771eb21e914195b9a6cc3f930b5afb 0
q-agent-notifier-l2population-update 0
q-agent-notifier-l2population-update.ubuntu64 0
q-agent-notifier-l2population-update_fanout_6b2637e57995416ab772259a974315e0 3
q-agent-notifier-l2population-update_fanout_fe9c07aaa8894f55bfb49717f955aa55 0
q-agent-notifier-network-update 0
q-agent-notifier-network-update.ubuntu64 0
q-agent-notifier-network-update_fanout_1ae903109fe844a39c925e49d5f06498 0
q-agent-notifier-network-update_fanout_8c15bef355c645e58226a9b98efe3f28 0
q-agent-notifier-port-delete 0
q-agent-notifier-port-delete.ubuntu64 0
q-agent-notifier-port-delete_fanout_cd794c4456cc4bedb7993f5d32f0b1b9 0
q-agent-notifier-port-delete_fanout_f09ffae3b0fa48c882eddd59baae2169 0
q-agent-notifier-port-update 0
q-agent-notifier-port-update.ubuntu64 0
q-agent-notifier-port-update_fanout_776b9b5b1d0244fc8ddc0a1e309d9ab2 0
q-agent-notifier-port-update_fanout_f3345013434545fd9b72b7f54a5c9818 0
q-agent-notifier-security_group-update 0
q-agent-notifier-security_group-update.ubuntu64 0
q-agent-notifier-security_group-update_fanout_b5421c8ae5e94c318502ee8fbc62852d 0
q-agent-notifier-security_group-update_fanout_f4d73a80c9a9444c8a9899cbda3e71ed 0
q-agent-notifier-tunnel-delete 0
q-agent-notifier-tunnel-delete.ubuntu64 0
q-agent-notifier-tunnel-delete_fanout_743b58241f6243c0a776a0dbf58da652 0
q-agent-notifier-tunnel-delete_fanout_ddb8fad952b348a8bf12bc5c741d0a25 0
q-agent-notifier-tunnel-update 0
q-agent-notifier-tunnel-update.ubuntu64 0
q-agent-notifier-tunnel-update_fanout_1e0b0f7ca63f404ba5f41def9d12f00d 0
q-agent-notifier-tunnel-update_fanout_e86e9b073ec74766b9e755439827badc 1
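
To see just the leftover fanout queues and whether anything still consumes them, the listing can be filtered (this relies only on rabbitmqctl's standard column arguments):

sudo rabbitmqctl list_queues name messages consumers | grep fanout

A fanout queue that still shows 0 consumers well after the restart is one the old agent process left behind.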

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

As far as I understand, the unused queues will go away as soon as the consumer is detected as disconnected by rabbitmq; it's an AMQP feature.

If that's not working, it's not neutron's fault but rabbitmq's, as far as I understand.

Changed in neutron:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
Revision history for this message
Quan Tian (tianquan23) wrote :

This is expected behavior if you are using oslo.messaging 4.1.0 or higher, since [1] changed reply and fanout queues to expire instead of being auto-deleted. The stale fanout queues should be deleted after rabbit_transient_queues_ttl seconds.

[1] https://review.openstack.org/#/c/243845/
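
The TTL is configurable; for example, in neutron.conf (the value below is only illustrative, the oslo.messaging default is 1800 seconds):

[oslo_messaging_rabbit]
# stale reply and fanout queues expire after this many seconds
rabbit_transient_queues_ttl = 300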

Revision history for this message
Felix Huettner (felix.huettner) wrote :

This bug actually still exists and is not expected behaviour.
A restart of all other openstack services does not lead to stray fanout queues being left in rabbitmq.

I could trace this behaviour to a failure to close the rabbitmq connections of the RemoteResourceCache.
Please find a minimal patch for this below.

This gets rid of the fanout queues for Port, Subnet, Network, SecurityGroup, SecurityGroupRule and AddressGroup. However, the SubPort and Trunk ones are still left, since I did not find out where I can close those connections.
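
For illustration, a minimal sketch of the idea behind the patch (the method names and internals here are illustrative, not necessarily what the final patch uses): the cache keeps a handle to the n_rpc.Connection it opened for its fanout consumers and closes it when the agent stops.

from neutron_lib import rpc as n_rpc

class RemoteResourceCache(object):
    def _start_rpc_listeners(self, endpoints, topics):
        # one connection carries all the per-resource fanout consumers
        self._connection = n_rpc.Connection()
        for topic in topics:
            self._connection.create_consumer(topic, endpoints, fanout=True)
        self._connection.consume_in_threads()

    def stop(self):
        # hypothetical cleanup hook: closing the connection stops the
        # consumers, so rabbitmq can drop the transient fanout queues
        # instead of leaving them behind across a restart
        self._connection.close()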

Changed in neutron:
status: Expired → New
Revision history for this message
Brian Haley (brian-haley) wrote :

Can you submit a patch to the master branch based on your partial patch? Thanks.

Changed in neutron:
importance: Undecided → Low
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/855851

Changed in neutron:
status: New → In Progress
Revision history for this message
Brian Haley (brian-haley) wrote :

Thanks for the patch, here's some more info I found.

So I think the Trunk, and maybe SubPort, fanout queues are due to this code in ovs_neutron_agent.py:

def main(bridge_classes):
    ovs_capabilities.register()
    ...

That call will trigger a call to (I think) neutron/services/trunk/drivers/openvswitch/agent/driver.py:init_handler(), which registers the trunk "skeleton". Eventually this code in the trunk agent.py __init__ gets invoked:

        self._connection = n_rpc.Connection()
        endpoints = [resources_rpc.ResourcesPushRpcCallback()]
        topic = resources_rpc.resource_type_versioned_topic(resources.SUBPORT)
        self._connection.create_consumer(topic, endpoints, fanout=True)
        topic = resources_rpc.resource_type_versioned_topic(resources.TRUNK)
        self._connection.create_consumer(topic, endpoints, fanout=True)
        self._connection.consume_in_threads()

At the end of main() in the agent there is no 'unregister' call; in fact, there is no unregister code that I could find.

The goal would be to eventually call neutron_lib/rpc.py:Connection.close(), basically undoing what the __init__ above did.

Does that make sense? It did when I looked closer.
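
To make that concrete, a rough sketch of the missing counterpart (the unregister() hook and the module-level handle are illustrative; no such cleanup code exists in neutron at this point):

_skeleton = None

def init_handler(resource, event, trigger, payload=None):
    # existing behaviour: builds the trunk skeleton, whose __init__
    # (quoted above) creates the SubPort and Trunk fanout consumers
    global _skeleton
    _skeleton = OVSTrunkSkeleton(...)

def unregister():
    # hypothetical inverse of init_handler(): closing the connection
    # stops the SubPort and Trunk fanout consumers, so rabbitmq can
    # delete their queues when the agent shuts down
    if _skeleton is not None:
        _skeleton._connection.close()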

Revision history for this message
Felix Huettner (felix.huettner) wrote :

Thanks for that information. I'm building a patch based on that.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/856411

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/855851
Committed: https://opendev.org/openstack/neutron/commit/9ff46546cb36ab93504ae733f273394d5ccd9be4
Submitter: "Zuul (22348)"
Branch: master

commit 9ff46546cb36ab93504ae733f273394d5ccd9be4
Author: Felix Huettner <email address hidden>
Date: Mon Sep 5 08:34:03 2022 +0200

    Cleanup fanout queues on ovs agent stop

    Previously when a neutron-openvswitch-agent was stopped it left
    behind the following fanout queues in rabbitmq:
    neutron-vo-Network-1.0_fanout_someuuid
    neutron-vo-Port-1.1_fanout_someuuid
    neutron-vo-SecurityGroup-1.0_fanout_someuuid
    neutron-vo-SecurityGroupRule-1.0_fanout_someuuid
    neutron-vo-SubPort-1.0_fanout_someuuid
    neutron-vo-Subnet-1.0_fanout_someuuid
    neutron-vo-Trunk-1.1_fanout_someuuid

    In this change we ensure that all but the SubPort and Trunk fanout
    queues are correctly removed from rabbitmq by cleanly stopping the
    RemoteResourceCache when the agent stops.

    Partial-Bug: #1586731
    Change-Id: I672f9414a1a8ed91e259e9379ca707a70f6b4467

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/856411
Committed: https://opendev.org/openstack/neutron/commit/2402145713bb349a9d1852b92e2b37a56f26874b
Submitter: "Zuul (22348)"
Branch: master

commit 2402145713bb349a9d1852b92e2b37a56f26874b
Author: Felix Huettner <email address hidden>
Date: Thu Sep 8 09:44:50 2022 +0200

    Cleanup fanout queues on ovs agent stop (part 2)

    As a followup to the previous commit, we now also clean up the
    SubPort and Trunk fanout queues.

    Closes-Bug: #1586731
    Change-Id: I047603b647dec7787c2471d9edb70fa4ec599a2a

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.0.0.0rc1

This issue was fixed in the openstack/neutron 21.0.0.0rc1 release candidate.
