too many l3 dvr agents got notifications after a server got deleted

Bug #1968837 reported by norman shen
This bug affects 1 person
Affects: neutron
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

We are running Neutron 13.0.6 (Rocky), which seems to remove the router namespace once the retry limit is hit.

After some investigation, it seems that deleting a server that has a floating IP address associated
causes a broadcast notification to all related routers. In our case, we have around 300 compute nodes, and all of them run L3 DVR agents.

The related code is at https://github.com/openstack/neutron/blob/bb4c26eb7245465bf7cea7e0f07342601eb78ede/neutron/db/l3_db.py#L1999, so my question is: is this notification still needed when DVR is enabled?

Jakub Libosvar (libosvar) wrote :

I'm not sure I understand what the problem is. Is the qrouter namespace deleted even on a compute node where the router hosts a FIP? Can you provide some logs showing the retry limit being hit?

Changed in neutron:
status: New → Incomplete
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

I think I understand the problem. When a VM is deleted, the port created by Nova is deleted too (I'll assume this VM has a single port). That triggers the RPC call [1], which sends a "routers_updated" message via the message queue. This message arrives at all subscribed L3 agents (in this case, the L3 agents on the 300 compute nodes).

However, for L3 with DVR, we have another method [2] that handles the DVR router deletion if needed. It sends a "_notification_host" message for each router to be deleted, which is a message addressed to a specific compute node rather than a topic broadcast.

We could do something similar in this case: if the VM port to be deleted has a FIP associated, send a host-specific message to disassociate the FIP.

However, I don't know what side effects skipping the [1] message would have in this case.
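
To make the difference in message volume concrete, here is a minimal sketch of the two delivery styles discussed above. It is a toy model only: `ToyBus`, `notify_port_deleted`, and the `fip_host_by_port` lookup are hypothetical names invented for illustration, and nothing here is Neutron or oslo.messaging code. It assumes the server already knows which host serves the FIP.

```python
from collections import defaultdict


class ToyBus:
    """Toy stand-in for the message queue (hypothetical, not oslo.messaging)."""

    def __init__(self, agent_hosts):
        self.agent_hosts = list(agent_hosts)
        self.delivered = defaultdict(list)  # host -> [(method, payload), ...]

    def fanout(self, method, payload):
        # Topic broadcast: every subscribed agent receives the message.
        for host in self.agent_hosts:
            self.delivered[host].append((method, payload))

    def cast_to_host(self, host, method, payload):
        # Host-specific message: only the named agent receives it.
        self.delivered[host].append((method, payload))


def notify_port_deleted(bus, port, fip_host_by_port):
    """Hypothetical policy: target the FIP's host if known, else broadcast."""
    host = fip_host_by_port.get(port["id"])
    payload = {"routers": [port["router_id"]]}
    if host is not None:
        bus.cast_to_host(host, "routers_updated", payload)
    else:
        bus.fanout("routers_updated", payload)


hosts = [f"compute-{i}" for i in range(300)]
port = {"id": "port-1", "router_id": "router-a"}

# Current behaviour as described in this report: one message per agent.
broadcast_bus = ToyBus(hosts)
notify_port_deleted(broadcast_bus, port, {})
broadcast_count = sum(len(m) for m in broadcast_bus.delivered.values())

# Proposed behaviour: a single message to the host serving the FIP.
targeted_bus = ToyBus(hosts)
notify_port_deleted(targeted_bus, port, {"port-1": "compute-42"})
targeted_count = sum(len(m) for m in targeted_bus.delivered.values())

print(broadcast_count, targeted_count)  # 300 1
```

The sketch says nothing about whether other agents rely on the broadcast for something else, which is exactly the open question about side effects.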

Regards.

[1] https://github.com/openstack/neutron/blob/bb4c26eb7245465bf7cea7e0f07342601eb78ede/neutron/db/l3_db.py#L1999
[2] https://github.com/openstack/neutron/blob/bb4c26eb7245465bf7cea7e0f07342601eb78ede/neutron/db/l3_dvrscheduler_db.py#L550-L568

Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired