openvswitch vswitchd restart causes fip and qrouter ports to be deleted

Bug #2023027 reported by Yusuf Güngör
This bug affects 2 people
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi everyone, on a compute node, restarting openvswitch vswitchd causes the ports in the fip and qrouter network namespaces to be deleted.

The deleted ports are created again after an l3 agent restart.

This behaviour causes a network outage during upgrade operations.

Neutron OVS logs and the fip and qrouter netns interfaces, captured before and after the ovs and l3 agent restarts, are attached to this issue.

Environment Details:
 Openstack Version: Wallaby (cluster installed via kolla-ansible)
 OS Version: Ubuntu 20.04.2 LTS Hosts. (Kernel:5.4.0-90-generic)
 Neutron Version: 18.1.2.dev118 ["neutron-server", "neutron-dhcp-agent", "neutron-openvswitch-agent", "neutron-l3-agent", "neutron-bgp-dragent", "neutron-metadata-agent"]
 There are 5 controller+network nodes.
 Open vSwitch is used in DVR mode and router HA is disabled (l3_ha = false).
 We are using a single centralized neutron router to connect all tenant networks to the provider network.
 We are using bgp_dragent to announce unique tenant networks.
 Tenant network type: vxlan
 External network type: vlan

Yusuf Güngör (yusuf2) wrote:

After the Xena upgrade the problem resolves itself. An l3 agent restart is no longer required.

Recovery times:

30 seconds for a compute node with 3 instances (kernel: 5.4.0-150-generic)
2 seconds for a compute node with 6 instances (kernel: 5.4.0-90-generic)

ovs-vswitchd logs are attached.

Environment Details:
 Openstack Version: Xena (cluster installed via kolla-ansible)
 OS Version: Ubuntu 20.04.2 LTS Hosts. (Controller+Network Node Kernel:5.4.0-90-generic | Compute Nodes: Mixed 5.4.0-90-generic and 5.4.0-150-generic)
 Neutron Version: 19.4.1.dev106 ["neutron-server", "neutron-dhcp-agent", "neutron-openvswitch-agent", "neutron-l3-agent", "neutron-bgp-dragent", "neutron-metadata-agent"]
 There are 5 controller+network nodes.
 Open vSwitch is used in DVR mode and router HA is disabled (l3_ha = false).
 We are using a single centralized neutron router to connect all tenant networks to the provider network.
 We are using bgp_dragent to announce unique tenant networks.
 Tenant network type: vxlan
 External network type: vlan

LIU Yulong (dragon889) wrote :

After restarting ovs-vswitchd, the ports should still exist on the OVS bridges. Could you show us the output of "ovs-vsctl show"? All the fip and qrouter ports should still be listed.

Also, I suspect this is a container issue: the devices (qr-dev or fg-dev) are not added back to the namespace after the vswitchd container is restarted. When you restart the L3 agent, the devices are reprocessed, so you see them back in the right namespace.
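
One way to check this hypothesis is to compare the ports OVS still knows about with the devices actually present in the qrouter and fip namespaces. The following is only a rough sketch, not part of Neutron: the helper names are made up for illustration, the bridge name "br-int" is an assumption, and the commands (`ovs-vsctl list-ports`, `ip netns list`, `ip -n <ns> -o link show`) must be run as root somewhere both tools can see the host state (on a kolla-ansible deployment that may mean inside the relevant containers).

```python
import re
import subprocess

# Device-name prefixes used by the L3 agent in DVR mode, per the comment
# above: qr- devices in qrouter namespaces, fg- devices in fip namespaces.
L3_PREFIXES = ("qr-", "fg-")


def parse_ip_link(output):
    """Return the interface names found in `ip -o link show` output."""
    names = set()
    for line in output.splitlines():
        # One-line format: "<index>: <name>[@peer]: <FLAGS> ..."
        match = re.match(r"\d+:\s+([^:@\s]+)", line)
        if match:
            names.add(match.group(1))
    return names


def missing_from_namespaces(ovs_ports, netns_devices):
    """Return L3 ports that OVS lists but no namespace contains."""
    l3_ports = {p for p in ovs_ports if p.startswith(L3_PREFIXES)}
    return sorted(l3_ports - set(netns_devices))


def run(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(
        cmd, check=True, capture_output=True, text=True
    ).stdout


def collect(bridge="br-int"):
    """Gather live data from the node (assumed bridge name: br-int)."""
    ovs_ports = run(["ovs-vsctl", "list-ports", bridge]).split()
    devices = set()
    for line in run(["ip", "netns", "list"]).splitlines():
        if not line.strip():
            continue
        ns = line.split()[0]  # drop a trailing "(id: N)" if present
        if ns.startswith(("qrouter-", "fip-")):
            out = run(["ip", "-n", ns, "-o", "link", "show"])
            devices |= parse_ip_link(out)
    return missing_from_namespaces(ovs_ports, devices)
```

Calling `collect()` right after the ovs-vswitchd restart and again after the L3 agent restart would show whether the qr-/fg- ports stay on the bridge while disappearing from the namespaces. A non-empty result in the first case and an empty one in the second would be consistent with the explanation above.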
