trunk + subports not working

Bug #1848311 reported by do3meli
This bug affects 2 people
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Miguel Lavalle
Milestone: (none)

Bug Description

Since upgrading from Rocky to Stein we have been experiencing problems live-migrating VMs with trunk ports and creating new trunk ports. The live migration of the VM itself eventually completes, but the trunk ports remain in status "BUILD" or "DOWN". The corresponding subports and/or the parent port are mostly in status "DOWN" too. It looks like not all of the required ports get moved from hypervisor host A to host B. With the ports in these states, the VM is not reachable from the network at all.

Most of the time, when the migration is about to finish, we see timeout messages like these in the neutron-openvswitch-agent log:

2019-10-14 12:28:56.559 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method trunk.update_subport_bindings. Waiting for 114 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:28:56.560 20071 WARNING neutron_lib.rpc [-] Increasing timeout for trunk.update_subport_bindings calls to 240 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:28:56.562 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method trunk.update_subport_bindings. Waiting for 56 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID c1e5f0b50f044c8ea1f40f3e2e959fc0
2019-10-14 12:29:53.021 20071 ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Got messaging error while processing trunk bridge tbr-e4685a7d-2: Timed out waiting for a reply to message ID c1e5f0b50f044c8ea1f40f3e2e959fc0: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID c1e5f0b50f044c8ea1f40f3e2e959fc0
2019-10-14 12:30:24.896 20071 ERROR neutron_lib.rpc [req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Timeout in RPC method trunk.update_trunk_status. Waiting for 75 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:30:24.896 20071 WARNING neutron_lib.rpc [req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Increasing timeout for trunk.update_trunk_status calls to 240 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:30:50.133 20071 ERROR neutron.services.trunk.drivers.openvswitch.agent.ovsdb_handler [-] Got messaging error while processing trunk bridge tbr-b56178af-8: Timed out waiting for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 58a64b2c975143a4bbfd07ab3b10e871
2019-10-14 12:31:39.851 20071 ERROR neutron.services.trunk.drivers.openvswitch.agent.driver [req-85c86e08-52a3-4199-a1af-915f4847e9fc cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Error on event deleted for subports [SubPort(port_id=c048169f-a005-44a3-88e3-03a34d778bb5,segmentation_id=843,segmentation_type='vlan',trunk_id=b56178af-8d6f-4660-ac3b-cc469c3de4ce)]: Timed out waiting for a reply to message ID 8093a1090d47426380434e65559875e5: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8093a1090d47426380434e65559875e5
2019-10-14 12:35:26.906 20071 ERROR neutron_lib.rpc [req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Timeout in RPC method trunk.update_subport_bindings. Waiting for 53 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a6d04566
2019-10-14 12:35:26.907 20071 WARNING neutron_lib.rpc [req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Increasing timeout for trunk.update_subport_bindings calls to 480 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a6d04566
2019-10-14 12:36:20.366 20071 ERROR neutron.services.trunk.drivers.openvswitch.agent.driver [req-e7ac3037-3598-4003-90db-d59985cf5326 cd9715e9b4714bc6b4d77f15f12ba5a9 1e205eb2989a4beb9ef5947abff00b35 - - -] Error on event created for subports [SubPort(port_id=c048169f-a005-44a3-88e3-03a34d778bb5,segmentation_id=843,segmentation_type='vlan',trunk_id=b56178af-8d6f-4660-ac3b-cc469c3de4ce)]: Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a6d04566: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID b16fcf72d3284439a06f8383a
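
For anyone triaging a longer agent log, here is a minimal sketch that counts the timeouts per RPC method and records the highest timeout the agent backed off to. It only relies on the wording of the two neutron_lib.rpc messages quoted above; the function name `summarize` and the `SAMPLE` list (shortened copies of the lines above, with the request context replaced by "[...]") are made up for illustration and are not part of Neutron.

```python
import re
from collections import Counter

# Patterns follow the wording of the neutron_lib.rpc messages quoted above.
TIMEOUT_RE = re.compile(
    r"Timeout in RPC method (?P<method>\w+(?:\.\w+)*)\. "
    r"Waiting for (?P<wait>\d+) seconds"
)
INCREASE_RE = re.compile(
    r"Increasing timeout for (?P<method>\w+(?:\.\w+)*) "
    r"calls to (?P<timeout>\d+) seconds"
)


def summarize(lines):
    """Return (timeout count per method, highest raised timeout per method)."""
    timeouts = Counter()
    raised = {}
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match["method"]] += 1
        match = INCREASE_RE.search(line)
        if match:
            method = match["method"]
            raised[method] = max(raised.get(method, 0), int(match["timeout"]))
    return timeouts, raised


# Shortened excerpts of the log above (hypothetical sample input).
SAMPLE = [
    "2019-10-14 12:28:56.559 20071 ERROR neutron_lib.rpc [-] Timeout in RPC method trunk.update_subport_bindings. Waiting for 114 seconds before next attempt.",
    "2019-10-14 12:28:56.560 20071 WARNING neutron_lib.rpc [-] Increasing timeout for trunk.update_subport_bindings calls to 240 seconds.",
    "2019-10-14 12:30:24.896 20071 ERROR neutron_lib.rpc [...] Timeout in RPC method trunk.update_trunk_status. Waiting for 75 seconds before next attempt.",
    "2019-10-14 12:30:24.896 20071 WARNING neutron_lib.rpc [...] Increasing timeout for trunk.update_trunk_status calls to 240 seconds.",
    "2019-10-14 12:35:26.906 20071 ERROR neutron_lib.rpc [...] Timeout in RPC method trunk.update_subport_bindings. Waiting for 53 seconds before next attempt.",
    "2019-10-14 12:35:26.907 20071 WARNING neutron_lib.rpc [...] Increasing timeout for trunk.update_subport_bindings calls to 480 seconds.",
]

if __name__ == "__main__":
    counts, highest = summarize(SAMPLE)
    for method, count in counts.items():
        print(method, "timeouts:", count, "highest timeout:", highest.get(method))
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log path depends on the installation.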

The OS/Neutron setup we have here is as follows:

- 3 Controller Nodes behind HaProxy
- Ubuntu 18.04 Installation with Ubuntu Cloud Archive Repositories (Stein) (Python 3)
- Neutron ML2 Plugin with OVS Setup
- Provider Networks
- Package Version neutron-common: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-plugin-ml2: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-server: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-openvswitch-agent: 2:14.0.2-0ubuntu1~cloud0
- Package Version neutron-dhcp-agent: 2:14.0.2-0ubuntu1~cloud0
- Package Version openvswitch-common: 2.11.0-0ubuntu2~cloud0
- Package Version openvswitch-switch: 2.11.0-0ubuntu2~cloud0

The port/trunk setup is as follows:

- trunk port belonging to project p1
- parent port belonging to project p1, subnet s1
- subnet s1 belongs to project p1, network n1
- network n1 belongs to project admin and has provider:segmentation_id = 700
- subport belonging to project p1, subnet s2
- subnet s2 belongs to project p1, network n2
- network n2 belongs to project p1, and has provider:segmentation_id = 843

Tags: ovs
do3meli (d-info-e)
summary: - trunk + subports not working after live migration
+ trunk + subports not working
Hongbin Lu (hongbin.lu) wrote :

Hi @Miguel,

Since you have worked on live migration before, could you help triage this bug?

tags: added: ovs
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
