Neutron bridge_mapping not updated when unlock issued while application apply is in progress

Bug #1822396 reported by Ghada Khalil
This bug affects 1 person
Affects    Status        Importance  Assigned to   Milestone
StarlingX  Fix Released  Medium      Frank Miller

Bug Description

Brief Description
-----------------
When a data interface is deleted, the neutron bridge_mapping is not updated. As a result, the ovs-agent fails to start after the interface is changed to a pci-passthrough interface.

Note: This issue was reported by Brent Rowsell

Severity
--------
Medium

Steps to Reproduce
------------------
On compute-3, I deleted the data1 interface (ens785f0) and modified it to be a pci-passthrough interface. After unlocking the worker, the ovs-agent failed to start.

 2019-03-29 11:09:24,283.283 6010 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9f2f051e-e70f-4c79-bb6a-a68e5ed633fd - - - - -] Bridge br-phy1 for physical network data1 does not exist. Agent terminated!

Looking at the generated overrides, the bridge mappings were not updated:

ovs:
  bridge_mappings: data0:br-phy0,data1:br-phy1,
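
A minimal sketch of how the stale entry above could be detected, assuming the set of bridges actually present on the host is known (for example from `ovs-vsctl list-br`). The helper names `parse_bridge_mappings` and `stale_mappings` are hypothetical and are not part of neutron or sysinv.

```python
def parse_bridge_mappings(value):
    """Parse 'physnet:bridge,...' into a dict, ignoring empty entries."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma, as in the override above
        physnet, bridge = entry.split(":", 1)
        mappings[physnet.strip()] = bridge.strip()
    return mappings


def stale_mappings(value, existing_bridges):
    """Return the mappings whose bridge is not present on the host."""
    return {
        physnet: bridge
        for physnet, bridge in parse_bridge_mappings(value).items()
        if bridge not in existing_bridges
    }


# Assumption for illustration: only br-phy0 still exists on compute-3.
overrides = "data0:br-phy0,data1:br-phy1,"
print(stale_mappings(overrides, {"br-phy0"}))
```

With these inputs the only mapping reported is data1 -> br-phy1, which matches the physical network named in the agent error.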

Note: I performed the same action earlier on compute-2 and it worked fine.

Expected Behavior
------------------
No issues with deleting and modifying a data interface

Actual Behavior
----------------
ovs-agent failed to start after the interface was modified

Reproducibility
---------------
Intermittent. This was seen when running the operation on one compute node, but not the other.

System Configuration
--------------------
Standard

Branch/Pull Time/Commit
-----------------------
master 2019-03-11_20-18-00

Last Pass
--------------
Unknown

Timestamp/Logs
--------------
 2019-03-29 11:09:24,283.283 6010 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-9f2f051e-e70f-4c79-bb6a-a68e5ed633fd - - - - -] Bridge br-phy1 for physical network data1 does not exist. Agent terminated!

Ghada Khalil (gkhalil) wrote :

Marking as release gating; issue prevents changing an interface from data to pci-pt

Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Joseph Richard (josephrichard)
tags: added: stx.2019.05 stx.containers stx.networking
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
description: updated
Ghada Khalil (gkhalil)
tags: removed: stx.containers
Joseph Richard (josephrichard) wrote :

I just reproduced this in ironpass-7-12. Previous attempts to reproduce it in wildcat_71_75 were unsuccessful.

It appears that the unlock after changing the ifclass is not triggering new override generation. I suspect that my previous attempts passed because something else triggered override generation. Once I manually do an application-apply, the correct overrides are generated.

Joseph Richard (josephrichard) wrote :

During the unlock, the stx-openstack application is reapplied only if it is currently in the applied state. If it is currently applying, the reapply is skipped, so the required helm overrides are not generated. See _reapply_system_app in sysinv.

That's likely why it passed for your compute-2, but then failed for your compute-3.

This is not limited to bridge mappings. I expect it would affect any change to per-host options (e.g. nova memory config) made on multiple compute nodes in sequence, where compute-0 is locked -> updated -> unlocked, and then compute-1 is locked -> updated -> unlocked while the app is still reapplying.
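
The sequence described above can be modelled with a small sketch. This is not the sysinv code: `App`, `reapply_on_unlock`, and `finish_apply` are hypothetical stand-ins, and it assumes that overrides are a snapshot of per-host configuration taken when a reapply starts.

```python
class App:
    """Toy model of an application with a status and generated overrides."""

    def __init__(self, host_config):
        self.status = "applied"
        self.overrides = dict(host_config)  # snapshot of per-host config


def reapply_on_unlock(app, host_config):
    """Reapply only when the app is in the applied state (assumed gating)."""
    if app.status != "applied":
        return False  # skipped: no new overrides are generated
    app.status = "applying"
    app.overrides = dict(host_config)  # regenerate from current config
    return True


def finish_apply(app):
    app.status = "applied"


host_config = {"compute-0": "data0,data1", "compute-1": "data0,data1"}
app = App(host_config)

host_config["compute-0"] = "data0"            # compute-0 updated
first = reapply_on_unlock(app, host_config)   # unlock triggers a reapply

host_config["compute-1"] = "data0"            # compute-1 updated mid-apply
second = reapply_on_unlock(app, host_config)  # unlock is skipped

finish_apply(app)
print(first, second, app.overrides["compute-1"])
```

In this model the second unlock returns False, and the overrides for compute-1 still list data1 after the first apply finishes. A later manual apply regenerates them, which is consistent with the workaround noted earlier in this report.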

Ghada Khalil (gkhalil) wrote :

Adding the stx.containers label as this is a generic application apply issue; it's not specific to networking

summary: - Neutron bridge_mapping not updated after deleting a data interface
+ Neutron bridge_mapping not updated when unlock issued while application
+ apply is in progress
tags: added: stx.containers
removed: stx.networking
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Joseph Richard (josephrichard) → Frank Miller (sensfan22)
Frank Miller (sensfan22) wrote :

This issue will be addressed by the changes being made for https://bugs.launchpad.net/starlingx/+bug/1837750. Marking this as a duplicate.

Ghada Khalil (gkhalil) wrote :

Duplicate bug has been addressed in master & the r/stx.2.0 branch.

Changed in starlingx:
status: Triaged → Fix Released