revert resize: vif-plugged external event sent too soon if Neutron is using OVS hybrid plug
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Medium | Artom Lifshitz | |
Rocky | Fix Committed | Medium | Artom Lifshitz | |
Stein | Fix Committed | Medium | Artom Lifshitz | |
Bug Description
Description
===========
This only happens when Neutron is using OVS with hybrid plugging.
When reverting a resized instance back to its original source host, Nova times out waiting for the vif-plugged external event and never finishes the revert. Neutron sends the event as soon as Nova updates the port binding to point back to the original source host, which happens before the virt driver starts listening for external events. The event does arrive, but too early to be received, so Nova times out.
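The ordering problem can be illustrated with a small toy model. This is only a sketch, not Nova or Neutron code: the `EventWaiter` class and the `revert_buggy` / `revert_fixed` functions are hypothetical names invented for illustration, and the model assumes that an event delivered before anyone has registered for it is simply dropped. `revert_fixed` shows one ordering that avoids the race in this model; it is not a description of the actual fix.

```python
import threading


class EventWaiter:
    """Toy registry of expected external events (hypothetical, not Nova's API)."""

    def __init__(self):
        self._expected = {}
        self._lock = threading.Lock()

    def prepare(self, name):
        # Register interest in an event before it can be received.
        with self._lock:
            self._expected[name] = threading.Event()

    def deliver(self, name):
        # Assumption: an event with no registered waiter is dropped.
        with self._lock:
            event = self._expected.get(name)
        if event is None:
            return False
        event.set()
        return True

    def wait(self, name, timeout):
        # Returns True if the event was received before the timeout.
        with self._lock:
            event = self._expected[name]
        return event.wait(timeout)


EVENT = "network-vif-plugged"


def revert_buggy(waiter):
    # The port binding update triggers the event before the listener exists.
    delivered = waiter.deliver(EVENT)
    waiter.prepare(EVENT)
    return delivered, waiter.wait(EVENT, timeout=0.05)


def revert_fixed(waiter):
    # Start listening first, then trigger the event.
    waiter.prepare(EVENT)
    delivered = waiter.deliver(EVENT)
    return delivered, waiter.wait(EVENT, timeout=0.05)
```

In this model, `revert_buggy` drops the event and then times out, while `revert_fixed` receives it immediately.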
Steps to reproduce
==================
1. Resize an instance
2. When it's in VERIFY_RESIZE, revert it
Expected result
===============
Instance reverts correctly.
Actual result
=============
Instance goes to ERROR.
Environment
===========
OVS with hybrid plug. Reported in OSP14/Rocky [1], reproduced on master [2] [3].
[1] https:/
[2] https:/
[3] https:/
Changed in nova: | |
assignee: | nobody → Artom Lifshitz (notartom) |
status: | New → In Progress |
Changed in nova: | |
assignee: | Artom Lifshitz (notartom) → sean mooney (sean-k-mooney) |
Changed in nova: | |
assignee: | sean mooney (sean-k-mooney) → Artom Lifshitz (notartom) |
Triaging as medium, as this race condition can technically be worked around by
either using the default OVS conntrack security group driver instead of the legacy iptables firewall, or
by disabling waiting for the vif-plugged event.
Otherwise I would set this as high.