Comment 2 for bug 1901707

Tobias Urdin (tobias-urdin) wrote :

I don't have a way to test this with any version other than Train right now. This was not an issue on CentOS 7 with Train, but it started happening when we moved to CentOS 8 with Train.

What I understand from Sean's input is that the behavior has changed in Neutron. Before, Neutron would allow two ports to be active, so the new port on the compute node would already be ready, but with the multiple bindings feature that is no longer the case.

The issue is the plugging in openvswitch, that is, the port managed by Neutron's openvswitch-agent.

IMO there should be an event sent to Nova when the port is fully ready, so that Nova could do the live migration after that. But given that the behavior has changed in Neutron, maybe it's no longer possible or allowed to have two ports configured and active.
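
To illustrate the idea of gating the migration on a "port ready" event, here is a minimal sketch of the waiting pattern. Everything in it is hypothetical (the `PortReadyWaiter` class, its method names, the `migrate_when_ready` helper, and the timeout); it is not Nova's or Neutron's actual code, only a picture of the ordering I have in mind.

```python
import threading


class PortReadyWaiter:
    """Hypothetical helper that tracks one 'ready' event per port ID."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}

    def _event_for(self, port_id):
        # Create the event on first use so notify and wait can race safely.
        with self._lock:
            return self._events.setdefault(port_id, threading.Event())

    def notify_ready(self, port_id):
        # Would be called when the networking side reports the port as ready.
        self._event_for(port_id).set()

    def wait_ready(self, port_id, timeout):
        # Returns True if the port became ready, False if the timeout expired.
        return self._event_for(port_id).wait(timeout)


def migrate_when_ready(waiter, port_id, do_migrate, timeout=30.0):
    """Run do_migrate() only after the port is reported ready.

    Returns False without migrating if the event does not arrive in time.
    """
    if not waiter.wait_ready(port_id, timeout):
        return False
    do_migrate()
    return True
```

With this ordering the migration callback never runs before the ready notification, and a missing notification shows up as a timeout rather than as lost traffic after the switch-over.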

I can reproduce this 100% of the time with the versions mentioned. The other bug is primarily about a different issue, which occurs when the openvswitch firewall driver is used. Here iptables_hybrid is used, but the firewall driver doesn't seem to be the cause of the issue either way.

I don't have a good way to go about it: if Sean's comment is right that this is a behavior change in Neutron that can't be worked around, there isn't much Nova can do. This pretty much defeats the whole purpose of live migration, since we need to carry a custom patch in Nova that makes the VM send out new RARP frames AFTER the live migration. The data plane is therefore dependent on the timing of the control plane running the post_live_migration action in Nova, so we are taking a hit of a second or more of extra downtime.
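
For anyone unfamiliar with what such a frame looks like, below is a minimal sketch that builds a broadcast RARP "request reverse" frame for a given MAC address, following the RFC 903 layout. This is not our Nova patch; the function names and the example MAC are made up for illustration, and the sketch only builds the bytes without sending them.

```python
import struct

ETH_P_RARP = 0x8035      # EtherType for RARP
RARP_OP_REQUEST = 3      # "request reverse" opcode (RFC 903)


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected a MAC address with 6 octets: %r" % mac)
    return bytes(int(p, 16) for p in parts)


def build_rarp_frame(mac):
    """Build a 42-byte broadcast RARP frame announcing `mac`."""
    hw = mac_to_bytes(mac)
    # Ethernet header: broadcast destination, source MAC, RARP EtherType.
    eth_header = b"\xff" * 6 + hw + struct.pack("!H", ETH_P_RARP)
    # Fixed part: htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4.
    payload = struct.pack("!HHBBH", 1, 0x0800, 6, 4, RARP_OP_REQUEST)
    # Sender and target hardware address are both the VM's MAC;
    # the protocol (IP) addresses are left as 0.0.0.0.
    payload += hw + b"\x00" * 4 + hw + b"\x00" * 4
    return eth_header + payload
```

Switches learn the MAC's new location from the source address of this broadcast, which is why it only helps once the port on the destination is actually forwarding traffic. Sending the frame on Linux would typically go through a raw AF_PACKET socket bound to the interface, which needs elevated privileges.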