Comment 7 for bug 1808171

sean mooney (sean-k-mooney) wrote :

you know what, looking at this again,
and assuming what normal is saying about libvirt is correct
(that libvirt requests the port very shortly after os-vif does):

the issue in the neutron agent could be similar to this
https://bugs.launchpad.net/neutron/+bug/1807239
we might have a similar race where
1.) os-vif creates the ovs port
2.a) neutron detects it and starts to wire it up
2.b) libvirt in parallel recreates the port with "'--if-exists', 'del-port', dev, '--',
            'add-port', bridge, dev, ..."
3.) neutron sees the port be removed and re-added

while libvirt is replugging the interface, the neutron agent could observe the vswitch state between
the del-port and add-port, leading to
http://logs.openstack.org/32/602432/3/check/tempest-full/4b76af0/controller/logs/screen-q-agt.txt.gz#_Jan_16_21_35_54_815358 where it says the port is missing even though
before and after that command runs the port would be present on the bridge.
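
a toy sketch of that window, in plain python with no ovs involved: the bridge is modelled as a set of port names, and the hypothetical `observe_during` helper records whether the port is visible after every step, the way an agent reading the vswitch state at the worst moment might. all names here are made up for illustration and none of this is neutron or os-vif code.

```python
def del_port(bridge_ports, dev):
    """Toy stand-in for '--if-exists del-port': remove dev if present."""
    bridge_ports.discard(dev)


def add_port(bridge_ports, dev):
    """Toy stand-in for 'add-port': put dev on the bridge."""
    bridge_ports.add(dev)


def observe_during(bridge_ports, steps, dev):
    """Run each step in order and record whether dev is visible after it.

    This models an observer that can read the bridge state between
    two sub commands when they are not applied as one transaction.
    """
    seen = [dev in bridge_ports]  # state before the command runs
    for step in steps:
        step(bridge_ports, dev)
        seen.append(dev in bridge_ports)
    return seen


ports = {"tap0"}  # os-vif has already created the port
snapshots = observe_during(ports, [del_port, add_port], "tap0")
print(snapshots)  # [True, False, True]
```

the middle `False` is the state the agent would be reporting as "port missing", even though the port is there both before and after.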

executing a single ovs-vsctl command with multiple sub commands like
"'--if-exists', 'del-port', dev, '--',
            'add-port', bridge, dev, ..."
used to be atomic in ovs, but at some point that changed and each command is now executed individually
outside of a transaction, meaning that the port will actually be removed and re-added. that is why we changed to --may-exist add-port in os-vif instead of the --if-exists del-port that libvirt is apparently using.
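
for comparison, here are the two invocation styles written out as argument lists. this is only a sketch: the helper names, `br-int` and `tap0` are placeholders i picked, and the real os-vif and libvirt commands carry extra options (the "..." in the quoted command) that are left out here.

```python
def replug_args(bridge, dev):
    """del-port followed by add-port in a single ovs-vsctl call."""
    return ["ovs-vsctl", "--", "--if-exists", "del-port", dev,
            "--", "add-port", bridge, dev]


def may_exist_add_args(bridge, dev):
    """add-port with --may-exist, so an existing port is left alone."""
    return ["ovs-vsctl", "--", "--may-exist", "add-port", bridge, dev]


# Placeholder bridge and device names, for illustration only.
print(" ".join(replug_args("br-int", "tap0")))
print(" ".join(may_exist_add_args("br-int", "tap0")))
```

the second form never deletes the port, so there is no point at which an observer could see it missing.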

if this is the root cause i would have expected https://review.openstack.org/#/c/602432/ to solve it, as libvirt will no longer interact with ovs, but since i am seeing this bug with that patch applied as well, i think there is more to this than a potential race.