Comment 10 for bug 1915282

sean mooney (sean-k-mooney) wrote :

i spoke to melanie about this in a private irc chat today.
i did actually see comments 6-9 as they came in, and melwitt was right that i unfortunately
could not find time to respond to them because of FF and the VDPA feature i was working on.
i did however make a few attempts to scope the possible impact but did not have time to comment on them.

as noted above, the only in-tree neutron ml2 driver that will allow PF + tunnels is ovn,
and we know that by design it cannot work correctly, so we should block that in the ovn ml2 driver as part of addressing this bug.

i also used https://codesearch.opendev.org/ to quickly search for all ml2 drivers that support PFs and then tried to see if any also support vxlan. from what i recall, when i did that a few weeks ago i found none. so while it's possible that there is an ml2 driver out there that implements hierarchical port binding and supports tunnel termination at the top-of-rack switch, similar to how the arista networking mech driver can do it for VFs, i have not found any that do so for PFs.

so if we block tunnels + PF i don't know of any configuration today that we would break.

to be safe we could block it by default and add a new workarounds config option with a nice and scary name, e.g. [workarounds]/enable_unsafe_pf_tunnelling, which we could deprecate immediately.
we can then keep it for a cycle or two and remove it if no operator objects.
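
to make the proposed behaviour concrete, here is a minimal sketch of the decision only. it is not nova or neutron code: the Workarounds class is a hypothetical stand-in for the proposed [workarounds]/enable_unsafe_pf_tunnelling option, and pf_binding_allowed is a name made up for illustration. the only real values assumed are the neutron vnic type "direct-physical" for PFs and the tunnel network types vxlan, gre and geneve.

```python
from dataclasses import dataclass

# neutron network types that are tunnels (assumption: these three cover
# the cases discussed in this bug).
TUNNEL_TYPES = frozenset({"vxlan", "gre", "geneve"})


@dataclass(frozen=True)
class Workarounds:
    # hypothetical stand-in for the proposed
    # [workarounds]/enable_unsafe_pf_tunnelling option; off by default.
    enable_unsafe_pf_tunnelling: bool = False


def pf_binding_allowed(vnic_type: str, network_type: str,
                       workarounds: Workarounds) -> bool:
    """Return False only for PF + tunnel when the workaround is not set."""
    if vnic_type != "direct-physical":
        # not a PF, so this check does not apply.
        return True
    if network_type not in TUNNEL_TYPES:
        # PF on a vlan or flat network is unaffected.
        return True
    # PF on a tunneled network: blocked unless the operator opted in.
    return workarounds.enable_unsafe_pf_tunnelling
```

with the default Workarounds() a PF on vxlan is rejected and a PF on vlan is still accepted, so only operators who explicitly opt in keep the old behaviour.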

while working on the VDPA work i also inadvertently tested part of this code path.
VDPA uses VFs not PFs, but i was able to whitelist the VF with physical_network: null
and was able to assert that we can indeed bind the VF ports (technically i used vdpa ports) to a tunneled network. on the nova side i also observed that the pcipassthrough filter could actually tell the difference between null and "physnet1". i didn't expressly test "null", but when i was printing some extra debug logs the null from the db was not quoted as i previously thought, so we should be able to tell them apart at all points.

as a result we can tell the difference between someone declaring the network as for use with tunnels with an unquoted null and someone who happened to choose "null" as their neutron physnet name for some other reason. we can therefore explicitly document the reserved value null as being a sentinel reserved for allocating pci devices for use with tunneled networking.
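
the null vs "null" distinction can be sketched with plain json parsing, since the whitelist entries are json. this is only an illustration under assumptions: is_tunnel_sentinel is a hypothetical helper, not a nova function, and the vendor_id/product_id values are placeholders.

```python
import json


def is_tunnel_sentinel(entry: dict) -> bool:
    """True only when physical_network is present and is JSON null.

    json.loads maps an unquoted null to None and a quoted "null" to the
    string "null", so the two are distinguishable after parsing.
    """
    return "physical_network" in entry and entry["physical_network"] is None


# placeholder ids, chosen only for illustration.
unquoted = json.loads(
    '{"vendor_id": "aaaa", "product_id": "bbbb", "physical_network": null}')
quoted = json.loads(
    '{"vendor_id": "aaaa", "product_id": "bbbb", "physical_network": "null"}')
named = json.loads(
    '{"vendor_id": "aaaa", "product_id": "bbbb", "physical_network": "physnet1"}')
```

here only the unquoted entry is treated as the tunnel sentinel. an entry with no physical_network key at all is deliberately not treated as the sentinel in this sketch, which is a design assumption rather than something tested above.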

for VF + tunnels, one check we probably should do is whether the sriov device has switchdev capability. we could then either block it or warn if it is not a switchdev-enabled vf.
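
a rough sketch of the block-or-warn choice, assuming the caller already knows whether the VF is switchdev capable. how that capability is detected is out of scope here, and check_vf_tunnel_binding, UnsafeTunnelBinding and the strict flag are all hypothetical names used for illustration.

```python
import logging

LOG = logging.getLogger(__name__)

# neutron network types treated as tunnels (assumption for this sketch).
TUNNEL_TYPES = frozenset({"vxlan", "gre", "geneve"})


class UnsafeTunnelBinding(Exception):
    """Raised when a non-switchdev VF is bound to a tunneled network."""


def check_vf_tunnel_binding(is_switchdev: bool, network_type: str,
                            strict: bool = False) -> bool:
    """Block (strict=True) or warn (strict=False) for non-switchdev VFs.

    Returns True when the binding may proceed.
    """
    if network_type not in TUNNEL_TYPES or is_switchdev:
        # non-tunnel network, or a switchdev VF: nothing to flag.
        return True
    message = ("VF without switchdev capability requested on %s network"
               % network_type)
    if strict:
        raise UnsafeTunnelBinding(message)
    LOG.warning(message)
    return True
```

strict=False gives the warn-only behaviour and strict=True gives the blocking behaviour, so either option from the paragraph above maps to one flag.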

in any case i think we have a path forward both in terms of nova and neutron action, and we can likely move this to a public bug so that we can use the normal review process to track the work across both projects.