Comment 15 for bug 1805920

Mike Pontillo (mpontillo) wrote :

Interesting developments. I agree that it doesn't seem like `bridge-nf-filter-vlan-tagged` is what we want, unless there is a special case to not filter packets tagged on VID 0. It might be worth trying this out on the bridge used to boot the pod VMs.

The frustrating thing about this bug report is the large number of layers we would need to check for correct behavior. We are trying to verify all of the following points of contact with a "VLAN 0" tagged packet at once:

 - The network infrastructure
   (it would be better for VLAN 0 tags to be stripped before reaching end nodes)
 - NIC hardware (some NICs handle VLAN filtering in hardware; I'm not sure if they could automatically discard the VLAN 0 tag before it's handed up the stack)
 - Linux NIC driver (would be programming the aforementioned hardware to hopefully do the right thing - or not)
 - Linux bridge driver (or whatever is handing the packets from the OS to the pod)
 - Hypervisor NIC driver (that is, the virtual hardware which will be booting on the VM)
 - iPXE (which would be using a minimal network stack prior to booting the virtual OS)
 - Virtual OS Linux driver

I feel like we aren't sure which of these layers might have a problem handling packets tagged on VID 0. We know iPXE has a problem for sure; the other layers would need to be tested in isolation to verify the behavior.
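
To test a single layer in isolation, it helps to have a known VID 0 (priority-tagged) frame to feed it. Below is a minimal sketch that only builds the raw bytes of such a frame; the function name, MAC addresses, and payload are hypothetical, and actually injecting the frame (raw socket, packet replay tool, etc.) is left out.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def build_tagged_frame(dst: bytes, src: bytes, ethertype: int,
                       payload: bytes, vid: int = 0, pcp: int = 0) -> bytes:
    """Return an Ethernet frame (without FCS) carrying one 802.1Q tag.

    Hypothetical helper for test traffic; vid=0 yields a priority-tagged frame.
    """
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if not 0 <= vid <= 0xFFF or not 0 <= pcp <= 7:
        raise ValueError("vid must fit in 12 bits, pcp in 3 bits")
    # Tag Control Information: 3-bit priority, 1-bit DEI (left at 0), 12-bit VID
    tci = (pcp << 13) | vid
    return dst + src + struct.pack("!HHH", TPID_8021Q, tci, ethertype) + payload


# Example: broadcast IPv4 frame tagged on VID 0 with a placeholder payload
frame = build_tagged_frame(b"\xff" * 6, bytes.fromhex("525400123456"),
                           0x0800, b"\x00" * 46)
```

Capturing on each side of a given layer and checking whether the four tag bytes at offset 12 survive would show where the tag is stripped, passed through, or causes a drop.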

Andres confirmed that RHEL/CentOS are patching iPXE to treat VLAN 0 frames as untagged (his link showed that the patch was on CentOS 7). But there are many layers here that could be handling them improperly. That is, even if we know that CentOS 6 works, we can't rule out that a failure on CentOS 7 is due to an issue elsewhere in the stack (not in iPXE).
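
For reference, "treat VLAN 0 frames as untagged" amounts to something like the following. This is only an illustrative sketch in Python, not the actual RHEL/CentOS iPXE patch (which I have not reproduced here); the function name is hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def strip_priority_tag(frame: bytes) -> bytes:
    """Return the frame with its 802.1Q tag removed if, and only if, VID == 0.

    Frames that are untagged, too short, or tagged with a non-zero VID
    are returned unchanged.
    """
    if len(frame) < 18:  # 12 bytes of MACs + 4-byte tag + 2-byte EtherType
        return frame
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q or (tci & 0x0FFF) != 0:
        return frame
    # Drop the 4 tag bytes; the inner EtherType moves up to offset 12
    return frame[:12] + frame[16:]
```

A layer that does the equivalent of this before handing the frame up the stack would hide the problem from everything above it, which is part of why each layer needs to be checked separately.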