Comment 43 for bug 1805920

Vern Hart (vern) wrote :

The hardware seems to be an important component of the failure scenario. The customer equipment is a Cisco UCS chassis, and the MAAS nodes are blades in that chassis. Even though we cannot find anything in the configuration that specifically adds the vlan-0 tag (or priority tag), traffic between the blades leaves one node untagged and arrives tagged on the other node.
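
For anyone who wants to check captured frames for this, here is a minimal sketch in Python of what "priority tagged" means on the wire: an 802.1Q header (TPID 0x8100) whose VLAN ID field is 0. The function names are made up for illustration and are not part of MAAS or any tool mentioned in this bug; the sketch assumes you already have the raw Ethernet frame bytes (for example from a pcap) and it does not handle stacked (QinQ) tags.

```python
import struct

ETHERTYPE_8021Q = 0x8100  # TPID that marks an 802.1Q tag


def parse_vlan_tag(frame: bytes):
    """Return (pcp, dei, vid) if the frame carries an 802.1Q tag, else None.

    Hypothetical helper for illustration only. `frame` is a raw Ethernet
    frame starting at the destination MAC address.
    """
    # Bytes 0-5 are the destination MAC, 6-11 the source MAC, and 12-13
    # hold either the EtherType or the 802.1Q TPID.
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != ETHERTYPE_8021Q:
        return None
    pcp = tci >> 13          # 3-bit priority code point
    dei = (tci >> 12) & 0x1  # 1-bit drop eligible indicator
    vid = tci & 0x0FFF       # 12-bit VLAN ID
    return pcp, dei, vid


def is_priority_tagged(frame: bytes) -> bool:
    """True if the frame has an 802.1Q tag with VLAN ID 0."""
    tag = parse_vlan_tag(frame)
    return tag is not None and tag[2] == 0
```

Running this over the same traffic captured on the sending blade and on the receiving blade should show whether the tag is present on one side only, which is the symptom described above.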

Some bugs/discussions around vlan-0 and UCS:

  https://quickview.cloudapps.cisco.com/quickview/bug/CSCuu29425
  https://quickview.cloudapps.cisco.com/quickview/bug/CSCuz83183
  https://bugs.launchpad.net/opencontrail/+bug/1457805
  https://arstechnica.com/civis/viewtopic.php?f=10&t=1442797
  https://lists.linuxfoundation.org/pipermail/fds-dev/2017-May/000710.html
  http://lists.openstack.org/pipermail/openstack-operators/2013-April/002777.html
  https://linux.oracle.com/pls/apex/f?p=102:2:::NO::P2_VC_ID,P2_VERSION:606,1.0

As a note, Cisco seems to suggest it's a bug in Linux, citing these two old posts:

  https://lists.openwall.net/netdev/2013/09/10/30
  https://lists.linuxfoundation.org/pipermail/bridge/2015-July/009630.html

But I'm not convinced they are valid, since this vlan-0 tag problem only shows up with this specific Cisco hardware. It seems that multiple network-related software projects (like ipxe, vpp, probably others) are forced to deal with the special case of vlan 0 (priority tagging) being added by Cisco UCS switches because Cisco's stance is that they're not adding the tags.