Comment 31 for bug 1701023

Tom Verdaat (tom-verdaat) wrote :

Hi @ddstreet. I completely understand your need to limit the scope of this bug. I was just sharing our findings, so feel free to ignore item 2 in #28. We tested the latest version of your PPA package extensively over the weekend, and here are our main findings:

1) We migrated from separate files in /etc/network/interfaces.d to declaring everything in the single /etc/network/interfaces file. This avoids many of the problems with bringing interfaces up in the proper order, and "ifup -a" now works perfectly again. Some lessons learned for future reference: (a) for bonds to come up correctly, you have to define the slaves before the bond master, and the primary slave before the secondary slaves, in the configuration file; and (b) for a vlan to come up correctly, define its raw device before the vlan device.
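
To illustrate the ordering from (a) and (b), here is a minimal sketch of a single interfaces file. The interface names (eth0, eth1, bond0), the vlan ID, the bond mode and the address are made up for the example and are not our actual configuration:

```
# Primary slave first
auto eth0
iface eth0 inet manual
    bond-master bond0
    bond-primary eth0

# Secondary slave after the primary
auto eth1
iface eth1 inet manual
    bond-master bond0

# Bond master after all of its slaves
auto bond0
iface bond0 inet manual
    bond-slaves none
    bond-mode active-backup
    bond-miimon 100

# vlan device after its raw device (bond0 in this example)
auto bond0.100
iface bond0.100 inet static
    vlan-raw-device bond0
    address 192.0.2.10
    netmask 255.255.255.0
```

The only point of the sketch is the order of the stanzas; the bonding options themselves will differ per setup.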

2) Even though "ifup -a" now works again, bonds still do not come up correctly at boot. I am fairly sure this is because the raw interfaces are detected by the kernel and brought up by systemd in a different order at boot. As noted under (1), the order really matters. Bringing up a secondary slave before the primary slave seems to break the bond (apparently because the bond ends up with the wrong MAC address), and this looks like what sometimes happens at boot. Our workaround from #28 under (3) mitigates this, but it is not very elegant.

3) There is a problem when running a bond on top of vlans. Running ifup in verbose mode shows run-parts executing the hook scripts in (what seems like) alphabetical order, but to enslave a vlan interface, /etc/network/if-pre-up.d/vlan has to run before /etc/network/if-pre-up.d/ifenslave. For now we added "pre-up export IFACE=<name> IF_VLAN_RAW_DEVICE=<raw device name>; /etc/network/if-pre-up.d/vlan" to all vlan slaves as a workaround, but it would be better to fix this in the ifupdown package itself.
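
For reference, this is roughly what the workaround looks like for one vlan slave. The names eth0, eth0.100 and bond0 are placeholders for the example; only the pre-up line is the actual workaround, with <name> and <raw device name> filled in:

```
# Raw device before the vlan device
auto eth0
iface eth0 inet manual

# vlan slave: call the vlan hook by hand so the vlan device
# already exists by the time the ifenslave hook runs
auto eth0.100
iface eth0.100 inet manual
    vlan-raw-device eth0
    pre-up export IFACE=eth0.100 IF_VLAN_RAW_DEVICE=eth0; /etc/network/if-pre-up.d/vlan
    bond-master bond0
```

This relies on the pre-up line in the stanza being executed before run-parts goes through /etc/network/if-pre-up.d, which is what we see in the verbose output.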