Comment 14 for bug 1701023

Dan Streetman (ddstreet) wrote :

The if-pre-up.d/vlan script was changed to call ifup for its raw-device because of bug 1573272, a race condition between the VLAN and its raw-device. If the VLAN interface was processed through if-pre-up.d/vlan before the raw-device was fully configured, then in some situations (such as when the raw-device is a bond device) the raw-device is brought down during its ifup processing, usually in order to set device params that can't be set while the interface is up. When the raw-device is taken down, the VLAN also goes down, and any routing or other associated configuration, such as a default gateway, is lost. When the raw-device comes back up, the VLAN comes up as well, but the (e.g.) default gateway is not restored, since the VLAN interface does not go through ifup again.

A race between a raw-device and its vlan(s) is possible because of the /lib/udev/vlan-network-interface script. That script is run for every interface that udev detects, and it checks each 'auto' device that ifquery lists as having configuration. Any device whose 'vlan-raw-device' matches the physical interface that udev is currently processing triggers the script to call /etc/network/if-pre-up.d/vlan to create the vlan interface. After udev is finished processing the vlan's raw-device, it passes it to systemd (or upstart), which then calls ifup for the interface.
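
To make the matching step concrete, here is a minimal sketch of it. This is not the actual /lib/udev/vlan-network-interface code: the function name vlans_for_raw_device and the stdin format (one "<iface> <vlan-raw-device>" pair per line) are assumptions made for illustration, standing in for what the real script gathers from ifquery.

```shell
#!/bin/sh
# Hypothetical sketch; NOT the real /lib/udev/vlan-network-interface script.
# stdin: one "<iface> <vlan-raw-device>" pair per line (assumed format,
# standing in for the configuration the real script reads via ifquery).
# $1:    the physical interface udev is currently processing.
# Prints each configured vlan whose vlan-raw-device matches $1.
vlans_for_raw_device() {
    raw="$1"
    while read -r iface rawdev; do
        if [ "$rawdev" = "$raw" ]; then
            # The real script would call /etc/network/if-pre-up.d/vlan here.
            echo "$iface"
        fi
    done
}
```

For example, feeding it the pairs "bond0.100 bond0", "eth1.5 eth1" and "bond0.200 bond0" with bond0 as the argument selects bond0.100 and bond0.200, which are the vlans that would then be created while bond0 itself is still on its way to ifup.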

The script's creation of the vlan interface is also detected by udev, in parallel; udev likewise passes it to systemd/upstart, and ifup is called for it.

Since the raw-device and the vlan(s) are ifup'ed in parallel, they race with each other. In most situations, since the raw-device has a head start in coming up, it will complete before any of the vlans, but if the raw-device's configuration steps delay it, then the vlan(s) may finish their ifup before it, possibly causing the above-mentioned problem.

The addition of the 'ifup' call to the if-pre-up.d/vlan script 'fixed' this by forcing an ifup of the vlan's raw-device before the creation of the vlan interface. That works when if-pre-up.d/vlan is called from the /lib/udev/vlan-network-interface script, in the udev processing thread; however, when the if-pre-up.d/vlan script is called from elsewhere, such as during a different device's ifup, it can hang ifupdown, as shown in this bug.
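
As a rough illustration of the ordering that change introduced, the sketch below prints the two steps instead of running them. It is not the real if-pre-up.d/vlan script: the function name plan_vlan_pre_up is hypothetical, the vlan id is assumed to follow the "<raw>.<id>" naming convention, and the exact commands in the shipped script may differ.

```shell
#!/bin/sh
# Hypothetical sketch of the ordering only; NOT the real if-pre-up.d/vlan.
# Commands are echoed rather than executed, so this is safe to run anywhere.
# $1: vlan interface name   (ifupdown exports this to hooks as $IFACE)
# $2: the vlan's raw-device (ifupdown exports this as $IF_VLAN_RAW_DEVICE)
plan_vlan_pre_up() {
    iface="$1"
    raw="$2"
    vid="${iface##*.}"   # assumption: name has the form "<raw>.<id>"
    # Step 1: force the raw-device fully up first. This nested ifup is the
    # call that can hang when another ifup is already in progress.
    echo "ifup $raw"
    # Step 2: only then create the vlan interface on top of it.
    echo "ip link add link $raw name $iface type vlan id $vid"
}
```

Calling plan_vlan_pre_up bond0.100 bond0 prints the ifup of bond0 first and the vlan creation second. From the udev path the first step simply waits for the raw-device; from inside another device's ifup, that same step is where ifupdown can block.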