Networking failures after NIC reordering
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Netplan | Incomplete | Undecided | Unassigned |
cloud-init | Expired | High | Unassigned |
Bug Description
We can reliably reproduce a case where a network configuration change on an Ubuntu 20.04 VM results in networkd hanging on "pending" interfaces. The interfaces are pending because the interface names from the current boot conflict with those found in /etc/netplan/
Specifically, the netplan generator applies the previous configuration's names prior to running cloud-init local. We'll see something like `systemd-
eth0: Failed to process device, ignoring: File exists`.
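The naming conflict can be illustrated with a small model. This is only a sketch, not how udev or networkd implement renames: it assumes the previous boot's netplan config maps each MAC address to a fixed interface name, and the MAC addresses and the helper `find_rename_conflicts` are hypothetical. After the NICs are swapped, each device's requested name is already held by the other device, which is consistent with the "File exists" error above.

```python
def find_rename_conflicts(current, desired):
    """Return renames that collide with an interface name already in use.

    current: kernel name -> MAC address as seen on this boot
    desired: MAC address -> name requested by the previous boot's config
    """
    conflicts = []
    for name, mac in sorted(current.items()):
        target = desired.get(mac)
        if target is None or target == name:
            continue  # no rename requested, or the name already matches
        holder = current.get(target)
        if holder is not None and holder != mac:
            # The requested name belongs to a different device right now.
            conflicts.append((name, target, holder))
    return conflicts


# Hypothetical MACs: the stale config was written before the NICs were swapped.
previous_boot = {"00:0d:3a:00:00:01": "eth0", "00:0d:3a:00:00:02": "eth1"}
current_boot = {"eth0": "00:0d:3a:00:00:02", "eth1": "00:0d:3a:00:00:01"}

conflicts = find_rename_conflicts(current_boot, previous_boot)
for name, target, holder in conflicts:
    print(f"{name}: cannot rename to {target}, name is held by {holder}")
```

In this model both devices collide, because each wants the name the other currently holds; a config whose names already match the current boot produces no conflicts.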
In one scenario, the data source is able to fetch updated network configuration, and cloud-init updates the config & udev rules just fine. However, networking stays offline ("pending") indefinitely. It can be forced to resolve by executing `sudo udevadm trigger --attr-
Example: Create a VM on Azure with two NICs, re-order them, then restart.
az vm create --name test-x1 --image Canonical:
az vm deallocate --name test-x1
az vm nics set --vm-name test-x1 --nics test-nic-02 test-nic-01
az vm start --name test-x1
Upon doing that, I am unable to log in via the serial console for 20 minutes, until cloud-init times out. In this case, Azure is trying to report ready but cannot because system networking never came up. We can remove /lib/systemd/
The behavior on 18.04 is a bit different. There, the renaming of the interfaces succeeds at early boot, which instead causes the Azure data source to fail the local phase because the fallback_interface is no longer the primary NIC (the secondary NIC, eth1, was renamed to eth0 to match the previous boot's config).
Changed in netplan:
status: New → Incomplete
Thanks for the thorough bug report. I have confirmed the 20.04 behavior and the root cause.