failed to generate config when interface was renamed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
cloud-init | Fix Released | High | Unassigned |
Bug Description
2022-08-03 18:42:31,598 - util.py[DEBUG]: Writing to /etc/netplan/
2022-08-03 18:42:31,598 - subp.py[DEBUG]: Running command ['netplan', 'generate'] with allowed return codes [0] (shell=False, capture=True)
2022-08-03 18:42:31,875 - subp.py[DEBUG]: Running command ['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/
2022-08-03 18:42:31,880 - subp.py[DEBUG]: Running command ['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/
2022-08-03 18:42:31,956 - subp.py[DEBUG]: Running command ['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/
2022-08-03 18:42:31,959 - util.py[WARNING]: failed stage init-local
2022-08-03 18:42:31,959 - util.py[DEBUG]: failed stage init-local
Traceback (most recent call last):
File "/usr/lib/
ret = functor(name, args)
File "/usr/lib/
init.
File "/usr/lib/
return self.distro.
File "/usr/lib/
self.
File "/usr/lib/
return super()
File "/usr/lib/
renderer.
File "/usr/lib/
self.
File "/usr/lib/
subp.subp(cmd, capture=True)
File "/usr/lib/
raise ProcessExecutio
cloudinit.
Command: ['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/
Exit code: 1
Reason: -
Stdout:
Stderr: Load module index
Parsed configuration file /usr/lib/
Parsed configuration file /usr/lib/
Parsed configuration file /run/systemd/
Parsed configuration file /run/systemd/
Parsed configuration file /run/systemd/
Parsed configuration file /run/systemd/
Created link configuration context.
Failed to open device '/sys/class/
Unload module index
Unloaded link configuration context.
Thanks @ChrisPatterson for continuing to help us out here on big systems.
Looks like a case where the network rename by the kernel is colliding with cloud-init.
I'm thinking the failure symptom is the following:
- cloud-init calls get_devicelist and starts looping through the devices found [1]
- the kernel renames some NIC and sysfs gets updated
- cloud-init is unable to finish the loop of calls to 'udevadm', 'test-builtin', 'net_setup_link', <PREVIOUS/STALE_DEVICE_NAME>
We need to better handle this potential race condition in cloud-init and vet whether a rename happened out from under us, or block the renames in the kernel temporarily if we can.
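One way to vet whether a rename happened mid-loop could look roughly like the sketch below. It is not the fix that landed in cloud-init. The function name, the `sysfs_dir` parameter, and the injectable `run` and `exists` callables are all hypothetical, and `subprocess.CalledProcessError` stands in for cloud-init's own `subp` error type. The idea is to re-check the sysfs path after a failed call and treat a path that has disappeared as a stale name, not a fatal error.

```python
import os
import subprocess
from typing import Callable, Iterable, List


def _default_run(cmd: List[str]) -> None:
    # Stand-in for cloud-init's subp.subp(cmd, capture=True).
    subprocess.run(cmd, check=True, capture_output=True)


def setup_links_tolerating_renames(
    devices: Iterable[str],
    sysfs_dir: str,
    run: Callable[[List[str]], None] = _default_run,
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Run net_setup_link for each device, skipping names that went stale.

    sysfs_dir is the directory holding one entry per interface (assumption:
    the caller supplies it). Returns the device names that were skipped
    because their sysfs entry was gone after the command failed.
    """
    skipped: List[str] = []
    for name in devices:
        path = os.path.join(sysfs_dir, name)
        cmd = ["udevadm", "test-builtin", "net_setup_link", path]
        try:
            run(cmd)
        except subprocess.CalledProcessError:
            if exists(path):
                # The device is still present, so this is a genuine failure.
                raise
            # The path vanished: most likely renamed out from under us.
            skipped.append(name)
    return skipped
```

A caller could log the skipped names and re-read the device list afterwards to pick up the new names. This only narrows the race window; it does not block renames in the kernel.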
References:
[1] https://github.com/canonical/cloud-init/blob/main/cloudinit/net/netplan.py#L279-L284