node failed to deploy because an ephemeral network device was not found
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | Undecided | Unassigned |
cloud-init | Expired | Undecided | Unassigned |
Bug Description
Hi,
Using MAAS snap 2.8.6-8602-
I just had a node fail to deploy because a network device that was present during commissioning wasn't present anymore, making cloud-init sad. To be precise, the node deployed properly, rebooted, and during the post-deploy boot, cloud-init got sad with:
RuntimeError: Not all expected physical devices present: {'be:65:
(full stacktrace at https:/
I was indeed able to find the network device with MAC address 'be:65:
I deleted this ephemeral device from the node in MAAS, and was then able to deploy it properly.
These ephemeral NICs appear to have random MAC addresses. I was logged on the HTML5 console during the boot logged above, and you can see there's a device named "enx5a099ca01d4b" with MAC address "5a:09:9c:a0:1d:4b" (which doesn't match a known OUI).
This is actually a cdc_ether device :
$ dmesg|grep cdc_ether
[ 29.867170] cdc_ether 1-1.3:2.0 usb0: register 'cdc_ether' at usb-0000:
[ 29.867296] usbcore: registered new interface driver cdc_ether
[ 29.958137] cdc_ether 1-1.3:2.0 enx5a099ca01d4b: renamed from usb0
[ 205.908811] cdc_ether 1-1.3:2.0 enx5a099ca01d4b: unregister 'cdc_ether' usb-0000:
(the last entry, at 205.9 s, is very probably from when I logged off the HTML5 console, which removes the device).
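A side note on the "random" MAC address: the second-least-significant bit of the first octet marks a MAC as locally administered, meaning it was not assigned from a vendor OUI. The address 5a:09:9c:a0:1d:4b has that bit set, which fits with it not matching a known OUI. Below is a minimal sketch of that check. The function name `is_locally_administered` is made up for illustration and is not part of MAAS or cloud-init.

```python
def is_locally_administered(mac):
    """Return True if the MAC has the locally-administered (U/L) bit set.

    The U/L bit is bit 1 (value 0x02) of the first octet. Addresses with
    this bit set are not assigned from a vendor OUI.
    """
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


# MAC of the enx5a099ca01d4b device from the console boot above.
print(is_locally_administered("5a:09:9c:a0:1d:4b"))  # True
```

This is only a hint, not proof that a device is ephemeral: bonds, bridges and many virtual NICs also use locally administered addresses.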
So I think :
- MAAS should ignore these devices by default
- cloud-init shouldn't die when a cdc_ether device is missing.
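To illustrate what "ignore these devices" could look like, here is a rough sketch that looks up the kernel driver bound to an interface via sysfs and flags it when the driver is cdc_ether. It assumes the usual Linux layout, where `/sys/class/net/<iface>/device/driver` is a symlink to the driver directory. The helper names and the `transient_drivers` tuple are hypothetical, and this is not how MAAS or cloud-init currently decide anything.

```python
import os


def interface_driver(iface, sysfs_root="/sys/class/net"):
    """Return the kernel driver name bound to iface, or None if unknown."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    try:
        # The symlink target ends with the driver name, e.g. ".../cdc_ether".
        return os.path.basename(os.readlink(link))
    except OSError:
        # Interface is gone, or it is virtual and has no backing device.
        return None


def is_ephemeral_nic(iface, sysfs_root="/sys/class/net",
                     transient_drivers=("cdc_ether",)):
    """Hypothetical policy: treat NICs bound to these drivers as transient."""
    return interface_driver(iface, sysfs_root) in transient_drivers
```

On the affected node, `is_ephemeral_nic("enx5a099ca01d4b")` would be expected to return True while the HTML5 console is attached. A check like this would have to run at commissioning time, while the device still exists.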
Thanks
Changed in cloud-init: status: New → Opinion
Without all of the cloud-init logs, it's hard to know what exactly happened here. cloud-init couldn't find a datasource, which normally wouldn't be related to a missing network device.
Is this issue reproducible? If so, running 'cloud-init collect-logs -u' on an affected instance might help us debug this issue.