DNS busted on system that commissioned w/ a NIC MAC of all 0's
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Triaged | Low | Unassigned |
Netplan | Triaged | Medium | Unassigned |
curtin | New | Undecided | Unassigned |
Bug Description
We had a weird issue where a new system we deployed with MAAS was unusable because it couldn't resolve DNS names. Ultimately the problem was that netplan was configured to associate a MAC of all 0's with a USB NIC, and that messes up the loopback interface (lo).
Here's a cut & paste of my diagnosis notes. I believe it to be related to, but different from, bug 1936972.
I was verifying an SRU and hit what I think is the same problem. DNS wasn’t working - but networking generally was. I finally noticed this:
ubuntu@hinyari:~$ ip addr
1: usb0: <LOOPBACK> mtu 65536 qdisc noqueue state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host usb0
valid_lft forever preferred_lft forever
2: enP6p1s0f0np0: <NO-CARRIER,
link/ether b8:3f:d2:1d:37:40 brd ff:ff:ff:ff:ff:ff
3: enP6p1s0f1np1: <BROADCAST,
link/ether b8:3f:d2:1d:37:41 brd ff:ff:ff:ff:ff:ff
inet 10.229.100.0/16 brd 10.229.255.255 scope global enP6p1s0f1np1
valid_lft forever preferred_lft forever
inet6 fe80::ba3f:
valid_lft forever preferred_lft forever
4: enx9699ad470dd1: <BROADCAST,
link/ether 96:99:ad:47:0d:d1 brd ff:ff:ff:ff:ff:ff
There is no lo. Rather, there’s a usb0 device with a MAC of all 0’s. I’m guessing this is the host Redfish interface, which often has a null MAC in hardware, and the kernel generates one randomly on boot.
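For anyone trying to confirm the same symptom, here is a minimal sketch of a check that finds the loopback device by link type rather than by name. It assumes the standard sysfs layout, where `/sys/class/net/<dev>/type` holds the ARPHRD value and 772 means loopback. The function name and the `sysfs_net` parameter are mine, for illustration only.

```python
from pathlib import Path

ARPHRD_LOOPBACK = 772  # link type the kernel reports for loopback devices


def find_loopback_devices(sysfs_net="/sys/class/net"):
    """Return names of devices whose link type is loopback, whatever they are called."""
    names = []
    for dev in sorted(Path(sysfs_net).iterdir()):
        try:
            link_type = int((dev / "type").read_text().strip())
        except (OSError, ValueError):
            # Not a device directory (e.g. bonding_masters) or unreadable; skip it.
            continue
        if link_type == ARPHRD_LOOPBACK:
            names.append(dev.name)
    return names


if __name__ == "__main__":
    for name in find_loopback_devices():
        note = "" if name == "lo" else "  <-- loopback device is not named 'lo'"
        print(f"{name}{note}")
```

On the affected system I would expect this to flag usb0, since `ip addr` shows it with the LOOPBACK flag and 127.0.0.1/8.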
The netplan config has this entry:
usb0:
match:
mtu: 1500
version: 2
So I’m guessing what is happening is that netplan is deciding that it should rename the device that has a MAC of all 0’s to usb0. The device called usb0 is the loopback device - the USB device is actually called enx9699ad470dd1. Presumably it later got a random MAC assigned by the kernel.
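To illustrate the guess above, here is a toy model of MAC-based matching followed by a rename. It is not netplan's implementation. The all-zero match value and the rename to the config key are my assumptions, since the config excerpt above does not show the match keys. The `Link` class and `match_by_mac` are hypothetical names. The link list uses the MACs from the `ip addr` output, with the loopback device under its usual name.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    mac: str


def match_by_mac(links, macaddress):
    """Return every link whose current MAC equals the match value (case-insensitive)."""
    wanted = macaddress.lower()
    return [link for link in links if link.mac.lower() == wanted]


# Assumed state when the rename is applied: the USB NIC already has its
# random MAC, so only the loopback device still reports all zeros.
links = [
    Link("lo", "00:00:00:00:00:00"),
    Link("enP6p1s0f0np0", "b8:3f:d2:1d:37:40"),
    Link("enP6p1s0f1np1", "b8:3f:d2:1d:37:41"),
    Link("enx9699ad470dd1", "96:99:ad:47:0d:d1"),
]

# Assumed rule: match MAC 00:00:00:00:00:00 and rename the match to "usb0".
for link in match_by_mac(links, "00:00:00:00:00:00"):
    link.name = "usb0"

print([link.name for link in links])
```

In this model the only device that matches is the loopback, so it gets renamed and the real USB NIC keeps its enx name, which is the state seen on the machine.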
This presumably breaks DNS because the system is configured to use 127.0.0.53 as the DNS server. systemd-resolved should be listening there and forwarding requests to the real DNS server. But I suspect systemd-resolved is trying to bind to lo, which doesn’t exist.
Perhaps MAAS, cloud-init or curtin or whatever creates the netplan config should be told not to do this. Perhaps cdc-ether shouldn’t expose a 0 MAC to userspace before it generates a random one. Or perhaps each of these pieces should notice all 0's MACs and handle them specially.
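As a rough sketch of the "handle them specially" option, a config generator could refuse to emit a MAC match for an all-zero address. The input format, `is_usable_match_mac` and `build_ethernets` below are hypothetical and are not how MAAS, curtin or cloud-init actually build the config. Only the netplan keys (`match`, `macaddress`, `set-name`, `mtu`) are real.

```python
import re

MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def is_usable_match_mac(mac):
    """True if `mac` is a well-formed MAC address that is not all zeros."""
    if not MAC_RE.fullmatch(mac or ""):
        return False
    return int(mac.replace(":", ""), 16) != 0


def build_ethernets(nics):
    """Build a netplan-style `ethernets` mapping from hypothetical commissioning data.

    A MAC match and set-name are added only when the recorded MAC is usable,
    so an all-zero MAC can never match (and rename) the loopback device.
    """
    ethernets = {}
    for nic in nics:
        entry = {"mtu": nic.get("mtu", 1500)}
        if is_usable_match_mac(nic.get("mac")):
            entry["match"] = {"macaddress": nic["mac"].lower()}
            entry["set-name"] = nic["name"]
        ethernets[nic["name"]] = entry
    return ethernets
```

With this filter, a NIC commissioned with 00:00:00:00:00:00 would still get an entry, but without a `match` stanza that could hit lo.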
Changed in maas:
status: New → Triaged
importance: Undecided → Low
milestone: none → 3.6.0
This is interesting. Tagging it for our backlog. Can you please specify what version of Ubuntu and Netplan are being used here?