subiquity generates netplan that has PCI ordering sensitivity (bad move)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
subiquity (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Using Ubuntu Server, with details reported by inxi as:
$ inxi -Fz
System:
Kernel: 6.8.0-59-generic arch: x86_64 bits: 64
Console: pty pts/0 Distro: Ubuntu 24.04.1 LTS (Noble Numbat)
Machine:
Type: Desktop System: Dell product: Vostro 420 Series v: N/A serial: <superuser required>
Mobo: Dell model: 0N185P v: A02 serial: <superuser required> BIOS: Dell v: 1.1.4
date: 04/17/2009
CPU:
Info: quad core model: Intel Core2 Quad Q8200 bits: 64 type: MCP cache: L2: 4 MiB
Speed (MHz): avg: 2078 min/max: 1995/2328 cores: 1: 2328 2: 1995 3: 1995 4: 1995
Graphics:
Device-1: Intel 4 Series Integrated Graphics driver: i915 v: kernel
Display: server: No display server data found. Headless machine? tty: 225x106
resolution: 1440x900
API: EGL v: 1.5 drivers: swrast platforms: surfaceless,device
API: OpenGL v: 4.5 vendor: mesa v: 24.2.8-
renderer: llvmpipe (LLVM 19.1.1 128 bits)
Audio:
Message: No device data found.
Network:
Device-1: Realtek RTL8111/
IF: enp3s0 state: up speed: 100 Mbps duplex: full mac: <filter>
RAID:
Hardware-1: Intel SATA Controller [RAID mode] driver: ahci
Device-1: md126 type: mdraid level: mirror status: active size: 1.82 TiB report: 2/2 UU
Components: Online: 0: sdb 1: sdc
Device-2: md127 type: mdraid level: N/A status: inactive size: N/A report: N/A
Components: Online: N/A Spare: 0: sdb 1: sdc
Drives:
Local Storage: total: raw: 4.09 TiB usable: -1465131160 used: 543.71 GiB
ID-1: /dev/sda vendor: Toshiba model: DT01ACA050 size: 465.76 GiB
ID-2: /dev/sdb vendor: Seagate model: ST2000VN004-2E4164 size: 1.82 TiB
ID-3: /dev/sdc vendor: Seagate model: ST2000VN004-2E4164 size: 1.82 TiB
Partition:
ID-1: / size: 465.76 GiB used: 52.13 GiB (11.2%) fs: btrfs dev: /dev/sda2
Swap:
Alert: No swap data was found.
Sensors:
System Temperatures: cpu: 43.0 C mobo: N/A
Fan Speeds (rpm): cpu: 1662 mobo: 745
Info:
Memory: total: 6 GiB available: 5.75 GiB used: 1.32 GiB (23.0%)
Processes: 235 Uptime: 14h 52m Init: systemd target: graphical (5) Shell: Bash inxi: 3.3.34
I removed a GeForce graphics card (PCI), which is not needed because this box runs as a server, and rebooted. The network did not come up.
Traced it to this:
$ cat /etc/netplan/
# This is the network config written by 'subiquity'
network:
  ethernets:
    enp4s0:
      dhcp4: true
  version: 2
But `ip link` reported an interface `enp3s0` that was DOWN. I could bring it up and all was good.
So I fixed the netplan to:
$ cat /etc/netplan/
# This is the network config written by 'subiquity'
network:
  ethernets:
    enp3s0:
      dhcp4: true
  version: 2
and rebooted and all was good. System is now fine.
Conclusion: the system names the Ethernet device based on its position in the PCI hardware ordering. Removing a PCI card changed the device name from `enp4s0` to `enp3s0`, so the netplan entry no longer matched any interface.
This cost me considerable time to diagnose, find and fix, and because the machine could not go online and so was unreachable, I had to work at a console. It boils down to the OS configuration being sensitive to hardware cards that can be inserted or removed, which can cause networking to fail. Since I believe it is possible to write a netplan more robustly, I consider this a bug worth fixing (low priority).
My (provisional) conclusion: subiquity should write a netplan that is robust against PCI card insertion/removal (and associated device name changes).
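As a rough illustration of what a more robust netplan might look like, here is a minimal sketch that matches the interface by MAC address instead of by its PCI-derived name. It uses netplan's `match` and `set-name` keys. The MAC address and the `lan0` name are placeholders I made up, not values from this machine, and I am not claiming this is how subiquity should necessarily implement it.

```yaml
# Sketch only: the MAC address and the name "lan0" are hypothetical placeholders.
network:
  version: 2
  ethernets:
    lan0:                               # arbitrary ID, no longer tied to a PCI slot
      match:
        macaddress: "00:11:22:33:44:55" # replace with the NIC's real MAC
      set-name: lan0                    # optional: give the NIC a stable name
      dhcp4: true
```

With a config along these lines, the entry should keep applying to the same NIC even if its `enpXsY` name changes after a card is added or removed, at the cost of needing an update if the NIC itself is replaced.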