Activity log for bug #1902960

Date Who What changed Old value New value Message
2020-11-04 22:49:51 David Lawson bug added bug
2020-11-04 22:50:02 David Lawson bug added subscriber The Canonical Sysadmins
2020-11-04 23:02:28 David Lawson tags apport-collected focal uec-images
2020-11-04 23:02:29 David Lawson description The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable unmanaged 2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable unmanaged 2 links listed. 
Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. --- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt: -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron: TERM=xterm-256color PATH=(custom, no user) LANG=C.UTF-8 SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. 
dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation
2020-11-04 23:02:30 David Lawson attachment added CurrentDmesg.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431244/+files/CurrentDmesg.txt
2020-11-04 23:02:31 David Lawson attachment added Dependencies.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431245/+files/Dependencies.txt
2020-11-04 23:02:33 David Lawson attachment added Lspci.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431246/+files/Lspci.txt
2020-11-04 23:02:35 David Lawson attachment added ProcCpuinfo.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431247/+files/ProcCpuinfo.txt
2020-11-04 23:02:36 David Lawson attachment added ProcCpuinfoMinimal.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431248/+files/ProcCpuinfoMinimal.txt
2020-11-04 23:02:37 David Lawson attachment added ProcInterrupts.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431249/+files/ProcInterrupts.txt
2020-11-04 23:02:39 David Lawson attachment added ProcModules.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431250/+files/ProcModules.txt
2020-11-04 23:02:40 David Lawson attachment added SystemdDelta.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431251/+files/SystemdDelta.txt
2020-11-04 23:02:42 David Lawson attachment added UdevDb.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431252/+files/UdevDb.txt
2020-11-04 23:02:44 David Lawson attachment added acpidump.txt https://bugs.launchpad.net/bugs/1902960/+attachment/5431253/+files/acpidump.txt
2020-11-04 23:06:52 David Lawson description The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable unmanaged 2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP 1 lo loopback carrier unmanaged 2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. 
--- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt: -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron: TERM=xterm-256color PATH=(custom, no user) LANG=C.UTF-8 SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. 
We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable unmanaged                                                                           2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. 
--- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt:  -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled)             +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA             +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE             +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI             \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron:  TERM=xterm-256color  PATH=(custom, no user)  LANG=C.UTF-8  SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation
2020-11-04 23:29:38 Benjamin Allot bug added subscriber Canonical IS Incidents
2020-11-05 03:01:08 Launchpad Janitor systemd (Ubuntu): status New Confirmed
2020-11-05 03:28:54 Haw Loeung attachment added syslog https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1902960/+attachment/5431334/+files/syslog
2020-11-05 08:30:58 Haw Loeung bug added subscriber Haw Loeung
2020-11-06 21:39:05 Dan Streetman bug task added cloud-init (Ubuntu)
2020-11-06 21:39:13 Dan Streetman systemd (Ubuntu): status Confirmed Invalid
2020-11-07 05:48:39 Launchpad Janitor cloud-init (Ubuntu): status New Confirmed
2020-11-09 15:01:04 Dan Watkins cloud-init (Ubuntu): status Confirmed Incomplete
2020-11-09 15:26:37 David Lawson attachment added Cloud init logs from apt-stresstest/0 in westus https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1902960/+attachment/5432662/+files/cloud-init.tar.gz
2020-11-09 15:26:48 David Lawson cloud-init (Ubuntu): status Incomplete New
2020-11-09 19:59:45 Dan Watkins cloud-init (Ubuntu): status New Incomplete
2020-11-09 19:59:48 Dan Watkins systemd (Ubuntu): status Invalid New
2020-11-09 20:56:13 Dan Streetman bug added subscriber Dan Streetman
2020-11-10 18:05:37 Dan Watkins cloud-init (Ubuntu): status Incomplete New
2020-11-10 18:05:41 Dan Watkins systemd (Ubuntu): status New Incomplete
2020-11-10 19:26:45 Dan Watkins bug task added cloud-images
2020-11-10 20:20:30 Jean-Francois Simoneau bug added subscriber Jean-Francois Simoneau
2020-11-16 09:40:13 Launchpad Janitor cloud-init (Ubuntu): status New Confirmed
2021-01-04 19:23:02 Dan Streetman bug watch added https://github.com/systemd/systemd/issues/17532
2021-01-04 19:23:02 Dan Streetman bug task added systemd
2021-01-04 19:43:18 Dan Streetman description The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable unmanaged                                                                           2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. 
--- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt:  -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled)             +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA             +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE             +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI             \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron:  TERM=xterm-256color  PATH=(custom, no user)  LANG=C.UTF-8  SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation [impact] on boot of a specific azure instance, the ID_NET_DRIVER parameter of the instance's eth0 interface is not set. 
That leads to a failure of systemd-networkd to take control of the interface after a restart of systemd-networkd, which results in DNS failures (at first) and eventually complete loss of networking (once the DHCP lease expires). [test case] this occurs on first boot of an instance using the specific image; it is not reproducible using the latest ubuntu image nor any reboot of the affected image, and it has not been reproducible (for me) when using debug-enabled images based on the affected image. So, while the problem is reproducible using the specific image in question, it's not possible to verify the fix since any change to the image removes reproducibility. [regression potential] any regression would likely involve problems with systemd-udevd processing 'change' events from network devices, and/or incorrect udevd device properties. [scope] this is needed only for focal and groovy. this is fixed by upstream commit e0e789c1e97 which is first included in v247, so this is fixed already in hirsute. while this commit is not included in bionic, due to the difficult nature of reproducing (and verifying) this, and the fact it has only been seen once on a focal image, I don't think it's appropriate to SRU to bionic at this point; possibly it may be appropriate if this is ever reproduced with a bionic image. [other info] note that this bug's subject and description, as well as the upstream systemd bug subject and description, talk about the problem being DNS resolution. However that is strictly a side-effect of the real problem and is not the actual issue. [original description] The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. 
The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable unmanaged                                                                           2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. --- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt:  -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled)             +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA             +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE             +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI             \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron:  TERM=xterm-256color  PATH=(custom, no user)  LANG=C.UTF-8  SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 
5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation
2021-01-04 19:43:37 Dan Streetman systemd (Ubuntu): status Incomplete Opinion
2021-01-04 19:43:39 Dan Streetman systemd (Ubuntu): status Opinion New
2021-01-04 19:43:50 Dan Streetman nominated for series Ubuntu Focal
2021-01-04 19:43:50 Dan Streetman bug task added cloud-init (Ubuntu Focal)
2021-01-04 19:43:50 Dan Streetman bug task added systemd (Ubuntu Focal)
2021-01-04 19:43:50 Dan Streetman nominated for series Ubuntu Groovy
2021-01-04 19:43:50 Dan Streetman bug task added cloud-init (Ubuntu Groovy)
2021-01-04 19:43:50 Dan Streetman bug task added systemd (Ubuntu Groovy)
2021-01-04 19:44:01 Dan Streetman systemd (Ubuntu): status New Fix Released
2021-01-04 19:44:04 Dan Streetman systemd (Ubuntu Focal): assignee Dan Streetman (ddstreet)
2021-01-04 19:44:08 Dan Streetman systemd (Ubuntu Groovy): importance Undecided Medium
2021-01-04 19:44:10 Dan Streetman systemd (Ubuntu Groovy): assignee Dan Streetman (ddstreet)
2021-01-04 19:44:12 Dan Streetman systemd (Ubuntu Focal): status New In Progress
2021-01-04 19:44:14 Dan Streetman systemd (Ubuntu Focal): importance Undecided Medium
2021-01-04 19:44:17 Dan Streetman systemd (Ubuntu Groovy): status New In Progress
2021-01-05 03:16:16 Bug Watch Updater systemd: status Unknown New
2021-01-05 15:15:37 Dan Watkins cloud-init (Ubuntu): status Confirmed Incomplete
2021-01-05 15:15:39 Dan Watkins cloud-init (Ubuntu Focal): status New Incomplete
2021-01-05 15:15:41 Dan Watkins cloud-init (Ubuntu Groovy): status New Incomplete
2021-01-11 22:05:21 Chris Halse Rogers systemd (Ubuntu Focal): status In Progress Fix Committed
2021-01-11 22:05:27 Chris Halse Rogers bug added subscriber Ubuntu Stable Release Updates Team
2021-01-11 22:05:30 Chris Halse Rogers bug added subscriber SRU Verification
2021-01-11 22:05:44 Chris Halse Rogers tags apport-collected focal uec-images apport-collected focal uec-images verification-needed verification-needed-focal
2021-01-11 22:10:12 Dan Streetman description [impact] on boot of a specific azure instance, the ID_NET_DRIVER parameter of the instance's eth0 interface is not set. That leads to a failure of systemd-networkd to take control of the interface after a restart of systemd-networkd, which results in DNS failures (at first) and eventually complete loss of networking (once the DHCP lease expires). [test case] this occurs on first boot of an instance using the specific image; it is not reproducible using the latest ubuntu image nor any reboot of the affected image, and it has not been reproducible (for me) when using debug-enabled images based on the affected image. So, while the problem is reproducible using the specific image in question, it's not possible to verify the fix since any change to the image removes reproducibility. [regression potential] any regression would likely involve problems with systemd-udevd processing 'change' events from network devices, and/or incorrect udevd device properties. [scope] this is needed only for focal and groovy. this is fixed by upstream commit e0e789c1e97 which is first included in v247, so this is fixed already in hirsute. while this commit is not included in bionic, due to the difficult nature of reproducing (and verifying) this, and the fact it has only been seen once on a focal image, I don't think it's appropriate to SRU to bionic at this point; possibly it may be appropriate if this is ever reproduced with a bionic image. [other info] note that this bug's subject and description, as well as the upstream systemd bug subject and description, talk about the problem being DNS resolution. However that is strictly a side-effect of the real problem and is not the actual issue. [original description] The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. 
We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable unmanaged                                                                           2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. 
--- ProblemType: Bug ApportVersion: 2.20.11-0ubuntu27.10 Architecture: amd64 CasperMD5CheckResult: skip DistroRelease: Ubuntu 20.04 Lspci-vt:  -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled)             +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA             +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE             +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI             \-08.0 Microsoft Corporation Hyper-V virtual VGA Lsusb: Error: command ['lsusb'] failed with exit code 1: Lsusb-t: Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1: MachineType: Microsoft Corporation Virtual Machine Package: systemd 245.4-4ubuntu3.3 PackageArchitecture: amd64 ProcEnviron:  TERM=xterm-256color  PATH=(custom, no user)  LANG=C.UTF-8  SHELL=/bin/bash ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1 ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65 Tags: focal uec-images Uname: Linux 5.4.0-1031-azure x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: N/A _MarkForUpload: True dmi.bios.date: 12/07/2018 dmi.bios.vendor: American Megatrends Inc. dmi.bios.version: 090008 dmi.board.name: Virtual Machine dmi.board.vendor: Microsoft Corporation dmi.board.version: 7.0 dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77 dmi.chassis.type: 3 dmi.chassis.vendor: Microsoft Corporation dmi.chassis.version: 7.0 dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0: dmi.product.name: Virtual Machine dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950 dmi.product.version: 7.0 dmi.sys.vendor: Microsoft Corporation [impact] on boot of a specific azure instance, the ID_NET_DRIVER parameter of the instance's eth0 interface is not set. 
That leads to a failure of systemd-networkd to take control of the interface after a restart of systemd-networkd, which results in DNS failures (at first) and eventually complete loss of networking (once the DHCP lease expires).
[test case] this occurs on first boot of an instance using the specific image; it is not reproducible using the latest ubuntu image nor any reboot of the affected image, and it has not been reproducible (for me) when using debug-enabled images based on the affected image. So, while the problem is reproducible using the specific image in question, it's not possible to verify the fix since any change to the image removes reproducibility.
however, while the problem itself can't be reproduced and then verified, if the assumption is correct (that the 'add' uevent is being missed on boot), that is possible to test and verify:
$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
E: ID_NET_DRIVER=hv_netvsc
$ sudo rm /run/udev/data/n2
(note, change 'n2' to whichever network interface index is correct)
$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
$ sudo udevadm trigger -c change /sys/class/net/eth0
$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
(note the 'change' uevent did not populate ID_NET_DRIVER property)
$ sudo udevadm trigger -c add /sys/class/net/eth0
$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
E: ID_NET_DRIVER=hv_netvsc
(note the 'add' uevent did populate ID_NET_DRIVER)
the test verification should result in ID_NET_DRIVER being populated for a 'change' uevent.
[regression potential] any regression would likely involve problems with systemd-udevd processing 'change' events from network devices, and/or incorrect udevd device properties.
[scope] this is needed only for focal and groovy. this is fixed by upstream commit e0e789c1e97 which is first included in v247, so this is fixed already in hirsute.
while this commit is not included in bionic, due to the difficult nature of reproducing (and verifying) this, and the fact it has only been seen once on a focal image, I don't think it's appropriate to SRU to bionic at this point; possibly it may be appropriate if this is ever reproduced with a bionic image. [other info] note that this bug's subject and description, as well as the upstream systemd bug subject and description, talk about the problem being DNS resolution. However that is strictly a side-effect of the real problem and is not the actual issue. [original description] The systemd upgrade 245.4-4ubuntu3.3 to 245.4-4ubuntu3.2 appears to have broken DNS resolution across much of our Azure fleet earlier today. We ended up mitigating this by forcing reboots on the associated instances, no combination of networkctl reload, reconfigure, systemctl daemon-reexec, systemctl daemon-reload, netplan generate, netplan apply would get resolvectl to have a DNS server again. The main symptom appears to have been systemd-networkd believing it wasn't managing the eth0 interfaces: ubuntu@machine-1:~$ sudo networkctl IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable unmanaged                                                                           2 links listed. Which eventually made them lose their DNS resolvers: ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): After rebooting, we see this behaving properly: ubuntu@machine-1:~$ sudo networkctl list IDX LINK TYPE OPERATIONAL SETUP   1 lo loopback carrier unmanaged   2 eth0 ether routable configured 2 links listed. ubuntu@machine-1:~$ sudo resolvectl dns Global: Link 2 (eth0): 168.63.129.16 This appears to be specifically linked to the upgrade, i.e. we were able to provoke the issue by upgrading the systemd package, so I suspect it's part of the packaging in the upgrade process. 
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu27.10
Architecture: amd64
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
Lspci-vt:
 -[0000:00]-+-00.0 Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled)
            +-07.0 Intel Corporation 82371AB/EB/MB PIIX4 ISA
            +-07.1 Intel Corporation 82371AB/EB/MB PIIX4 IDE
            +-07.3 Intel Corporation 82371AB/EB/MB PIIX4 ACPI
            \-08.0 Microsoft Corporation Hyper-V virtual VGA
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
MachineType: Microsoft Corporation Virtual Machine
Package: systemd 245.4-4ubuntu3.3
PackageArchitecture: amd64
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=PARTUUID=2e08bba3-68b4-4a16-af3b-47b73bd138a9 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1
ProcVersionSignature: Ubuntu 5.4.0-1031.32-azure 5.4.65
Tags: focal uec-images
Uname: Linux 5.4.0-1031-azure x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 12/07/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 090008
dmi.board.name: Virtual Machine
dmi.board.vendor: Microsoft Corporation
dmi.board.version: 7.0
dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77
dmi.chassis.type: 3
dmi.chassis.vendor: Microsoft Corporation
dmi.chassis.version: 7.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr090008:bd12/07/2018:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0:
dmi.product.name: Virtual Machine
dmi.product.uuid: 4412ad79-83fa-f845-b7c2-6f30dd4f1950
dmi.product.version: 7.0
dmi.sys.vendor: Microsoft Corporation
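The [test case] steps in the description above could be scripted roughly as follows. This is only a sketch, not part of the original report: the function names net_driver_from_info and check_change_uevent are hypothetical, and it assumes the udev database entry for a network device lives at /run/udev/data/n<ifindex>, as the 'n2' note in the test case suggests. It removes the stored udev data for the interface, triggers a 'change' uevent, and reports whether ID_NET_DRIVER was repopulated.

```shell
#!/bin/sh
# Hypothetical helper: reads `udevadm info` output on stdin and prints the
# value of the ID_NET_DRIVER property, or nothing if the property is absent.
net_driver_from_info() {
    sed -n 's/^E: ID_NET_DRIVER=//p'
}

# Hypothetical wrapper around the manual test case. It is only defined here,
# not called, because it deletes udev state and needs root on a real system.
check_change_uevent() {
    iface="$1"
    syspath="/sys/class/net/$iface"

    # The kernel exposes the interface index here; the test case's 'n2'
    # corresponds to an interface whose index is 2.
    ifindex=$(cat "$syspath/ifindex") || return 2

    # Drop the stored udev properties, simulating the missed 'add' uevent.
    sudo rm -f "/run/udev/data/n$ifindex"

    # Send only a 'change' uevent and wait for udevd to finish processing it.
    sudo udevadm trigger -c change "$syspath"
    sudo udevadm settle

    driver=$(udevadm info "$syspath" | net_driver_from_info)
    if [ -n "$driver" ]; then
        echo "PASS: 'change' uevent populated ID_NET_DRIVER=$driver"
    else
        echo "FAIL: ID_NET_DRIVER still missing after 'change' uevent"
        return 1
    fi
}
```

On an affected (unfixed) systemd the wrapper would be expected to report FAIL, and PASS once the fix is installed; e.g. `check_change_uevent eth0`. Running `sudo udevadm trigger -c add /sys/class/net/eth0` afterwards restores the properties either way, as the test case shows.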
2021-01-11 22:32:03 Chris Halse Rogers systemd (Ubuntu Groovy): status In Progress Fix Committed
2021-01-11 22:32:19 Chris Halse Rogers tags apport-collected focal uec-images verification-needed verification-needed-focal apport-collected focal uec-images verification-needed verification-needed-focal verification-needed-groovy
2021-01-13 18:33:23 Dan Streetman tags apport-collected focal uec-images verification-needed verification-needed-focal verification-needed-groovy apport-collected focal uec-images verification-done verification-done-focal verification-done-groovy
2021-01-18 09:34:34 Launchpad Janitor systemd (Ubuntu Groovy): status Fix Committed Fix Released
2021-01-18 09:34:56 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team
2021-01-18 09:48:38 Launchpad Janitor systemd (Ubuntu Focal): status Fix Committed Fix Released
2021-07-28 23:16:23 Brian Murray cloud-init (Ubuntu Groovy): status Incomplete Won't Fix
2022-09-01 10:56:23 Aaron Whitehouse bug added subscriber Aaron Whitehouse
2023-05-05 07:23:57 James Falcon cloud-init (Ubuntu): status Incomplete Invalid
2023-05-05 07:24:01 James Falcon cloud-init (Ubuntu Focal): status Incomplete Invalid