Activity log for bug #1873809

Date Who What changed Old value New value Message
2020-04-20 13:16:13 Stéphane Graber bug added bug
2020-04-20 14:55:22 Guilherme G. Piccoli bug added subscriber Guilherme G. Piccoli
2020-04-20 15:58:44 Robert C Jennings bug task added linux-kvm (Ubuntu)
2020-04-20 16:12:36 Colin Ian King attachment added config-5.4.0-1008-kvm.xz https://bugs.launchpad.net/cloud-images/+bug/1873809/+attachment/5357161/+files/config-5.4.0-1008-kvm.xz
2020-04-20 18:47:13 Stéphane Graber cloud-images: status New Invalid
2020-04-20 18:48:09 Stéphane Graber summary disk-kvm.img aren't UEFI bootable Make linux-kvm bootable in LXD VMs
2020-04-20 18:57:07 Stéphane Graber description

Old value:

The `disk-kvm.img` images which are to be preferred when run under virtualization, completely fail to boot under UEFI.

This is a critical issue as those are the images that LXD is now pulling by default.

User report on the LXD side: https://github.com/lxc/lxd/issues/7224

Note that the non-optimized images boot just fine (disk1.img).

I've reproduced this issue with:
 - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
 - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

On the graphical console, you'll see EDK2 load (TianoCore) followed by basic boot messages and then a message from grub (error: can't find command `hwmatch`). Those also appear on successful boots of other images so I don't think there's anything concerning there. However it'll hang indefinitely and eat up all your CPU.

Switching to the text console view (serial0), you'll see the same issue as that LXD report:

BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM00003 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM00001 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM00001 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
error: can't find command `hwmatch'.
e!!!! X64 Exception Type - 0D(#GP - General Protection) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP - 000000003FF2DA12, CS - 0000000000000038, RFLAGS - 0000000000200202
RAX - AFAFAFAFAFAFAFAF, RCX - 000000003E80F108, RDX - AFAFAFAFAFAFAFAF
RBX - 0000000000000398, RSP - 000000003FF1C638, RBP - 000000003FF34360
RSI - 000000003FF343B8, RDI - 0000000000001000
R8 - 000000003E80F108, R9 - 000000003E815B98, R10 - 0000000000000065
R11 - 0000000000002501, R12 - 0000000000000004, R13 - 000000003E80F100
R14 - 0000000000000000, R15 - 0000000000000000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000003FC01000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000003FBEEA98 0000000000000047, LDTR - 0000000000000000
IDTR - 000000003F2D8018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000003FF1C290
!!!! Find image based on IP(0x3FF2DA12) /build/edk2-dQLD17/edk2-0~20191122.bd85bf54/Build/OvmfX64/RELEASE_GCC5/X64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll (ImageBase=000000003FF1E000, EntryPoint=000000003FF30781) !!!!

If booting in a SecureBoot enabled environment, you instead get an `Access Denied` at kernel loading time, indicating that the kernel binary isn't a normal signed kernel. That has the same result (boot hangs) but without the crash message.

New value:

The `disk-kvm.img` images which are to be preferred when run under virtualization, currently completely fail to boot under UEFI.

A workaround was put in place such that LXD instead will pull generic-based images until this is resolved. This does, however, come with a much longer boot time (as the kernel panics, reboots and then boots) and also reduced functionality from cloud-init, so we'd still like this fixed in the near future.

To get things behaving, it looks like we need the following config options to be enabled in linux-kvm:
 - CONFIG_EFI_STUB
 - CONFIG_VSOCKETS
 - CONFIG_VIRTIO_VSOCKETS
 - CONFIG_VIRTIO_VSOCKETS_COMMON

== Rationale ==
We'd like to be able to use the linux-kvm based images for LXD; those will boot directly without needing the panic+reboot behavior of generic images and will be much lighter in general. We also need the LXD agent to work, which requires functional virtio vsock.

== Test case ==
 - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-lxd.tar.xz
 - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
 - lxc image import focal-server-cloudimg-amd64-lxd.tar.xz focal-server-cloudimg-amd64-disk-kvm.img --alias bug1873809
 - lxc launch bug1873809 v1
 - lxc console v1
 - <check that it boots to login prompt>
 - <disconnect with ctrl+a-q>
 - lxc exec v1 bash

To validate a new kernel, you'll need to manually repack the .img file and install the new kernel in there.

== Regression potential ==
I don't know who else is using those kvm images right now, but those changes will cause a change to the kernel binary such that it contains the EFI stub bits + a signature. This could cause some (horribly broken) systems to no longer be able to boot that kernel. Though considering that such a setup is common to our other kernels, this seems unlikely.

Also, this will be introducing virtio vsock support which, again, could maybe confuse some horribly broken systems?

In either case, the kernel conveniently is the only package which ships multiple versions concurrently, so rebooting on the previous kernel is always an option, mitigating some of the risks.

-- Details from original report --
User report on the LXD side: https://github.com/lxc/lxd/issues/7224

I've reproduced this issue with:
 - wget http://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64-disk-kvm.img
 - qemu-system-x86_64 -bios /usr/share/ovmf/OVMF.fd -hda focal-server-cloudimg-amd64-disk-kvm.img -m 1G

On the graphical console, you'll see EDK2 load (TianoCore) followed by basic boot messages and then a message from grub (error: can't find command `hwmatch`). Those also appear on successful boots of other images so I don't think there's anything concerning there. However it'll hang indefinitely and eat up all your CPU.

Switching to the text console view (serial0), you'll see the same issue as that LXD report:

BdsDxe: failed to load Boot0001 "UEFI QEMU DVD-ROM QM00003 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Master,0x0): Not Found
BdsDxe: loading Boot0002 "UEFI QEMU HARDDISK QM00001 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
BdsDxe: starting Boot0002 "UEFI QEMU HARDDISK QM00001 " from PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
error: can't find command `hwmatch'.
e!!!! X64 Exception Type - 0D(#GP - General Protection) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000
RIP - 000000003FF2DA12, CS - 0000000000000038, RFLAGS - 0000000000200202
RAX - AFAFAFAFAFAFAFAF, RCX - 000000003E80F108, RDX - AFAFAFAFAFAFAFAF
RBX - 0000000000000398, RSP - 000000003FF1C638, RBP - 000000003FF34360
RSI - 000000003FF343B8, RDI - 0000000000001000
R8 - 000000003E80F108, R9 - 000000003E815B98, R10 - 0000000000000065
R11 - 0000000000002501, R12 - 0000000000000004, R13 - 000000003E80F100
R14 - 0000000000000000, R15 - 0000000000000000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000003FC01000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000003FBEEA98 0000000000000047, LDTR - 0000000000000000
IDTR - 000000003F2D8018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000003FF1C290
!!!! Find image based on IP(0x3FF2DA12) /build/edk2-dQLD17/edk2-0~20191122.bd85bf54/Build/OvmfX64/RELEASE_GCC5/X64/MdeModulePkg/Core/Dxe/DxeMain/DEBUG/DxeCore.dll (ImageBase=000000003FF1E000, EntryPoint=000000003FF30781) !!!!

If booting in a SecureBoot enabled environment, you instead get an `Access Denied` at kernel loading time, indicating that the kernel binary isn't a normal signed kernel. That has the same result (boot hangs) but without the crash message.
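The description above names four config options that linux-kvm needs. A minimal sketch of how a decompressed kernel config file could be checked for them follows. The function name `check_config` is hypothetical and not part of the report, and treating both `=y` and `=m` as "enabled" is an assumption of this sketch rather than something the report states.

```shell
#!/bin/sh
# check_config (hypothetical helper): report, for each option named in the
# bug description, whether it is set to y or m in the given kernel config file.
# Returns 0 if all four are present, 1 if any is missing.
check_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_EFI_STUB CONFIG_VSOCKETS \
               CONFIG_VIRTIO_VSOCKETS CONFIG_VIRTIO_VSOCKETS_COMMON; do
        # Anchor on the full option name so that CONFIG_VIRTIO_VSOCKETS
        # does not match CONFIG_VIRTIO_VSOCKETS_COMMON.
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            missing=1
        fi
    done
    return $missing
}
```

For the attached config, one plausible use would be to decompress it first with `xz -dc config-5.4.0-1008-kvm.xz > config-5.4.0-1008-kvm` and then run `check_config config-5.4.0-1008-kvm`.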
2020-04-20 22:34:25 Colin Ian King linux-kvm (Ubuntu): assignee Colin Ian King (colin-king)
2020-04-20 22:34:28 Colin Ian King linux-kvm (Ubuntu): importance Undecided High
2020-04-20 23:42:57 Roufique Hossain cloud-images: status Invalid Confirmed
2020-04-20 23:43:26 Roufique Hossain cloud-images: assignee Roufique Hossain (roufique)
2020-04-20 23:43:33 Roufique Hossain linux-kvm (Ubuntu): assignee Colin Ian King (colin-king) Roufique Hossain (roufique)
2020-04-20 23:43:46 Roufique Hossain linux-kvm (Ubuntu): status New Incomplete
2020-04-20 23:43:52 Roufique Hossain linux-kvm (Ubuntu): status Incomplete Confirmed
2020-04-20 23:46:20 Roufique Hossain bug watch added mailto:roufique@rtat.net
2020-04-20 23:46:20 Roufique Hossain bug task added cloud-bl-tutorials
2020-04-20 23:46:37 Roufique Hossain cloud-bl-tutorials: status New Confirmed
2020-04-20 23:47:18 Roufique Hossain bug added subscriber Roufique Hossain
2020-04-21 14:43:00 Launchpad Janitor linux-kvm (Ubuntu): status Confirmed Fix Released
2020-04-21 22:09:17 Stéphane Graber linux-kvm (Ubuntu): status Fix Released Triaged
2020-04-21 22:09:38 Stéphane Graber cloud-images: assignee Roufique Hossain (roufique)
2020-04-21 22:09:43 Stéphane Graber linux-kvm (Ubuntu): assignee Roufique Hossain (roufique)
2020-04-21 22:09:59 Stéphane Graber cloud-bl-tutorials: status Confirmed Invalid
2020-04-21 22:10:21 Stéphane Graber cloud-bl-tutorials: status Invalid New
2020-04-21 22:10:21 Stéphane Graber cloud-bl-tutorials: remote watch Email to roufique@rtat #
2020-04-21 22:10:39 Stéphane Graber affects cloud-bl-tutorials linux (Ubuntu)
2020-04-21 22:10:49 Stéphane Graber bug task deleted linux (Ubuntu)
2020-04-21 22:11:09 Stéphane Graber cloud-images: status Confirmed Invalid
2020-05-19 21:42:20 Launchpad Janitor linux-kvm (Ubuntu): status Triaged Fix Released
2020-05-19 21:42:20 Launchpad Janitor cve linked 2020-11494
2020-05-19 21:42:20 Launchpad Janitor cve linked 2020-11608
2020-05-19 21:42:20 Launchpad Janitor cve linked 2020-11884
2020-05-26 19:00:27 Stéphane Graber linux-kvm (Ubuntu): status Fix Released Triaged
2020-05-29 19:40:54 Stéphane Graber bug added subscriber Ubuntu containers team
2020-06-15 12:36:01 Stefan Bader nominated for series Ubuntu Focal
2020-06-15 12:36:01 Stefan Bader bug task added linux-kvm (Ubuntu Focal)
2020-06-15 12:36:15 Stefan Bader linux-kvm (Ubuntu Focal): importance Undecided High
2020-06-15 12:36:15 Stefan Bader linux-kvm (Ubuntu Focal): status New In Progress
2020-06-15 12:36:15 Stefan Bader linux-kvm (Ubuntu Focal): assignee Stefan Bader (smb)
2020-06-15 12:59:29 Stefan Bader linux-kvm (Ubuntu Focal): status In Progress Fix Committed
2020-06-18 09:13:53 Stefan Bader tags verification-needed-focal
2020-06-18 14:14:54 Stéphane Graber tags verification-needed-focal verification-failed-focal
2020-06-26 13:31:17 Stefan Bader tags verification-failed-focal verification-needed-focal
2020-06-28 03:19:43 Stéphane Graber tags verification-needed-focal verification-done-focal
2020-07-01 10:26:35 Launchpad Janitor linux-kvm (Ubuntu Focal): status Fix Committed Fix Released
2020-07-01 10:26:35 Launchpad Janitor cve linked 2020-0543
2020-07-01 10:26:35 Launchpad Janitor cve linked 2020-13143
2020-08-24 14:42:55 Michał Sawicz bug added subscriber Michał Sawicz
2020-08-24 14:46:16 Christopher Townsend bug added subscriber Christopher Townsend
2020-08-25 08:37:38 Launchpad Janitor linux-kvm (Ubuntu): status Triaged Fix Released
2020-08-25 08:37:38 Launchpad Janitor cve linked 2019-16089
2020-08-25 08:37:38 Launchpad Janitor cve linked 2019-19642
2020-08-25 08:37:38 Launchpad Janitor cve linked 2020-11935