5.13 (I/J/F-HWE) Hosts crash 5.4 (F) 1st+2nd level KVM guests

Bug #1952246 reported by Christian Ehrhardt 
This bug affects 2 people
Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Issue: on start, the second-level guest enters the paused state due to a KVM failure:

$ sudo cat /var/log/libvirt/qemu/focal-2nd-lvm-test.log
...
KVM: entry failed, hardware error 0x7
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000663
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 00000000 0000ffff
IDT= 00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=04 66 41 eb f1 66 83 c9 ff 66 89 c8 66 5b 66 5e 66 5f 66 c3 <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0
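
When checking several guests, it may help to pull the failure code out of the QEMU log automatically and to confirm where the vCPU stopped. The following is only a sketch that assumes the log format shown above; the function names are made up for illustration and are not part of libvirt or QEMU. For the dump above, CS base ffff0000 plus EIP 0000fff0 gives 0xfffffff0, the x86 reset vector, which suggests the guest failed before executing its first instruction.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the code from every "KVM: entry failed, hardware error 0x..." line
# read on stdin.
kvm_entry_failure_codes() {
    sed -n 's/^KVM: entry failed, hardware error \(0x[0-9a-fA-F]*\).*$/\1/p'
}

# Add a segment base and an offset (both hex, without the 0x prefix) and
# print the resulting linear address.
linear_address() {
    printf '0x%08x\n' "$(( 0x$1 + 0x$2 ))"
}
```

Possible usage: `sudo cat /var/log/libvirt/qemu/focal-2nd-lvm-test.log | kvm_entry_failure_codes`, and `linear_address ffff0000 0000fff0` for the CS base and EIP values from the dump.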

Warning: this message is rather generic, and you will find many very different
issues on the net that show it. I think the kernel team has to check the commits
between the versions I identified, or reproduce the problem (instructions below).

ubuntu@focal-kvm:~$ virsh start focal-2nd-lvm-test
Domain focal-2nd-lvm-test started

ubuntu@focal-kvm:~$ virsh list
 Id Name State
-----------------------------------
 2 focal-2nd-lvm-test paused
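
To spot affected guests without reading the table by eye, something like the following could be used. It is a rough sketch: `paused_domains` and `report_paused_reasons` are hypothetical helper names, and the parser assumes the `virsh list` layout shown above with domain names that contain no spaces. `virsh domstate --reason` is the existing virsh subcommand that prints the reason libvirt recorded for the current state.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the names of paused domains from `virsh list` output on stdin.
# The first two lines (header and separator) are skipped.
paused_domains() {
    awk 'NR > 2 && $NF == "paused" { print $2 }'
}

# Ask libvirt why each paused domain is paused.
report_paused_reasons() {
    virsh list | paused_domains | while read -r name; do
        printf '%s: ' "$name"
        virsh domstate --reason "$name"
    done
}
```

Running `report_paused_reasons` inside the first-level guest would list each paused second-level guest together with the state reason reported by libvirt.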

OS:
- Jammy Host / Focal 1st level Guest / Focal 2nd level Guest
- Bionic 1st level guest worked fine
- Jammy 1st level Guest worked fine

Guests:
- First level Guest Focal with cpu host-passthrough
- Second level Focal with basic cpu model qemu64
- Otherwise as uvtool-libvirt default provides it

The real CPU is an older model:
  Model name: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
    CPU family: 6
    Model: 63

Trying different kernel versions (H = host kernel, G1 = first-level guest kernel):
H 5.13.0-19 G1 generic 5.4.0-90 => Fail
H 5.13.0-19 G1 hwe-20.04-edge 5.13.0.22.22~20.04.9 => Work
H 5.13.0-19 G1 hwe-20.04 5.11.0-41 => Work
H 5.13.0-19 G1 Proposed 5.4.0-91 => Fail
H 5.13.0-19 G1 Release 5.4.0.26.32 => Fail

I redeployed Focal on the host and repeated the test with its stock 5.4 kernel
as well as with a 5.13 hwe-20.04-edge kernel.

Results on the Focal host (qemu, libvirt and related packages are now much different;
the redeploy also clears any unexpected old config that might cause this):
H 5.4.0-90 G1 generic 5.4.0-90 => works
H 5.4.0-90 G1 generic 5.13.0-21.21 => works
H 5.13.0-21.21 G1 generic 5.4.0-90 => fails
H 5.13.0-21.21 G1 generic 5.13.0-21.21 => works

So it is the same pattern everywhere:
"Bare metal 5.13 with first level 5.4 cannot start a second-level guest anymore"

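The pattern in the two result tables can be written down as a small check, which might be handy when scripting further kernel combinations. This is only a sketch: `kernel_series` and `matches_failing_pattern` are hypothetical names, and the rule encodes nothing more than the combinations observed in this report (host 5.13 with first-level guest 5.4).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Reduce an Ubuntu kernel version such as 5.13.0-21.21 to its series (5.13).
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed if the host / first-level guest kernel pair matches the failing
# combination observed in this report: host series 5.13, guest series 5.4.
matches_failing_pattern() {
    [ "$(kernel_series "$1")" = "5.13" ] && [ "$(kernel_series "$2")" = "5.4" ]
}
```

For example, `matches_failing_pattern 5.13.0-21.21 5.4.0-90` succeeds, while the same host kernel paired with a 5.11 or 5.13 guest kernel does not match.
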
Repro steps:
1. Take an Impish/Jammy host with 5.13, or a Focal host with the HWE 5.13 kernel
2. Create a first-level Focal (5.4) guest
  uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=amd64 label=daily release=focal
  ssh-keygen
  uvt-kvm create --memory 8192 --cpu 4 --host-passthrough --password=ubuntu f1 release=focal arch=amd64 label=daily
3. Inside it, create a second-level Focal guest
  uvt-simplestreams-libvirt --verbose sync --source http://cloud-images.ubuntu.com/daily arch=amd64 label=daily release=focal
  ssh-keygen
  uvt-kvm create --password=ubuntu f2 release=focal arch=amd64 label=daily

That guest will then crash on start, as reported above.
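
Before following the steps on another Intel machine, it may be worth confirming that nested virtualization is actually available, so that a failure is not mistaken for this bug. The checks below are a sketch and not part of the original reproduction; `nested_enabled` and `has_vmx` are hypothetical helper names. They rely on the `nested` parameter that the kvm_intel module exposes under /sys and on the `vmx` CPU flag in /proc/cpuinfo.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeed if the kvm_intel "nested" parameter reports nesting as enabled.
# The optional argument overrides the path, mainly for testing.
nested_enabled() {
    param_file=${1:-/sys/module/kvm_intel/parameters/nested}
    [ -r "$param_file" ] || return 1
    case "$(cat "$param_file")" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}

# Succeed if the CPU flags in the given cpuinfo file include vmx.
# The optional argument overrides the path, mainly for testing.
has_vmx() {
    grep -qw vmx "${1:-/proc/cpuinfo}"
}
```

A possible use is to run `nested_enabled` on the bare-metal host and `has_vmx` inside the first-level guest (which is created with --host-passthrough) before starting the second-level guest.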

We cross-checked this on a laptop with Jammy; there it worked fine.

The biggest related difference would be the hardware used, so I'll attach
details about the chips.

The crashing system is internal; logins/VPNs can be granted if this is not
reproducible elsewhere.
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 25 14:53 seq
 crw-rw---- 1 root audio 116, 33 Nov 25 14:53 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.21
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: HP ProLiant DL360 Gen9
Package: linux (not installed)
PciMultimedia:

ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.13.0-21-generic root=UUID=c941b173-e6b5-485a-a02b-8d966b8d3c73 ro --- console=ttyS1,115200
ProcVersionSignature: Ubuntu 5.13.0-21.21~20.04.1-generic 5.13.18
RelatedPackageVersions:
 linux-restricted-modules-5.13.0-21-generic N/A
 linux-backports-modules-5.13.0-21-generic N/A
 linux-firmware 1.187.20
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.13.0-21-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 01/22/2018
dmi.bios.release: 2.56
dmi.bios.vendor: HP
dmi.bios.version: P89
dmi.board.name: ProLiant DL360 Gen9
dmi.board.vendor: HP
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.ec.firmware.release: 2.60
dmi.modalias: dmi:bvnHP:bvrP89:bd01/22/2018:br2.56:efr2.60:svnHP:pnProLiantDL360Gen9:pvr:rvnHP:rnProLiantDL360Gen9:rvr:cvnHP:ct23:cvr:sku780018-S01:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL360 Gen9
dmi.product.sku: 780018-S01
dmi.sys.vendor: HP
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 25 15:17 seq
 crw-rw---- 1 root audio 116, 33 Nov 25 15:17 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.21
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Lsusb-t:
 /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
 /: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
MachineType: QEMU Standard PC (Q35 + ICH9, 2009)
Package: linux (not installed)
PciMultimedia:

ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-90-generic root=UUID=9b09021e-f2c0-4e11-b06d-086220af16c3 ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 5.4.0-90.101-generic 5.4.148
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-90-generic N/A
 linux-backports-modules-5.4.0-90-generic N/A
 linux-firmware 1.187.20
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.4.0-90-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.13.0-1ubuntu1.1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-q35-focal
dmi.modalias: dmi:bvnSeaBIOS:bvr1.13.0-1ubuntu1.1:bd04/01/2014:svnQEMU:pnStandardPC(Q35+ICH9,2009):pvrpc-q35-focal:cvnQEMU:ct1:cvrpc-q35-focal:
dmi.product.name: Standard PC (Q35 + ICH9, 2009)
dmi.product.version: pc-q35-focal
dmi.sys.vendor: QEMU

Christian Ehrhardt  (paelzer) wrote : apport information

Attached: CRDA.txt, CurrentDmesg.txt, Lspci.txt, Lspci-vt.txt, Lsusb.txt, Lsusb-t.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt, acpidump.txt

tags: added: apport-collected focal uec-images
description: updated

description: updated
Christian Ehrhardt  (paelzer) wrote : apport information

Attached: CurrentDmesg.txt, Lspci.txt, Lspci-vt.txt, Lsusb-v.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt, acpidump.txt

Christian Ehrhardt  (paelzer) wrote :

Now I have completed attaching (apport-collect) data of both:
- Host (hostname: node-horse)
- first level guest (hostname: f)

Christian Ehrhardt  (paelzer) wrote :

Since it seems to depend on the host, it is likely the chip, so here is some info about that.
This is the bad system.

Christian Ehrhardt  (paelzer) wrote :

Since it seems to depend on the host, it is likely the chip, so here is some info about that.
This is the good system.

Changed in linux (Ubuntu):
status: New → Confirmed

Christian Ehrhardt  (paelzer) wrote :

Now that all data is in place, I have set the bug to Confirmed to silence the bots.
@Kernel people: is this known, and if so, where? If it is not known, what else would you need here?

Christian Ehrhardt  (paelzer) wrote :

FYI, this was reported earlier as bug 1944104 and is still waiting for kernel team action there.
Since this bug has more data and details, I've marked the other one as a duplicate.
