svm from ubuntu_kvm_unit_tests interrupt with "Unhandled exception 13 #GP at ip 00000000004027e3" on F-intel-5.13

Bug #1943841 reported by Po-Hsu Lin
This bug affects 1 person
Affects                    Status  Importance  Assigned to  Milestone
ubuntu-kernel-tests        New     Undecided   Unassigned
linux-intel-5.13 (Ubuntu)  New     Undecided   Unassigned

Bug Description

Failing with Focal Intel 5.13.0-1004.4 on node gonzo

This is similar to bug 1934939, but this time the test got through more cases before failing than it did in that bug.

Running '/home/ubuntu/autotest/client/tmp/ubuntu_kvm_unit_tests/src/kvm-unit-tests/tests/svm'
 BUILD_HEAD=1593e88a
 timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 --no-reboot -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -machine accel=kvm -kernel /tmp/tmp.8QIXdsXIyJ -smp 2 -cpu max,+svm -m 4g # -initrd /tmp/tmp.VvuuwXHRuX
 enabling apic
 enabling apic
 paging enabled
 cr0 = 80010011
 cr3 = 10bf000
 cr4 = 20
 NPT detected - running all tests with NPT enabled
 PASS: null
 PASS: vmrun
 PASS: ioio
 PASS: vmrun intercept check
 PASS: rsm
 PASS: cr3 read intercept
 PASS: cr3 read nointercept
 PASS: cr3 read intercept emulate
 PASS: dr intercept check
 PASS: next_rip
 PASS: msr intercept check
 PASS: mode_switch
 PASS: asid_zero
 PASS: sel_cr0_bug
 PASS: npt_nx
 PASS: npt_np
 PASS: npt_us
 PASS: npt_rw
 PASS: npt_rw_pfwalk
 PASS: npt_l1mmio
 PASS: npt_rw_l1mmio
 PASS: tsc_adjust
     Latency VMRUN : max: 322434 min: 21233 avg: 36460
     Latency VMEXIT: max: 298726 min: 16889 avg: 17717
 PASS: latency_run_exit
     Latency VMRUN : max: 334040 min: 23446 avg: 36313
     Latency VMEXIT: max: 310725 min: 16935 avg: 17618
 PASS: latency_run_exit_clean
     Latency VMLOAD: max: 732466 min: 4675 avg: 4879
     Latency VMSAVE: max: 60901 min: 4565 avg: 4850
     Latency STGI: max: 43057 min: 3726 avg: 3862
     Latency CLGI: max: 723675 min: 3644 avg: 3724
 PASS: latency_svm_insn
 PASS: exception with vector 2 not injected
 PASS: divide overflow exception injected
 PASS: eventinj.VALID cleared
 PASS: exc_inject
 PASS: pending_event
 PASS: pending_event_cli
 PASS: direct interrupt while running guest
 PASS: intercepted interrupt while running guest
 PASS: direct interrupt + hlt
 PASS: intercepted interrupt + hlt
 PASS: interrupt
 PASS: direct NMI while running guest
 PASS: NMI intercept while running guest
 PASS: nmi
 PASS: direct NMI + hlt
 PASS: NMI intercept while running guest
 PASS: intercepted NMI + hlt
 PASS: nmi_hlt
 PASS: virq_inject
 PASS: No RIP corruption detected after 10000 timer interrupts
 PASS: reg_corruption
 enabling apic
 PASS: svm_init_startup_test
 PASS: host_rflags
 PASS: CPUID.01H:ECX.XSAVE set before VMRUN
 PASS: svm_cr4_osxsave_test_guest finished with VMMCALL
 PASS: CPUID.01H:ECX.XSAVE set after VMRUN
 PASS: EFER.SVME: 1500
 PASS: EFER.SVME: 500
 PASS: Test EFER 9:8: 1700
 PASS: Test EFER 63:16: 11500
 PASS: Test EFER 63:16: 101500
 PASS: Test EFER 63:16: 1001500
 PASS: Test EFER 63:16: 10001500
 PASS: Test EFER 63:16: 100001500
 PASS: Test EFER 63:16: 1000001500
 PASS: Test EFER 63:16: 10000001500
 PASS: Test EFER 63:16: 100000001500
 PASS: Test EFER 63:16: 1000000001500
 PASS: Test EFER 63:16: 10000000001500
 PASS: Test EFER 63:16: 100000000001500
 PASS: Test EFER 63:16: 1000000000001500
 PASS: EFER.LME=1 (1500), CR0.PG=1 (80010011) and CR4.PAE=0 (40000)
 PASS: EFER.LME=1 (1500), CR0.PG=1 and CR0.PE=0 (80010010)
 PASS: EFER.LME=1 (1500), CR0.PG=1 (80010011), CR4.PAE=1 (40020), CS.L=1 and CS.D=1 (699)
 PASS: Test CR0 CD=1,NW=0: c0010011
 PASS: Test CR0 CD=1,NW=1: e0010011
 PASS: Test CR0 CD=0,NW=0: 80010011
 PASS: Test CR0 CD=0,NW=1: a0010011
 PASS: Test CR0 63:32: 180010011
 PASS: Test CR0 63:32: 1080010011
 PASS: Test CR0 63:32: 10080010011
 PASS: Test CR0 63:32: 100080010011
 PASS: Test CR0 63:32: 1000080010011
 PASS: Test CR0 63:32: 10000080010011
 PASS: Test CR0 63:32: 100000080010011
 PASS: Test CR0 63:32: 1000000080010011
 PASS: Test CR3 63:0: 100000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 200000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 400000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 800000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 1000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 2000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 4000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 8000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 10000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 20000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 40000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 80000000010bf000, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR3 63:0: 10bf000
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf001, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf002, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf004, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf020, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf040, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf080, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf100, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf200, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf400, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PCIDE=0) 11:0: 10bf800, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PAE) 2:0: 10bf001, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PAE) 2:0: 10bf002, wanted exit 0x400, got 0x400
 PASS: Test CR3 (PAE) 2:0: 10bf004, wanted exit 0x400, got 0x400
 PASS: Test CR4 31:12: 41020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 42020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 44020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 48020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: c0020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 840020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 1040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 2040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 4040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 8040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 10040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 20040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 40040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 80040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 41020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 42020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 44020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 48020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: c0020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 840020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 1040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 2040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 4040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 8040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 10040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 20040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 40040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 31:12: 80040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 100040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 1000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 10000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 100000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 1000000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 10000000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 100000000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test CR4 63:32: 1000000000040020, wanted exit 0xffffffff, got 0xffffffff
 PASS: Test DR6 63:32: 1ffff0ff0
 PASS: Test DR6 63:32: 10ffff0ff0
 PASS: Test DR6 63:32: 100ffff0ff0
 PASS: Test DR6 63:32: 1000ffff0ff0
 PASS: Test DR6 63:32: 10000ffff0ff0
 PASS: Test DR6 63:32: 100000ffff0ff0
 PASS: Test DR6 63:32: 1000000ffff0ff0
 PASS: Test DR6 63:32: 10000000ffff0ff0
 PASS: Test DR7 63:32: 100000400
 PASS: Test DR7 63:32: 1000000400
 PASS: Test DR7 63:32: 10000000400
 PASS: Test DR7 63:32: 100000000400
 PASS: Test DR7 63:32: 1000000000400
 PASS: Test DR7 63:32: 10000000000400
 PASS: Test DR7 63:32: 100000000000400
 PASS: Test DR7 63:32: 1000000000000400
 PASS: Test MSRPM address: ffffffe000
 PASS: Test MSRPM address: ffffffe001
 PASS: Test MSRPM address: fffffff000
 PASS: Test MSRPM address: 435000
 PASS: Test MSRPM address: 435fff
 PASS: Test IOPM address: ffffffc000
 PASS: Test IOPM address: ffffffd000
 PASS: Test IOPM address: ffffffdffe
 PASS: Test IOPM address: ffffffe000
 PASS: Test IOPM address: fffffff000
 PASS: Test IOPM address: 438000
 PASS: Test IOPM address: 438fff
 PASS: Test FS.base for canonical form: 0
 PASS: Test GS.base for canonical form: 53ddb0
 PASS: Test LDTR.base for canonical form: 0
 PASS: Test TR.base for canonical form: 41407a
 PASS: Test KERNEL GS.base for canonical form: 0
 PASS: Successful VMRUN with noncanonical ES.base
 PASS: Successful VMRUN with noncanonical CS.base
 PASS: Successful VMRUN with noncanonical SS.base
 PASS: Successful VMRUN with noncanonical DS.base
 PASS: Successful VMRUN with noncanonical GDTR.base
 PASS: Successful VMRUN with noncanonical IDTR.base
 PASS: Wanted #NPF on rsvd bits = 0x8000000000000000, got exit = 0x400
 PASS: Wanted PFEC = 0x10000000d, got PFEC = 10000000d, PxE = 0x8000000000401067. host.NX = 0, host.SMEP = 0, guest.NX = 0, guest.SMEP = 0
 PASS: Wanted #NPF on rsvd bits = 0x3600000000000, got exit = 0x400
 PASS: Wanted PFEC = 0x10000001d, got PFEC = 10000001d, PxE = 0x3600000401067. host.NX = 1, host.SMEP = 0, guest.NX = 0, guest.SMEP = 0
 Unhandled exception 13 #GP at ip 00000000004027e3
 error_code=0000 rflags=00010256 cs=00000008
 rax=0000000000001500 rcx=00000000c0000080 rdx=0000000000000000 rbx=0000000000140020
 rbp=000000000053ed00 rsi=000000000000000a rdi=00000000000003fd
  r8=000000000041c83a r9=00000000000003f8 r10=000000000000000d r11=00000000bf9f9000
 r12=0000000000000002 r13=00000000bfec9008 r14=0000000000040020 r15=0000000000001500
 cr0=0000000080010011 cr2=0000000000000000 cr3=00000000010bf000 cr4=0000000000040020
 cr8=0000000000000000
     STACK: @4027e3 402961 400ecc 400368
 FAIL svm
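As a reading aid for the register dump above, here is a minimal sketch that turns the raw EFER, CR0 and CR4 values into flag names. The helper name `decode_flags` and the bit tables are illustrative and not part of kvm-unit-tests; the bit positions are the architecturally defined ones as far as I know and should be checked against the AMD manual. Note that rcx=0xc0000080 is the MSR index of EFER and rax=0x1500 matches the EFER value printed by the earlier PASS lines, but the dump alone does not prove which instruction raised the #GP.

```python
# Bit-name tables (assumed from the AMD64/x86 architecture definitions).
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE", 12: "SVME",
             13: "LMSLE", 14: "FFXSR", 15: "TCE"}
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}
CR4_BITS = {0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
            6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
            11: "UMIP", 16: "FSGSBASE", 17: "PCIDE", 18: "OSXSAVE",
            20: "SMEP", 21: "SMAP", 22: "PKE"}


def decode_flags(value, names):
    """Return the names of all set bits, lowest bit first.

    Bits without an entry in `names` are reported as "bitN".
    """
    return [names.get(bit, f"bit{bit}")
            for bit in range(64)
            if (value >> bit) & 1]


# Values copied from the dump in this report.
print("EFER", decode_flags(0x1500, EFER_BITS))
print("CR0 ", decode_flags(0x80010011, CR0_BITS))
print("CR4 ", decode_flags(0x40020, CR4_BITS))
```

With the values from this report, EFER decodes to LME, LMA and SVME (NXE is clear), CR0 to PE, ET, WP and PG, and CR4 to PAE and OSXSAVE.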

Po-Hsu Lin (cypressyew)
tags: added: 5.13 focal ubuntu-kvm-unit-tests
Po-Hsu Lin (cypressyew) wrote :

This issue can also be reproduced on the same node (gonzo) with F-intel 5.13.0-1003.3.
Therefore, this is not a regression.

Stefan Bader (smb) wrote :

Also happens with hirsute:linux 5.11.0-38.42 for sru-20210927 on that host (gonzo).

tags: added: 5.11 amd64 hirsute sru-20210927
Mehmet Basaran (mehmetbasaran) wrote :

A similar error occurs with noble:linux-lowlatency 6.8.0-48.48.3 on riccioli (SRU cycle: 2024.09.30):

20:28:53 INFO | START ubuntu_kvm_unit_tests.svm ubuntu_kvm_unit_tests.svm timeout=1800 timestamp=1729283333 localtime=Oct 18 20:28:53
20:28:53 DEBUG| Persistent state client._record_indent now set to 2
20:28:53 DEBUG| Persistent state client.unexpected_reboot now set to ('ubuntu_kvm_unit_tests.svm', 'ubuntu_kvm_unit_tests.svm')
20:28:53 DEBUG| Waiting for pid 18157 for 1800 seconds
20:28:53 WARNI| System python is too old, crash handling disabled
20:28:53 DEBUG| Running 'kvm-ok'
20:28:53 DEBUG| [stdout] INFO: /dev/kvm exists
20:28:53 DEBUG| [stdout] KVM acceleration can be used
20:28:53 DEBUG| Running '/home/ubuntu/autotest/client/tmp/ubuntu_kvm_unit_tests/src/kvm-unit-tests/tests/svm'
20:28:53 DEBUG| [stdout] BUILD_HEAD=b04954c9
20:28:53 DEBUG| [stdout] timeout -k 1s --foreground 90s /usr/bin/qemu-system-x86_64 --no-reboot -nodefaults -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -machine accel=kvm -kernel /tmp/tmp.jgXFu3YdzN -smp 2 -cpu max,+svm -m 4g -append -pause_filter_test # -initrd /tmp/tmp.jZ1NEkg5BI
20:28:54 DEBUG| [stdout] enabling apic
20:28:54 DEBUG| [stdout] smp: waiting for 1 APs
20:28:54 DEBUG| [stdout] enabling apic
20:28:54 DEBUG| [stdout] setup: CPU 1 online
20:28:54 DEBUG| [stdout] paging enabled
20:28:54 DEBUG| [stdout] cr0 = 80010011
20:28:54 DEBUG| [stdout] cr3 = 10bf000
20:28:54 DEBUG| [stdout] cr4 = 20
20:28:54 DEBUG| [stdout] NPT detected - running all tests with NPT enabled
20:28:54 DEBUG| [stdout] PASS: null
20:28:54 DEBUG| [stdout] PASS: vmrun
20:28:54 DEBUG| [stdout] PASS: ioio
20:28:54 DEBUG| [stdout] PASS: vmrun intercept check
20:28:54 DEBUG| [stdout] PASS: rsm
20:28:54 DEBUG| [stdout] PASS: cr3 read intercept
20:28:54 DEBUG| [stdout] PASS: cr3 read nointercept
20:28:54 DEBUG| [stdout] PASS: cr3 read intercept emulate
20:28:54 DEBUG| [stdout] PASS: dr intercept check
20:28:54 DEBUG| [stdout] PASS: next_rip
20:28:54 DEBUG| [stdout] PASS: msr intercept check
20:28:54 DEBUG| [stdout] PASS: mode_switch
20:28:54 DEBUG| [stdout] PASS: asid_zero
20:28:54 DEBUG| [stdout] PASS: sel_cr0_bug
20:28:54 DEBUG| [stdout] PASS: tsc_adjust
20:28:58 DEBUG| [stdout] Latency VMRUN : max: 4793660 min: 6460 avg: 8899
20:28:58 DEBUG| [stdout] Latency VMEXIT: max: 4766600 min: 3580 avg: 3774
20:28:58 DEBUG| [stdout] PASS: latency_run_exit
20:29:03 DEBUG| [stdout] Latency VMRUN : max: 2145480 min: 8480 avg: 8898
20:29:03 DEBUG| [stdout] Latency VMEXIT: max: 2133480 min: 3580 avg: 3788
20:29:03 DEBUG| [stdout] PASS: latency_run_exit_clean
20:29:03 DEBUG| [stdout] Latency VMLOAD: max: 2443180 min: 260 avg: 275
20:29:03 DEBUG| [stdout] Latency VMSAVE: max: 1971420 min: 240 avg: 263
20:29:03 DEBUG| [stdout] Latency STGI: max: 9740 min: 40 avg: 49
20:29:03 DEBUG| [stdout] Latency CLGI: max: 46560 min: 40 avg: 53
20:29:03 DEBUG| [stdout] PASS: latency_svm_insn
20:29:03 DEBUG| [stdout] PASS: exception with vector 2 not injected
20:29:03 DEBUG| [stdout] PASS: divide overflow exception injected
20:29:03 DEBUG| [stdout] PASS:...
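The autotest log above carries a timestamp, a log level and a "[stdout]" tag on every line, which makes it awkward to compare against the plain output in the bug description. Below is a minimal sketch that strips that prefix and lists the PASS entries. The function names are hypothetical, and the prefix pattern is inferred only from the lines quoted in this report, so other autotest versions may format lines differently.

```python
import re

# Matches "HH:MM:SS LEVEL| " and an optional "[stdout] " tag, as seen in
# the excerpt above (pattern inferred from this report only).
_PREFIX = re.compile(r"^\d{2}:\d{2}:\d{2}\s+[A-Z]+\s*\|\s?(?:\[stdout\]\s?)?")


def strip_autotest_prefix(line):
    """Remove the autotest prefix, if present, and surrounding whitespace."""
    return _PREFIX.sub("", line.rstrip("\n")).strip()


def passed_tests(log_text):
    """Return the text after 'PASS: ' for every passing line, in order."""
    names = []
    for raw in log_text.splitlines():
        line = strip_autotest_prefix(raw)
        if line.startswith("PASS: "):
            names.append(line[len("PASS: "):])
    return names


sample = (
    "20:28:54 DEBUG| [stdout] PASS: null\n"
    "20:28:54 DEBUG| [stdout] PASS: vmrun\n"
    " PASS: ioio\n"
)
print(passed_tests(sample))
```

Running `passed_tests` over two saved logs and diffing the resulting lists is one way to see how far each run got before it stopped.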

Mehmet Basaran (mehmetbasaran) wrote :

There is a good chance that the error I mentioned, "Unhandled exception 6 #UD" (undefined instruction exception), is a different issue and is architecture-specific (AMD EPYC 7713). However, other kernels referenced this bug for the same error.

Rerunning this test to see whether the failure rate is 100%...

Edit: It failed in exactly the same way. This is not a regression introduced in the 2024.09.30 SRU cycle.

20:29:08 DEBUG| [stdout] FAIL: MSR_IA32_LASTBRANCHFROMIP, expected=0x401f4d, actual=0x401f4d
20:29:08 DEBUG| [stdout] PASS: Test that without LBRV enabled, guest LBR state does 'leak' to the host(1)
20:29:08 DEBUG| [stdout] Unhandled exception 6 #UD at ip 0000000000401750
20:29:08 DEBUG| [stdout] error_code=0000 rflags=00010086 cs=00000008
20:29:08 DEBUG| [stdout] rax=00000000004016db rcx=00000000000001dc rdx=80000000004016db rbx=0000000080010015
20:29:08 DEBUG| [stdout] rbp=000000000042fb68 rsi=000000000041776f rdi=0000000000414d40
20:29:08 DEBUG| [stdout] r8=000000000041776f r9=00000000000003f8 r10=000000000000000d r11=0000000000000000
20:29:08 DEBUG| [stdout] r12=0000000000000000 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000
20:29:08 DEBUG| [stdout] cr0=0000000080010011 cr2=0000000000000000 cr3=00000000010bf000 cr4=0000000000040020
20:29:08 DEBUG| [stdout] cr8=0000000000000000
20:29:08 DEBUG| [stdout] STACK: @401750 4001d6 414df0 40bf1e 40bb72 4001d6 414df0 40bf1e 40bb72 4001d6 414df0 40bf1e 40bb72 4001d6 414df0 40bf1e 40bb72 4001d6 414df0 40bf1e
20:29:08 DEBUG| [stdout] FAIL svm
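Since this report now collects two different failures (exception 13 #GP in the description and exception 6 #UD here), a small sketch for pulling the exception line out of a log may help when triaging new results against it. The function name is hypothetical, and the pattern is based only on the two "Unhandled exception" lines quoted in this report.

```python
import re

# Pattern taken from the two "Unhandled exception" lines in this report.
_UNHANDLED = re.compile(
    r"Unhandled exception (\d+) (#\w+) at ip ([0-9a-fA-F]+)"
)


def find_unhandled_exception(log_text):
    """Return vector, mnemonic and ip of the first unhandled exception.

    Returns None when the log contains no such line.
    """
    match = _UNHANDLED.search(log_text)
    if match is None:
        return None
    return {
        "vector": int(match.group(1)),
        "name": match.group(2),
        "ip": int(match.group(3), 16),
    }


original = " Unhandled exception 13 #GP at ip 00000000004027e3"
later = "20:29:08 DEBUG| [stdout] Unhandled exception 6 #UD at ip 0000000000401750"
print(find_unhandled_exception(original))
print(find_unhandled_exception(later))
```

Comparing the vector and mnemonic is enough to tell the two failure modes apart. The instruction pointers are only comparable between runs built from the same BUILD_HEAD.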

Koichiro Den (koichiroden) wrote :

The "Unhandled exception 6 #UD" issue that Mehmet described in comment #4 was also seen with 2024.10.28/noble/linux-lowlatency/6.8.0-50.51.1 on the testflinger amd64 node bomberto.
