Focal linux-azure: Vm crash on Dv5/Ev5

Bug #1950462 reported by Tim Gardner
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Released
Medium
Tim Gardner
linux-azure (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Released
Medium
Tim Gardner

Bug Description

SRU Justification

[Impact]

We are seeing a below crash for Nested VM scenario in Dv5/Ev5.

[ 284.769421] ------------[ cut here ]------------
[ 284.769422] KVM: accessing unsupported EVMCS field 2032
[ 284.769443] WARNING: CPU: 30 PID: 8426 at /build/linux-azure-5.4-YivnXz/linux-azure-5.4-5.4.0/arch/x86/kvm/vmx/evmcs.h:85 evmcs_write64+0x65/0x70 [kvm_intel]
[ 284.769443] Modules linked in: vhost_net vhost tap ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_filter xt_MASQUERADE iptable_nat nf_nat bridge stp llc xt_owner xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_security bpfilter udf crc_itu_t nls_iso8859_1 kvm_intel kvm serio_raw hv_balloon joydev sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul hid_hyperv crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd hyperv_fb cfbfillrect glue_helper cfbimgblt hid hv_netvsc hv_utils hyperv_keyboard cfbcopyarea
[ 284.769463] CPU: 30 PID: 8426 Comm: qemu-system-x86 Not tainted 5.4.0-1062-azure #65~18.04.1-Ubuntu
[ 284.769464] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 07/22/2021
[ 284.769467] RIP: 0010:evmcs_write64+0x65/0x70 [kvm_intel]
[ 284.769469] Code: c2 f7 d0 21 81 38 03 00 00 5d c3 80 3d 1c 32 03 00 00 75 f5 48 89 fe 48 c7 c7 f8 63 57 c0 c6 05 09 32 03 00 01 e8 eb d1 53 cd <0f> 0b 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 8b 07 80 b8 ea
[ 284.769469] RSP: 0018:ffffb75a03f0fb68 EFLAGS: 00010282
[ 284.769471] RAX: 0000000000000000 RBX: ffff8e126a9e8000 RCX: 0000000000000006
[ 284.769471] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8e12dfb96580
[ 284.769472] RBP: ffffb75a03f0fb68 R08: 000000000000022b R09: 0000000000000004
[ 284.769472] R10: ffffb75a03f0fcf8 R11: 0000000000000001 R12: 000000000000001e
[ 284.769473] R13: fffffe00005fd000 R14: 0000000000000000 R15: 0000000000000000
[ 284.769474] FS: 00007f4bc4c09700(0000) GS:ffff8e12dfb80000(0000) knlGS:0000000000000000
[ 284.769476] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 284.769477] CR2: 00007f3fddb8eba0 CR3: 0000003f69dbe002 CR4: 0000000000372ee0
[ 284.769478] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 284.769478] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 284.769479] Call Trace:
[ 284.769485] vmx_vcpu_load_vmcs+0x2f9/0x440 [kvm_intel]
[ 284.769488] vmx_vcpu_load+0x47/0x200 [kvm_intel]
[ 284.769493] ? __memcg_kmem_charge+0x87/0x150
[ 284.769495] ? __alloc_pages_nodemask+0x246/0x320
[ 284.769499] vmx_create_vcpu+0x362/0x720 [kvm_intel]
[ 284.769500] ? __get_free_pages+0x11/0x40
[ 284.769504] ? alloc_loaded_vmcs+0xa2/0x120 [kvm_intel]
[ 284.769507] ? vmx_create_vcpu+0x362/0x720 [kvm_intel]
[ 284.769528] kvm_arch_vcpu_create+0x4f/0x70 [kvm]
[ 284.769538] kvm_vm_ioctl+0x2e2/0x980 [kvm]
[ 284.769542] do_vfs_ioctl+0xa9/0x640
[ 284.769545] ? __switch_to_asm+0x40/0x70
[ 284.769546] ? __switch_to_asm+0x34/0x70
[ 284.769547] ? __switch_to_asm+0x40/0x70
[ 284.769548] ? __switch_to_asm+0x34/0x70
[ 284.769550] ? __switch_to_asm+0x40/0x70
[ 284.769551] ? __switch_to_asm+0x34/0x70
[ 284.769552] ? __switch_to_asm+0x40/0x70
[ 284.769553] ? __switch_to_asm+0x34/0x70
[ 284.769554] ? __switch_to_asm+0x40/0x70
[ 284.769555] ksys_ioctl+0x75/0x80
[ 284.769556] ? __switch_to_asm+0x34/0x70
[ 284.769557] __x64_sys_ioctl+0x1a/0x20
[ 284.769559] do_syscall_64+0x5e/0x200
[ 284.769561] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 284.769562] RIP: 0033:0x7f4bcf01d317
[ 284.769563] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[ 284.769564] RSP: 002b:00007f4bc4c08888 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 284.769565] RAX: ffffffffffffffda RBX: 000000000000ae41 RCX: 00007f4bcf01d317
[ 284.769566] RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 000000000000000b
[ 284.769566] RBP: 0000000000000000 R08: 00005596f71e0ec0 R09: 00005596f896c170
[ 284.769567] R10: 00005596f77fb8e0 R11: 0000000000000246 R12: 00005596f892ae90
[ 284.769568] R13: 0000000000000000 R14: 00005596f896c170 R15: 00007fffa5dffce0
[ 284.769569] ---[ end trace 481983b25fa8f1f4 ]---
[ 284.795366] set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.

[Fix]

55d2eba8e7cd ("jump_label: Fix usage in module __init")
064eedf2c50f ("KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again")

[Test Case]

Create a nested VM on an Azure Dv5/Ev5 instance.

[Where things could go wrong]

KVM instance creation could fail in other unusual ways.

[Other info]

SF: #00322790

CVE References

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1950462

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Focal):
status: New → Incomplete
tags: added: focal
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Focal):
status: Incomplete → In Progress
importance: Undecided → Medium
assignee: nobody → Tim Gardner (timg-tpi)
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux-azure (Ubuntu):
status: New → Fix Released
Changed in linux-azure (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Tim Gardner (timg-tpi)
Tim Gardner (timg-tpi)
tags: added: bot-stop-nagging
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.4.0-92.103 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Revision history for this message
Tim Gardner (timg-tpi) wrote :

Microsoft tested and approved.

tags: added: verification-done-focal
removed: verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (19.0 KiB)

This bug was fixed in the package linux - 5.4.0-92.103

---------------
linux (5.4.0-92.103) focal; urgency=medium

  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync update-dkms-versions helper
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.29)

  * CVE-2021-4002
    - tlb: mmu_gather: add tlb_flush_*_range APIs
    - hugetlbfs: flush TLBs correctly after huge_pmd_unshare

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches

  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
    - KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
    - jump_label: Fix usage in module __init

  * Support builtin revoked certificates (LP: #1932029)
    - Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about
      cert lists that aren't present."
    - integrity: Move import of MokListRT certs to a separate routine
    - integrity: Load certs from the EFI MOK config table
    - certs: Add ability to preload revocation certs
    - integrity: Load mokx variables into the blacklist keyring
    - certs: add 'x509_revocation_list' to gitignore
    - SAUCE: Dump stack when X.509 certificates cannot be loaded
    - [Packaging] build canonical-revoked-certs.pem from branch/arch certs
    - [Packaging] Revoke 2012 UEFI signing certificate as built-in
    - [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys

  * Support importing mokx keys into revocation list from the mok table
    (LP: #1928679)
    - efi: Support for MOK variable config table
    - efi: mokvar-table: fix some issues in new code
    - efi: mokvar: add missing include of asm/early_ioremap.h
    - efi/mokvar: Reserve the table only if it is in boot services data
    - SAUCE: integrity: add informational messages when revoking certs

  * Support importing mokx keys into revocation list from the mok table
    (LP: #1928679) // CVE-2020-26541 when certificates are revoked via
    MokListXRT.
    - SAUCE: integrity: Load mokx certs from the EFI MOK config table

  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
    - ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
    - ARM: 9134/1: remove duplicate memcpy() definition
    - ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
    - ARM: 9141/1: only warn about XIP address when not compile testing
    - ipv6: use siphash in rt6_exception_hash()
    - ipv4: use siphash instead of Jenkins in fnhe_hashfun()
    - usbnet: sanity check for maxpacket
    - usbnet: fix error return code in usbnet_probe()
    - Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
    - ata: sata_mv: Fix the error handling of mv_chip_id()
    - nfc: port100: fix using -ERRNO as command type mask
    - net/tls: Fix flipped sign in tls_err_abort() calls
    - mmc: vub300: fix control-message timeouts
    - mmc: cqhci: clear HALT state after CQE enable
    - mmc: dw_mmc: exynos: fix the finding clock sample value
    - mmc: sdhci: Map more voltage level to SDHCI_POWER_330
    - mmc: sdhci-esdhc-imx: clear the buffe...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (19.7 KiB)

This bug was fixed in the package linux-azure - 5.4.0-1065.68

---------------
linux-azure (5.4.0-1065.68) focal; urgency=medium

  * focal/linux-azure: 5.4.0-1065.68 -proposed tracker (LP: #1952290)

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] azure: enable CONFIG_DEBUG_INFO_BTF

  * Support builtin revoked certificates (LP: #1932029)
    - [Config] azure: set CONFIG_SYSTEM_REVOCATION_KEYS

  * Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
    (LP: #1952621)
    - PCI/sysfs: Convert "config" to static attribute

  * linux-azure: add Icelake servers support in no-HWP mode to
    cpufreq/intel_pstate driver (LP: #1952234)
    - cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode

  [ Ubuntu: 5.4.0-92.103 ]

  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)
  * Packaging resync (LP: #1786013)
    - [Packaging] resync update-dkms-versions helper
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.29)
  * CVE-2021-4002
    - tlb: mmu_gather: add tlb_flush_*_range APIs
    - hugetlbfs: flush TLBs correctly after huge_pmd_unshare
  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches
  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
    - KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
    - jump_label: Fix usage in module __init
  * Support builtin revoked certificates (LP: #1932029)
    - Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about
      cert lists that aren't present."
    - integrity: Move import of MokListRT certs to a separate routine
    - integrity: Load certs from the EFI MOK config table
    - certs: Add ability to preload revocation certs
    - integrity: Load mokx variables into the blacklist keyring
    - certs: add 'x509_revocation_list' to gitignore
    - SAUCE: Dump stack when X.509 certificates cannot be loaded
    - [Packaging] build canonical-revoked-certs.pem from branch/arch certs
    - [Packaging] Revoke 2012 UEFI signing certificate as built-in
    - [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys
  * Support importing mokx keys into revocation list from the mok table
    (LP: #1928679)
    - efi: Support for MOK variable config table
    - efi: mokvar-table: fix some issues in new code
    - efi: mokvar: add missing include of asm/early_ioremap.h
    - efi/mokvar: Reserve the table only if it is in boot services data
    - SAUCE: integrity: add informational messages when revoking certs
  * Support importing mokx keys into revocation list from the mok table
    (LP: #1928679) // CVE-2020-26541 when certificates are revoked via
    MokListXRT.
    - SAUCE: integrity: Load mokx certs from the EFI MOK config table
  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
    - ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
    - ARM: 9134/1: remove duplicate memcpy() definition
    - ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
    - ARM: 9141/1: only warn about XIP address when not compile testing
    - ipv6: use siphash in rt6_exception_hash()
    - i...

Changed in linux-azure (Ubuntu Focal):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers