5.15.0-1013-oracle: Unable to boot large memory SEV guest without setting swiotlb parameter.

Bug #1983625 reported by Awais Tanveer
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
linux (Ubuntu)   Invalid   Undecided    Unassigned

Bug Description

When launching an SEV Ubuntu 22.04 guest with more than 16876M of memory
(e.g. -m 16877M), Ubuntu kernel 5.15.0-1013-oracle panics unless
swiotlb=262144 is specified in the guest kernel parameters. It seems that the
kernel tries to adjust the swiotlb buffer size, fails to allocate the buffer,
and crashes. With a smaller memory size such as 8G, the guest boots fine.
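
For reference, a rough sketch of the sizes involved. It assumes that one swiotlb slab is 2048 bytes and that the SEV guest code in the upstream 5.15 kernel (sev_setup_arch()) sizes the bounce buffer at 6% of guest memory, clamped between 64 MiB and 1 GiB. Both figures come from reading the upstream source, not from this report, and the function names below are made up for illustration.

```shell
#!/bin/sh
# Assumption: one swiotlb slab is 2048 bytes (IO_TLB_SHIFT = 11 upstream).
SLAB_BYTES=2048

# Convert a swiotlb=<nslabs> kernel parameter value to MiB.
swiotlb_slabs_to_mib() {
    echo $(( $1 * SLAB_BYTES / 1024 / 1024 ))
}

# Estimate the automatically adjusted SEV bounce buffer size in MiB for a
# guest with <mem_mib> MiB of RAM: 6% of memory, clamped to [64, 1024] MiB.
# Assumption: mirrors the upstream 5.15 logic, using integer arithmetic.
sev_swiotlb_estimate_mib() {
    size=$(( $1 * 6 / 100 ))
    if [ "$size" -lt 64 ]; then size=64; fi
    if [ "$size" -gt 1024 ]; then size=1024; fi
    echo "$size"
}

swiotlb_slabs_to_mib 262144       # prints 512
sev_swiotlb_estimate_mib 8192     # prints 491
sev_swiotlb_estimate_mib 16877    # prints 1012
```

Under these assumptions, swiotlb=262144 pins the buffer at 512 MiB, roughly half of what the kernel would otherwise request for a 16877M guest. The kernel works from the memory it actually detects, which is usually a little below the -m value, so the 1011MB in the console log is consistent with this estimate without matching it exactly.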

HOST INFO

Host type : OCI Bare-Metal Server
Server/Machine: ORACLE SERVER E2-2c
CPU model : AMD EPYC 7742 64-Core Processor
Architecture : x86_64
Hostname : atanveer-amd-sme
OS : Oracle Linux Server release 7.9
Kernel : 5.4.17-2136.309.4.el7uek.x86_64 #2 SMP Tue Jun 28 17:35:13 PDT 2022
Hypervisor : QEMU emulator version 4.2.1 (qemu-4.2.1-18.oci.el7)
OVMF/AAVMF : OVMF-1.6.3-1.el7.noarch

QEMU command used to launch the SEV guest:

/bin/qemu-system-x86_64 -name OL22.04-uefi \
-machine q35 \
-enable-kvm \
-cpu host,+host-phys-bits \
-m 16877M \
-smp 8,maxcpus=240 \
-D ./22.04-uefi.log \
-nodefaults \
-monitor stdio \
-vnc 0.0.0.0:0,to=999 \
-vga std \
-drive file=/usr/share/OVMF/OVMF_CODE.pure-efi.fd,index=0,if=pflash,format=raw,readonly \
-drive file=OVMF_VARS.pure-efi.fd.ol22.04,index=1,if=pflash,format=raw \
-device virtio-scsi-pci,id=virtio-scsi-pci0,disable-legacy=on,iommu_platform=true \
-drive file=Ubuntu-22.04-2022.06.16-0-uefi-x86_64.qcow2,if=none,id=local_disk0,format=qcow2,media=disk \
-device ide-hd,drive=local_disk0,id=local_disk1,bootindex=0 \
-qmp tcp:127.0.0.1:3334,server,nowait \
-serial telnet:127.0.0.1:3333,server,nowait \
-device virtio-rng-pci,disable-legacy=on,iommu_platform=true \
-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \
-machine memory-encryption=sev0

Console log:

[ 0.005025] software IO TLB: SWIOTLB bounce buffer size adjusted to 1011MB
[ 0.033881] kvm-guest: KVM setup pv remote TLB flush
[ 0.054931] software IO TLB: Cannot allocate buffer
[ 0.248933] Last level iTLB entries: 4KB 512, 2MB 255, 4MB 127
[ 0.249582] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0
[ 0.317440] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[ 0.317440] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.424952] iommu: DMA domain TLB invalidation policy: lazy mode
[ 0.570923] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 0.571669] software IO TLB: No low mem
.
.
.
.
[ 1.233515] ata1: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010100 irq 28
[ 1.234985] ata2: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010180 irq 28
[ 1.236464] ata3: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010200 irq 28
[ 1.237863] ata4: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010280 irq 28
[ 1.239257] ata5: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010300 irq 28
[ 1.240659] ata6: SATA max UDMA/133 abar m4096@0xc1010000 port 0xc1010380 irq 28
[ 1.555165] ata5: SATA link down (SStatus 0 SControl 300)
[ 1.556661] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1.558232] ata2: SATA link down (SStatus 0 SControl 300)
[ 1.559728] ata4: SATA link down (SStatus 0 SControl 300)
[ 1.560996] ata3: SATA link down (SStatus 0 SControl 300)
[ 1.562134] ata1: SATA link down (SStatus 0 SControl 300)
[ 1.563566] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 6.911450] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 6.912906] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 12.288045] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 12.289904] ata6.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 17.663548] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 242.883993] INFO: task kworker/u480:0:9 blocked for more than 120 seconds.
[ 242.885619] Not tainted 5.15.0-1013-oracle #17-Ubuntu
[ 242.886743] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.888198] task:kworker/u480:0 state:D stack: 0 pid: 9 ppid: 2 flags:0x00004000
[ 242.889703] Workqueue: events_unbound async_run_entry_fn
[ 242.890882] Call Trace:
[ 242.891727] <TASK>

Full console log is attached.
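
To pull only the swiotlb-related messages out of a long console log, a small filter along these lines may help. The function name is hypothetical, and the sample lines are copied from the excerpt above.

```shell
#!/bin/sh
# Print only swiotlb-related kernel messages from a console log on stdin.
swiotlb_messages() {
    grep -E 'software IO TLB|SWIOTLB'
}

# Demo on three lines taken from the console log excerpt.
printf '%s\n' \
    '[ 0.005025] software IO TLB: SWIOTLB bounce buffer size adjusted to 1011MB' \
    '[ 0.033881] kvm-guest: KVM setup pv remote TLB flush' \
    '[ 0.054931] software IO TLB: Cannot allocate buffer' \
    | swiotlb_messages
```

With a saved log, the same function can be fed from a file via input redirection.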

Awais Tanveer (awaistanveer) wrote:

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1983625

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

Awais Tanveer (awaistanveer) wrote:

Cannot run apport-collect because the SEV guest instance fails to boot.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed

Fabio Augusto Miranda Martins (fabio.martins) wrote:

This is a grub bug and it is being tracked here:

[SRU] unable to boot guest with large memory when SEV is enabled on host
https://bugs.launchpad.net/ubuntu/+source/grub2-unsigned/+bug/1989446

Changed in linux (Ubuntu):
status: Confirmed → Invalid