Boot error on Jammy on the 6.2 HWE kernel (Lunar) with direct IO if virtual block size < host block size
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
qemu (Ubuntu) | Invalid | Undecided | Unassigned |
Jammy | Fix Released | High | Chengen Du |
Kinetic | Won't Fix | Undecided | Unassigned |
Lunar | Invalid | Undecided | Unassigned |
Bug Description
[Impact]
* Failure to boot VMs on Jammy with the HWE 6.2 kernel
(from Lunar) when using direct IO (e.g., cache=none)
if the virtual block device's block size is smaller
than the host device/file's block size.
* The issue may become increasingly common as storage
with 4k sector size spreads, and as Jammy/22.04 ages
and users move to newer/HWE kernels for newer hardware.
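A minimal sketch of how one might check the block-size half of this condition on a host. It assumes the host device exposes its logical block size through the standard sysfs attribute `queue/logical_block_size`; the function names and the device name `loop0` are hypothetical, and the 512-byte guest default is the one stated in the test plan.

```python
from pathlib import Path

# Default virtual block size, as stated in the test plan.
GUEST_DEFAULT_BLOCK_SIZE = 512


def host_logical_block_size(dev_name, sysfs_root="/sys/block"):
    """Read a host block device's logical block size (bytes) from sysfs."""
    path = Path(sysfs_root) / dev_name / "queue" / "logical_block_size"
    return int(path.read_text().strip())


def guest_smaller_than_host(host_block_size,
                            guest_block_size=GUEST_DEFAULT_BLOCK_SIZE):
    """True when the virtual block size is smaller than the host's."""
    return guest_block_size < host_block_size
```

For example, `guest_smaller_than_host(host_logical_block_size("loop0"))` would report whether a guest using the default block size falls into the case described above. The other parts of the condition (direct IO, kernel and QEMU versions) are not checked here.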
[Fix]
* When the logical block size of the virtual block device
is smaller than that of the host block device backing it,
qemu needs to bounce unaligned buffers when using
direct IO.
In the past, the memory alignment required for direct IO
happened to match the logical block size, leading qemu to
mistakenly use the memory alignment as the block size.
However, kernel commit b1a000d3b8ec (in Linux v6.0)
separated the memory alignment from the logical block
size, which broke that assumption.
As a result, qemu now has an incorrect understanding
of the minimum vector size.
The qemu commit 25474d90aa50 ("block: use the request
length for iov alignment") fixes this (in QEMU v7.2).
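The effect of the fix can be illustrated with a simplified model. This is a sketch and not QEMU's code: the function name, the `(base, length)` tuple layout, and the example values (memory alignment of 512 on the newer kernel, request alignment of 4096) are assumptions chosen for illustration. The check decides whether an I/O vector can go to the host as-is or must be bounced.

```python
def iov_is_aligned(iov, mem_align, request_align, use_request_length):
    """Return True if every (base, length) element may be submitted as-is.

    use_request_length=False models the old behaviour, where lengths are
    checked against the memory alignment; True models the fixed behaviour,
    where lengths are checked against the request (block size) alignment.
    """
    len_align = request_align if use_request_length else mem_align
    for base, length in iov:
        if base % mem_align:      # buffer address must meet memory alignment
            return False
        if length % len_align:    # element length must meet length alignment
            return False
    return True


# Hypothetical vector: two buffers adding up to one 4096-byte host block.
example_iov = [(0x1000, 512), (0x2000, 3584)]

# Assumed newer kernel: memory alignment (512) differs from block size (4096).
old_check = iov_is_aligned(example_iov, 512, 4096, use_request_length=False)
new_check = iov_is_aligned(example_iov, 512, 4096, use_request_length=True)
```

In this model `old_check` is True, so the vector would be submitted without bouncing even though its element lengths are not multiples of the host block size, while `new_check` is False, so the buffers would be bounced. When the memory alignment equals the block size (the older-kernel situation described above), both variants agree.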
[Test Plan]
* Run qemu with a block device (default block size: 512)
backed by a loop device with a block size of 4096 bytes,
without cache (i.e., direct IO) on Jammy with the HWE kernel:
LOOPDEV=
qemu-
-boot order=c -nodefaults -no-user-config \
-nographic -serial stdio -enable-kvm
Expected:
# qemu-system-x86_64 ...
SeaBIOS (version 1.15.0-1)
Booting from Hard Disk...
GRUB_FORCE_
Linux version <...>
...
Actual:
# qemu-system-x86_64 ...
SeaBIOS (version 1.15.0-1)
Booting from Hard Disk...
Boot failed: could not read the boot disk
Booting from Floppy...
Boot failed: could not read the boot disk
No bootable device.
[Where problems could occur]
* Potential regressions would likely manifest in the QEMU
file I/O path, possibly as errors or performance
differences due to the change in alignment detection.
These should be easy to catch in early testing with
a relatively small test matrix:
- (host) kernel: GA (5.15) and HWE (6.2)
- (host) block size 512 and 4096 bytes
An incremental patch for tracing the old/new value
used by QEMU (changed by the fix) will be used for
verification.
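The test matrix above is small enough to enumerate. The sketch below lists the four combinations and marks which one is expected to fail to boot before the fix, based only on the conditions given in this report (kernel at or after v6.0, virtual block size of 512 smaller than the host's). The names are hypothetical, and direct IO is assumed for every row.

```python
from itertools import product

# Host kernels and block sizes from the test matrix above.
KERNELS = {"GA (5.15)": (5, 15), "HWE (6.2)": (6, 2)}
HOST_BLOCK_SIZES = (512, 4096)
GUEST_BLOCK_SIZE = 512  # default virtual block size


def build_matrix():
    """Return (kernel label, host block size, expected failure before fix)."""
    rows = []
    for (label, version), host_bs in product(KERNELS.items(), HOST_BLOCK_SIZES):
        # The kernel change landed in v6.0; the bug needs guest < host size.
        fails_before_fix = version >= (6, 0) and GUEST_BLOCK_SIZE < host_bs
        rows.append((label, host_bs, fails_before_fix))
    return rows


for row in build_matrix():
    print(row)
```

With the fix applied, all four rows are expected to boot; the remaining rows serve as regression checks for the unaffected configurations.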
[Other Info]
* Kinetic is affected (QEMU 7.0 < 7.2) but will not
be fixed, because it reaches EOL in ~2 weeks and
Lunar (the upgrade path) is fixed.
Changed in qemu (Ubuntu): | |
assignee: | nobody → ChengEn, Du (chengendu) |
Changed in qemu (Ubuntu Jammy): | |
assignee: | nobody → ChengEn, Du (chengendu) |
tags: | added: sts-sponsor sts-sru-needed |
tags: | added: se-sponsor-mfo removed: sts-sponsor sts-sru-needed |
description: | updated |
Changed in qemu (Ubuntu Jammy): | |
importance: | Undecided → High |
Changed in qemu (Ubuntu Kinetic): | |
status: | New → Won't Fix |
Changed in qemu (Ubuntu Lunar): | |
status: | New → Invalid |
Changed in qemu (Ubuntu): | |
status: | In Progress → Invalid |
assignee: | ChengEn, Du (chengendu) → nobody |
summary: |
- Align the iov length to the logical block size
+ Boot error on Jammy on the 6.2 HWE kernel (Lunar) with direct IO if
+ virtual block size < host block size |
description: | updated |
Attached is a patch that resolves the issue on Jammy.