Boot error on Jammy on the 6.2 HWE kernel (Lunar) with direct IO if virtual block size < host block size

Bug #2025591 reported by Chengen Du
This bug affects 1 person

Affects        Status        Importance  Assigned to  Milestone
qemu (Ubuntu)  Invalid       Undecided   Unassigned
Jammy          Fix Released  High        Chengen Du
Kinetic        Won't Fix     Undecided   Unassigned
Lunar          Invalid       Undecided   Unassigned

Bug Description

[Impact]

 * Failure to boot VMs on Jammy with the HWE 6.2 kernel
   (from Lunar) when using direct IO (e.g., cache=none)
   if the virtual block device's block size is smaller
   than the host device/file's block size.

 * The issue may become increasingly common as storage
   with 4k sector size spreads, and as Jammy/22.04 ages
   and users move to newer/HWE kernels for newer hardware.

[Fix]

 * When the logical block size of the virtual block device
   is smaller than that of the block device backing it on
   the host, qemu needs to bounce unaligned buffers when
   using direct IO.

   In the past, the memory alignment required for direct IO
   happened to match the logical block size, leading qemu
   to mistakenly use the memory alignment as the block size.

   However, kernel commit b1a000d3b8ec (in Linux v6.0)
   changed this by separating memory alignment from the
   logical block size.

   As a result, qemu now has an incorrect understanding
   of the minimum vector (iov) size.

   The qemu commit 25474d90aa50 ("block: use the request
   length for iov alignment") fixes this (in QEMU v7.2).
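
   The effect of the change can be illustrated with a toy model
   of the alignment check. This is only a sketch in shell
   arithmetic, not QEMU's C code: the function names
   iov_aligned_old and iov_aligned_new are made up for
   illustration, a buffer is reduced to an address and a length,
   and it is assumed that the old check compared the iov length
   against the memory alignment while the fixed check compares
   it against the request alignment (logical block size).

```shell
#!/bin/bash
# Toy model of the iov alignment check; NOT the QEMU source.
#   mem_align: memory alignment required for buffer addresses
#   req_align: request alignment (logical block size on the host)
#   addr, len: address and length of one buffer in the I/O vector

# Old behaviour (assumed): address and length are both checked
# against mem_align; req_align is accepted but ignored.
iov_aligned_old() {
    local mem_align=$1 req_align=$2 addr=$3 len=$4
    [ $((addr % mem_align)) -eq 0 ] && [ $((len % mem_align)) -eq 0 ]
}

# Fixed behaviour (assumed): the length is checked against req_align.
iov_aligned_new() {
    local mem_align=$1 req_align=$2 addr=$3 len=$4
    [ $((addr % mem_align)) -eq 0 ] && [ $((len % req_align)) -eq 0 ]
}

# A 512-byte guest request in a well-aligned buffer, with a host block
# size of 4096 and a memory alignment of 512.
if iov_aligned_old 512 4096 4096 512; then
    echo "old: treated as aligned, passed to the host unchanged"
fi
if ! iov_aligned_new 512 4096 4096 512; then
    echo "new: treated as unaligned, so it would be bounced"
fi
```

   With both alignments equal (512/512 or 4096/4096) the two
   functions agree, which is why only the mixed case changes.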

[Test Plan]

 * Run qemu with a block device (default block size: 512)
   backed by a loop device with block size of 4096 bytes,
   without cache (i.e., direct IO) on Jammy with the HWE kernel:

   LOOPDEV=$(losetup --find --show --sector-size 4096 jammy.raw)

   qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none \
     -boot order=c -nodefaults -no-user-config \
     -nographic -serial stdio -enable-kvm

   Expected:

 # qemu-system-x86_64 ...
 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>
 ...

   Actual:

 # qemu-system-x86_64 ...
 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 Boot failed: could not read the boot disk

 Booting from Floppy...
 Boot failed: could not read the boot disk

 No bootable device.
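
   Before running the test it can help to confirm that the
   precondition holds, i.e. that the loop device really reports
   a 4096-byte logical block size. A minimal sketch, assuming
   the usual Linux sysfs layout
   (/sys/block/<dev>/queue/logical_block_size); the helper names
   and the SYSFS_ROOT override are hypothetical and exist only
   to make the functions easy to try out:

```shell
#!/bin/bash
# Print the logical block size (bytes) the kernel reports for a block
# device such as "loop0" or "/dev/loop0".
# SYSFS_ROOT is a hypothetical override for experiments; default /sys.
logical_block_size() {
    local dev=${1#/dev/}
    cat "${SYSFS_ROOT:-/sys}/block/$dev/queue/logical_block_size"
}

# Succeed if the guest block size is smaller than the host one, which is
# the condition described in this bug.
# $1: host device, $2: guest block size (512 by default, as in the plan)
guest_smaller_than_host() {
    local host_bs guest_bs=${2:-512}
    host_bs=$(logical_block_size "$1") || return 2
    [ "$guest_bs" -lt "$host_bs" ]
}
```

   For example: guest_smaller_than_host "$LOOPDEV" && echo "reproducer applies".
   The same value should also be reported by blockdev --getss "$LOOPDEV".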

[Where problems could occur]

 * Potential regressions would likely manifest in the QEMU
   file I/O path, possibly as errors or performance
   differences due to the change in alignment detection.

   These should be easy to catch in early testing with
   a relatively small test matrix:
   - (host) kernel: GA (5.15) and HWE (6.2)
   - (host) block size 512 and 4096 bytes

   An incremental patch for tracing the old/new value
   used by QEMU (changed by the fix) will be used for
   verification/debugging purposes.

[Other Info]

 * Kinetic is affected (QEMU 7.0 < 7.2) but will not
   be fixed, as it reaches EOL in ~2 weeks and Lunar
   (the upgrade path) is already fixed.

Chengen Du (chengendu)
Changed in qemu (Ubuntu):
assignee: nobody → ChengEn, Du (chengendu)
Changed in qemu (Ubuntu Jammy):
assignee: nobody → ChengEn, Du (chengendu)
Chengen Du (chengendu) wrote :

Attached is a patch that resolves the issue on Jammy.

Changed in qemu (Ubuntu Jammy):
status: New → In Progress
Changed in qemu (Ubuntu):
status: New → In Progress
Chengen Du (chengendu)
tags: added: sts-sponsor sts-sru-needed
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "jammy_use_the_request_length_for_iov_alignment.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
tags: added: se-sponsor-mfo
removed: sts-sponsor sts-sru-needed
description: updated
Changed in qemu (Ubuntu Jammy):
importance: Undecided → High
Changed in qemu (Ubuntu Kinetic):
status: New → Won't Fix
Changed in qemu (Ubuntu Lunar):
status: New → Invalid
Changed in qemu (Ubuntu):
status: In Progress → Invalid
assignee: ChengEn, Du (chengendu) → nobody
summary: - Align the iov length to the logical block size
+ Boot error on Jammy on the 6.2 HWE kernel (Lunar) with direct IO if
+ virtual block size < host block size
description: updated
Mauricio Faria de Oliveira (mfo) wrote :

Detailed test steps with output:

 lxc launch --vm ubuntu:jammy jammy-vm
 lxc shell jammy-vm

 apt install --yes --no-install-recommends qemu-system-x86 qemu-utils

 wget https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64-disk-kvm.img -O jammy.img
 qemu-img convert jammy.img jammy.raw

 LOOPDEV=$(losetup --find --show --sector-size 4096 jammy.raw)
 qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm

Default ("GA") kernel:

 # uname -r
 5.15.0-1035-kvm

 # qemu-system-x86_64 ...
 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>
 ...

 Ubuntu 22.04.2 LTS ubuntu ttyS0

 ubuntu login:

HWE kernel (lunar):

 # add-apt-repository -y -p proposed
 # apt install --yes --no-install-recommends linux-generic-hwe-22.04-edge
 # reboot

 # uname -r
 6.2.0-23-generic

 # qemu-system-x86_64 ...
 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 Boot failed: could not read the boot disk

 Booting from Floppy...
 Boot failed: could not read the boot disk

 No bootable device.

With the patch:

 # qemu-system-x86_64 ...
 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>
 ...
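
The pass/fail decision in the runs above can be automated by looking for the marker lines in a captured serial log. A sketch only, assuming the output was saved to a file (e.g. with tee) and that the SeaBIOS and kernel messages match those quoted in this comment; classify_boot_log is a made-up helper name:

```shell
#!/bin/bash
# Classify a captured QEMU serial log using the messages quoted above.
# Prints one of: failed, booted, unknown.
classify_boot_log() {
    local log=$1
    if grep -q 'Boot failed: could not read the boot disk' "$log"; then
        echo failed
    elif grep -q 'Linux version' "$log"; then
        echo booted
    else
        echo unknown
    fi
}
```

For example: classify_boot_log qemu.log (where qemu.log is whatever file the serial output was written to).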

Mauricio Faria de Oliveira (mfo) wrote :

Notes on the related linux/qemu commits, and Ubuntu releases with them, for documentation purposes.

The kernel commit introducing the issue on QEMU is in Linux v6.0:

 $ git remote get-url origin
 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

 $ git describe --contains b1a000d3b8ec
 v6.0-rc1~177^2~69

It's available on Lunar (v6.2-based) and later:

 $ rmadison --arch source linux -s jammy,kinetic,lunar,mantic
  linux | 5.15.0-25.25 | jammy | source
  linux | 5.19.0-21.21 | kinetic | source
  linux | 6.2.0-20.20 | lunar | source
  linux | 6.3.0-7.7 | mantic | source

The 6.2 kernel is available on Jammy via HWE kernel from Lunar:

 $ rmadison --arch source linux-hwe-6.2
  linux-hwe-6.2 | 6.2.0-23.23~22.04.1 | jammy-proposed | source
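
Since the kernel change is in v6.0, a rough way to tell whether a host kernel may include it is to compare the running version against 6.0. This is a sketch under assumptions: kernel_at_least is a hypothetical helper, it only looks at the major.minor prefix of `uname -r`, and it cannot know about backports or reverts in a distribution kernel.

```shell
#!/bin/bash
# Succeed if a kernel release string is at least MAJOR.MINOR.
# $1: wanted major, $2: wanted minor, $3: release (default: `uname -r`)
kernel_at_least() {
    local want_major=$1 want_minor=$2 rel=${3:-$(uname -r)}
    local major=${rel%%.*}          # "6.2.0-23-generic" -> "6"
    local rest=${rel#*.}            # -> "2.0-23-generic"
    local minor=${rest%%[!0-9]*}    # -> "2"
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example: kernel_at_least 6 0 && echo "kernel may include b1a000d3b8ec".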

The QEMU commit addressing the issue is in QEMU v7.2.0:

 $ git remote get-url origin
 https://gitlab.com/qemu-project/qemu.git
 $ git describe --contains 25474d90aa50
 v7.2.0-rc0~65^2~5

It's available in Lunar and later:

 $ rmadison --arch source qemu
 ...
  qemu | 1:6.2+dfsg-2ubuntu6 | jammy | source
  qemu | 1:6.2+dfsg-2ubuntu6.11 | jammy-security | source
  qemu | 1:6.2+dfsg-2ubuntu6.11 | jammy-updates | source
  qemu | 1:7.0+dfsg-7ubuntu2 | kinetic | source
  qemu | 1:7.0+dfsg-7ubuntu2.6 | kinetic-security | source
  qemu | 1:7.0+dfsg-7ubuntu2.6 | kinetic-updates | source
  qemu | 1:7.2+dfsg-5ubuntu2 | lunar | source
  qemu | 1:7.2+dfsg-5ubuntu2.2 | lunar-security | source
  qemu | 1:7.2+dfsg-5ubuntu2.2 | lunar-updates | source
  qemu | 1:7.2+dfsg-5ubuntu3 | mantic | source
  qemu | 1:8.0.2+dfsg-2ubuntu1 | mantic-proposed | source

Note that Kinetic is affected (QEMU 7.0), but it's very close to EOL:

 $ ubuntu-distro-info --days=eol --series=kinetic
 17

Mauricio Faria de Oliveira (mfo) wrote :

Tests with the incremental debug patch (below)
to determine which scenarios have any changes
with the fix patch applied:

Results:

The only behavior change occurs with the HWE/6.2 kernel
when a 4096-byte block size is used on the host.

To be clear, only the *broken* case changed, and is now *fixed*:

 GA/5.15 kernel, 512 bytes: old = 512, new = 512 (no change)
 GA/5.15 kernel, 4096 bytes: old = 4096, new = 4096 (no change)

 HWE/6.2 kernel, 512 bytes: old = 512, new = 512 (no change)
 HWE/6.2 kernel, 4096 bytes: old = 512, new = 4096 (changed; fixed)

More details below.

Debug patch:
(qemu trace event showing the old/new value changed by [1])

 $ quilt diff
 Index: qemu-6.2+dfsg/block/io.c
 ===================================================================
 --- qemu-6.2+dfsg.orig/block/io.c
 +++ qemu-6.2+dfsg/block/io.c
 @@ -3244,6 +3244,8 @@ bool bdrv_qiov_is_aligned(BlockDriverSta
      size_t alignment = bdrv_min_mem_align(bs);
      size_t len = bs->bl.request_alignment;

 +    trace_bdrv_qiov_is_aligned(alignment, len);
 +
      for (i = 0; i < qiov->niov; i++) {
          if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
              return false;
 Index: qemu-6.2+dfsg/block/trace-events
 ===================================================================
 --- qemu-6.2+dfsg.orig/block/trace-events
 +++ qemu-6.2+dfsg/block/trace-events
 @@ -14,6 +14,7 @@ blk_root_detach(void *child, void *blk,
  bdrv_co_preadv_part(void *bs, int64_t offset, int64_t bytes, unsigned int flags) "bs %p offset %" PRId64 " bytes %" PRId64 " flags 0x%x"
  bdrv_co_pwritev_part(void *bs, int64_t offset, int64_t bytes, unsigned int flags) "bs %p offset %" PRId64 " bytes %" PRId64 " flags 0x%x"
  bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int64_t bytes, int flags) "bs %p offset %" PRId64 " bytes %" PRId64 " flags 0x%x"
 +bdrv_qiov_is_aligned(size_t alignment, size_t len) "alignment %zu len %zu"
  bdrv_co_do_copy_on_readv(void *bs, int64_t offset, int64_t bytes, int64_t cluster_offset, int64_t cluster_bytes) "bs %p offset %" PRId64 " bytes %" PRId64 " cluster_offset %" PRId64 " cluster_bytes %" PRId64
  bdrv_co_copy_range_from(void *src, int64_t src_offset, void *dst, int64_t dst_offset, int64_t bytes, int read_flags, int write_flags) "src %p offset %" PRId64 " dst %p offset %" PRId64 " bytes %" PRId64 " rw flags 0x%x 0x%x"
  bdrv_co_copy_range_to(void *src, int64_t src_offset, void *dst, int64_t dst_offset, int64_t bytes, int read_flags, int write_flags) "src %p offset %" PRId64 " dst %p offset %" PRId64 " bytes %" PRId64 " rw flags 0x%x 0x%x"

Testing:

 # qemu-system-x86_64 -trace help | grep bdrv_qiov_is_aligned
 bdrv_qiov_is_aligned

Details:

GA/5.15 kernel, 512 bytes: old = 512, new = 512.

 # uname -r
 5.15.0-1035-kvm

 # LOOPDEV=$(losetup --find --show jammy.raw)
 # qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm \
   -trace bdrv_qiov_is_aligned 2>&1 | tee qemu-5.15-512.log
 ...

 # grep 'login:' qemu-5.15-512.log
  login: bdrv_qiov_is_aligned alignment 512 len 512

 # cat qemu-5.15-512.log | grep -o 'bdrv_qiov_is_aligned alignment [0-9]\+ len [0-9]\+' | s...
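
Independently of the (truncated) command above, one possible way to summarize such a log is to count each distinct alignment/len pair emitted by the debug trace event. A sketch, assuming the trace line format added by the debug patch; trace_pairs is a hypothetical helper name:

```shell
#!/bin/bash
# Summarize bdrv_qiov_is_aligned trace events in a log file.
# Output: one line per distinct pair, as "<count> <alignment> <len>".
# In the debug patch, "alignment" is the old value and "len" the new one.
trace_pairs() {
    grep -o 'bdrv_qiov_is_aligned alignment [0-9]\+ len [0-9]\+' "$1" |
        awk '{ n[$3 " " $5]++ } END { for (k in n) print n[k], k }' |
        sort -k2,2n -k3,3n
}
```

For example: trace_pairs qemu-5.15-512.log. A line whose last two fields differ indicates a case where the fix changes the value used by QEMU.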


Mauricio Faria de Oliveira (mfo) wrote :

Hi ChengEn,

Thanks for the bug report, analysis, and debdiff!

Things looked mostly good! I would have asked for some changes
in the SRU template and debdiff, but as you mentioned this is
high priority and time sensitive (which I do agree with, after
reviewing our internal reference), I went ahead and did those.

I've attached the updated debdiff for reference and run the tests
detailed in the previous comments.

Uploaded to Jammy!

Mauricio Faria de Oliveira (mfo) wrote :

Ah, the package with the patch has built correctly in PPA [1]
for amd64, arm64, armhf, ppc64el, and s390x.

[1] https://launchpad.net/~mfo/+archive/ubuntu/lp2025591

Robie Basak (racb) wrote : Please test proposed package

Hello ChengEn, or anyone else affected,

Accepted qemu into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:6.2+dfsg-2ubuntu6.12 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Mauricio Faria de Oliveira (mfo) wrote :

Verification done for jammy-proposed.

QEMU can boot successfully from a file backed by 4096-byte sectors on the 6.2 HWE kernel.
No regressions with the other options (i.e., 512 on the 6.2 kernel and 4096/512 on the 5.15 kernel).

 $ lsb_release -cs
 jammy

 $ apt policy qemu-system-x86
 qemu-system-x86:
   Installed: 1:6.2+dfsg-2ubuntu6.12
   Candidate: 1:6.2+dfsg-2ubuntu6.12
   Version table:
  *** 1:6.2+dfsg-2ubuntu6.12 500
  500 http://security.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
  500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
  100 /var/lib/dpkg/status
 ...

Detailed test steps in comment #3.

6.2 HWE kernel:
---

 # uname -r
 6.2.0-23-generic

 # LOOPDEV=$(losetup --find --show --sector-size 4096 jammy.raw)
 # qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm

 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>

 # losetup -d $LOOPDEV
 # LOOPDEV=$(losetup --find --show jammy.raw)
 # qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm

 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>

5.15 GA kernel:
---

 # uname -r
 5.15.0-1035-kvm

 # LOOPDEV=$(losetup --find --show --sector-size 4096 jammy.raw)
 # qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm

 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>

 # losetup -d $LOOPDEV
 # LOOPDEV=$(losetup --find --show jammy.raw)
 # qemu-system-x86_64 -drive file=$LOOPDEV,format=raw,cache=none -boot order=c -nodefaults -no-user-config -nographic -serial stdio -enable-kvm

 SeaBIOS (version 1.15.0-1)
 Booting from Hard Disk...
 GRUB_FORCE_PARTUUID set, initrdless boot failed. Attempting with initrd.
 Linux version <...>

tags: added: verification-done verification-done-jammy
removed: se-sponsor-mfo verification-needed verification-needed-jammy
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:6.2+dfsg-2ubuntu6.12)

All autopkgtests for the newly accepted qemu (1:6.2+dfsg-2ubuntu6.12) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

edk2/2022.02-3ubuntu0.22.04.1 (arm64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Mauricio Faria de Oliveira (mfo) wrote :

edk2/2022.02-3ubuntu0.22.04.1 (arm64)

817s FAIL: test_ovmf_4m_secboot (__main__.BootToShellTest)
...
817s pexpect.exceptions.TIMEOUT: Timeout exceeded.

Retrying.

Mauricio Faria de Oliveira (mfo) wrote :

Passed.

edk2 [jammy/arm64]

2022.02-3ubuntu0.22.04.1 qemu/1:6.2+dfsg-2ubuntu6.12 2023-07-06 12:15:13 UTC 0h 15m 11s mfo pass

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:6.2+dfsg-2ubuntu6.12

---------------
qemu (1:6.2+dfsg-2ubuntu6.12) jammy; urgency=medium

  [ Chengen Du ]
  * d/p/u/lp2025591-block-use-the-request-length-for-iov-alignment.patch:
    Fix boot error on the HWE 6.2 kernel with direct IO (eg, cache=none)
    if the logical block size is smaller than in the host (LP: #2025591)

 -- Mauricio Faria de Oliveira <email address hidden> Mon, 03 Jul 2023 18:00:25 -0300

Changed in qemu (Ubuntu Jammy):
status: Fix Committed → Fix Released
Andreas Hasenack (ahasenack) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
