5.15.0-85 live migration regression

Bug #2036675 reported by Jay Vosburgh
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Incomplete
Undecided
Unassigned
Jammy
Fix Released
High
Chengen Du

Bug Description

Fixes added for LP 2032164 [0] to resolve an issue in live migration have
unfortunately introduced a regression, causing a previously working live
migration pattern to fail when tested with the 5.15.0-85 kernel from -proposed.

Specifically, live migration from a PKRU-enabled host running a kernel older
than 5.15.0-85 to a host running the 5.15.0-85 kernel will fail. The
destination can be either with or without PKRU; both cases fail, although
in different ways (one hangs, the other fails due to a PCID flag issue).

The commits in question are

commit fa9225d64f215e8109de10f6b6c7a08f033d0ec0
Author: Dr. David Alan Gilbert <email address hidden>
Date: Mon Aug 21 14:47:28 2023 +0800

    KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES

commit 27a189b881278c8ad9c16b0ee05668d724352733
Author: Leonardo Bras <email address hidden>
Date: Mon Aug 21 14:47:27 2023 +0800

    x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0

[0] https://bugs.launchpad.net/bugs/2032164

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2036675

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Chengen Du (chengendu)
Changed in linux (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → Chengen Du (chengendu)
Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
importance: Undecided → High
status: In Progress → Fix Committed
Revision history for this message
Thomas Lamprecht (t-lamprecht) wrote (last edit ):

FYI, we solved a similar sounding issue by clearing the PKRU bit if it was set in the to-be-restored XSAVE blob, but not available in the current vCPU feature flags:

https://git.proxmox.com/?p=pve-kernel.git;a=blob;f=patches/kernel/0008-kvm-xsave-set-mask-out-PKRU-bit-in-xfeatures-if-vCPU.patch;h=dda75b87de9a18ecb3a51d01d487727db77825e7;hb=1559d22f3510d8532aa756fe5d69c74c76211e1e

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (56.7 KiB)

This bug was fixed in the package linux - 5.15.0-86.96

---------------
linux (5.15.0-86.96) jammy; urgency=medium

  * jammy/linux: 5.15.0-86.96 -proposed tracker (LP: #2036575)

  * 5.15.0-85 live migration regression (LP: #2036675)
    - Revert "KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES"
    - Revert "x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0"

  * Regression for ubuntu_bpf test build on Jammy 5.15.0-85.95 (LP: #2035181)
    - selftests/bpf: fix static assert compilation issue for test_cls_*.c

  * `refcount_t: underflow; use-after-free.` on hidon w/ 5.15.0-85-generic
    (LP: #2034447)
    - crypto: rsa-pkcs1pad - Use helper to set reqsize

linux (5.15.0-85.95) jammy; urgency=medium

  * jammy/linux: 5.15.0-85.95 -proposed tracker (LP: #2033821)

  * Please enable Renesas RZ platform serial installer (LP: #2022361)
    - [Config] enable hihope RZ/G2M serial console
    - [Config] Mark sh-sci as built-in

  * Request backport of xen timekeeping performance improvements (LP: #2033122)
    - x86/xen/time: prefer tsc as clocksource when it is invariant

  * kdump doesn't work with UEFI secure boot and kernel lockdown enabled on
    ARM64 (LP: #2033007)
    - [Config]: Enable CONFIG_KEXEC_IMAGE_VERIFY_SIG
    - kexec, KEYS: make the code in bzImage64_verify_sig generic
    - arm64: kexec_file: use more system keyrings to verify kernel image signature

  * ubuntu_kernel_selftests:net:vrf-xfrm-tests.sh: 8 failed test cases on
    jammy/fips (LP: #2019880)
    - selftests: net: vrf-xfrm-tests: change authentication and encryption algos

  * ubuntu_kernel_selftests:net:tls: 88 failed test cases on jammy/fips
    (LP: #2019868)
    - selftests/harness: allow tests to be skipped during setup
    - selftests: net: tls: check if FIPS mode is enabled

  * A general-proteciton exception during guest migration to unsupported PKRU
    machine (LP: 2032164, reverted)
    - x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
    - KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES

  * CVE-2023-4569
    - netfilter: nf_tables: deactivate catchall elements in next generation

  * CVE-2023-20569
    - x86/cpu, kvm: Add support for CPUID_80000021_EAX
    - x86/srso: Add a Speculative RAS Overflow mitigation
    - x86/srso: Add IBPB_BRTYPE support
    - x86/srso: Add SRSO_NO support
    - x86/srso: Add IBPB
    - x86/srso: Add IBPB on VMEXIT
    - x86/srso: Fix return thunks in generated code
    - x86/srso: Tie SBPB bit setting to microcode patch detection
    - x86: fix backwards merge of GDS/SRSO bit
    - x86/srso: Fix build breakage with the LLVM linker
    - x86/cpu: Fix __x86_return_thunk symbol type
    - x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()
    - x86/alternative: Make custom return thunk unconditional
    - objtool: Add frame-pointer-specific function ignore
    - x86/ibt: Add ANNOTATE_NOENDBR
    - x86/cpu: Clean up SRSO return thunk mess
    - x86/cpu: Rename original retbleed methods
    - x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
    - x86/cpu: Cleanup the untrain mess
    - x86/srso: Explain the untraining sequences a bit more
    - x86/static_call:...

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/5.15.0-1050.57 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-azure' to 'verification-done-jammy-linux-azure'. If the problem still exists, change the tag 'verification-needed-jammy-linux-azure' to 'verification-failed-jammy-linux-azure'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-azure-v2 verification-needed-jammy-linux-azure
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.15.0-1048.53 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-aws' to 'verification-done-jammy-linux-aws'. If the problem still exists, change the tag 'verification-needed-jammy-linux-aws' to 'verification-failed-jammy-linux-aws'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-aws-v2 verification-needed-jammy-linux-aws
Chengen Du (chengendu)
tags: added: verification-done-jammy-linux-aws verification-done-jammy-linux-azure
removed: verification-needed-jammy-linux-aws verification-needed-jammy-linux-azure
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-tegra/5.15.0-1018.18 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-nvidia-tegra' to 'verification-done-jammy-linux-nvidia-tegra'. If the problem still exists, change the tag 'verification-needed-jammy-linux-nvidia-tegra' to 'verification-failed-jammy-linux-nvidia-tegra'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia-tegra-v2 verification-needed-jammy-linux-nvidia-tegra
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-raspi/5.15.0-1040.43 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-raspi' to 'verification-done-jammy-linux-raspi'. If the problem still exists, change the tag 'verification-needed-jammy-linux-raspi' to 'verification-failed-jammy-linux-raspi'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-raspi-v2 verification-needed-jammy-linux-raspi
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-bluefield/5.15.0-1027.29 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-done-jammy-linux-bluefield'. If the problem still exists, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-failed-jammy-linux-bluefield'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-bluefield-v2 verification-needed-jammy-linux-bluefield
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-intel-iotg-5.15/5.15.0-1043.49~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-intel-iotg-5.15' to 'verification-done-focal-linux-intel-iotg-5.15'. If the problem still exists, change the tag 'verification-needed-focal-linux-intel-iotg-5.15' to 'verification-failed-focal-linux-intel-iotg-5.15'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-intel-iotg-5.15-v2 verification-needed-focal-linux-intel-iotg-5.15
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-tegra-igx/5.15.0-1005.5 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-nvidia-tegra-igx' to 'verification-done-jammy-linux-nvidia-tegra-igx'. If the problem still exists, change the tag 'verification-needed-jammy-linux-nvidia-tegra-igx' to 'verification-failed-jammy-linux-nvidia-tegra-igx'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia-tegra-igx-v2 verification-needed-jammy-linux-nvidia-tegra-igx
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-tegra-5.15/5.15.0-1018.18~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-nvidia-tegra-5.15' to 'verification-done-focal-linux-nvidia-tegra-5.15'. If the problem still exists, change the tag 'verification-needed-focal-linux-nvidia-tegra-5.15' to 'verification-failed-focal-linux-nvidia-tegra-5.15'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-nvidia-tegra-5.15-v2 verification-needed-focal-linux-nvidia-tegra-5.15
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp-tcpx/5.15.0-1002.2 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-gcp-tcpx' to 'verification-done-focal-linux-gcp-tcpx'. If the problem still exists, change the tag 'verification-needed-focal-linux-gcp-tcpx' to 'verification-failed-focal-linux-gcp-tcpx'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-gcp-tcpx-v2 verification-needed-focal-linux-gcp-tcpx
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-xilinx-zynqmp/5.15.0-1025.29 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-xilinx-zynqmp' to 'verification-done-jammy-linux-xilinx-zynqmp'. If the problem still exists, change the tag 'verification-needed-jammy-linux-xilinx-zynqmp' to 'verification-failed-jammy-linux-xilinx-zynqmp'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-xilinx-zynqmp-v2 verification-needed-jammy-linux-xilinx-zynqmp
Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

I'm not sure if this has been really fixed in Jammy default kernel. I've hit specifically this issue during this week while running 5.15.0-89. Upgrading to HWE kernel or 5.19.0-50 does fix live migrations.

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote :

Folks, is there any way to revive the patch and see actual source code of the change proposed here?
As I assume the bug was closed due to inactivity, but I'm quite ready to test it out.

Revision history for this message
Dmitriy Rabotyagov (noonedeadpunk) wrote (last edit ):

Eventually, I see failures even with kernel 5.15.0-84-generic which means there's smth else involved then commits in question, as they shouldn't be there yet.

Revision history for this message
Chengen Du (chengendu) wrote :

This bug only involves a revert to LP#2032164. We are actively working on finding a comprehensive solution to address the live migration issue.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-mtk/5.15.0-1030.34 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-done-jammy-linux-mtk'. If the problem still exists, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-failed-jammy-linux-mtk'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-mtk-v2 verification-needed-jammy-linux-mtk
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.