Emerald Rapids cannot be used as Sapphire Rapids on Ubuntu due to TSX features

Bug #2106791 reported by DUFOUR Olivier
Affects: libvirt (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

[Environment]
Tested platforms and libvirt package versions:
* Ubuntu Jammy 22.04 LTS: libvirt 8.0.0-1ubuntu7.10 and 10.0.0-2ubuntu8.5~cloud0 (Caracal UCA), with the HWE kernel (6.8)
* Ubuntu Noble 24.04 LTS: libvirt 10.0.0-2ubuntu8.6
* Ubuntu Oracular 24.10: libvirt 10.6.0-1ubuntu3.2

Hardware:
* HPE DL360 with Intel Xeon Gold 6542Y

[Issue]
The host CPU is recognised as Broadwell, so the features introduced with the Skylake, Cascadelake, Icelake and SapphireRapids models are missing.
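The model that libvirt picked for the host can be read from the output of `virsh capabilities`. The helper below is a minimal sketch: `host_cpu_model` is a hypothetical name, and it assumes the usual layout in which the host `<cpu>` element contains a `<model>` child on its own line.

```shell
#!/bin/sh
# Sketch only: host_cpu_model is an illustrative helper, not part of libvirt.
# It reads `virsh capabilities` XML on stdin and prints the first <model>
# found inside the host <cpu> element (assumed to be one element per line).
host_cpu_model() {
    sed -n '/<cpu>/,/<\/cpu>/ s:.*<model[^>]*>\(.*\)</model>.*:\1:p' | head -n 1
}

# Example invocation on a libvirt host:
#   virsh capabilities | host_cpu_model
```

On an affected host this would be expected to print a Broadwell model name rather than a SapphireRapids one.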

[Impact]
This affects any OpenStack deployment that uses Nova on a recent Intel CPU such as Sapphire Rapids, Emerald Rapids or Granite Rapids. It prevents users from using any instruction introduced after Broadwell.

[Root cause]
For the SapphireRapids profile:
1. On libvirt 8.0.0-1ubuntu7.10 (Jammy), the host does not match x86_SapphireRapids-noTSX.xml because the feature "taa-no" is missing. Like the "hle" and "rtm" features, it is missing because TSX is turned off in the kernel.

2. From 10.0.0-2ubuntu8.6 (Noble) onwards, and even upstream, no "-noTSX.xml" variant of the profile is available.
   With a copy of x86_SapphireRapids.xml that omits hle, rtm and taa-no, libvirt recognises the CPU without issue.

3. TSX is disabled by default on Ubuntu kernels. Booting with "tsx=on" or "tsx=auto" on the kernel command line allows libvirt to recognise the hle, rtm and taa-no features.
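The TSX state described above can be checked on a host with a few small helpers. This is a sketch under assumptions: the function names are hypothetical, and the default paths (/proc/cmdline, /proc/cpuinfo, /boot/config-$(uname -r)) are the usual locations on Ubuntu. Each function accepts an alternative file as its first argument.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not from any package.

# Print the value of the tsx= boot parameter, or "unset" if it is absent.
tsx_cmdline_mode() {
    cmdline_file="${1:-/proc/cmdline}"
    mode=$(tr ' ' '\n' < "$cmdline_file" | sed -n 's/^tsx=//p' | tail -n 1)
    echo "${mode:-unset}"
}

# Print the compiled-in default (on, off or auto) from the kernel config.
tsx_config_default() {
    config_file="${1:-/boot/config-$(uname -r)}"
    sed -n 's/^CONFIG_X86_INTEL_TSX_MODE_\([A-Z]*\)=y$/\1/p' "$config_file" |
        tr '[:upper:]' '[:lower:]'
}

# Print which of the TSX CPU flags (hle, rtm) the kernel exposes, one per line.
tsx_cpu_flags() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    grep -m 1 '^flags' "$cpuinfo_file" | tr ' \t' '\n\n' |
        grep -E '^(hle|rtm)$' | sort -u
}

# Example invocation on a host:
#   tsx_cmdline_mode; tsx_config_default; tsx_cpu_flags
```

If `tsx_cmdline_mode` prints "unset", `tsx_config_default` prints "off" and `tsx_cpu_flags` prints nothing, the host is in the state in which the SapphireRapids profile fails to match.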

[Potential improvements]
1. Decide what to do with TSX
 a. It is currently disabled by default in Ubuntu's kernels and not even set to auto.
 This can be checked quickly in the Ubuntu kernel config, as below:
  $ grep TSX /boot/config-6.8.0-57-generic
    CONFIG_X86_INTEL_TSX_MODE_OFF=y
    # CONFIG_X86_INTEL_TSX_MODE_ON is not set
    # CONFIG_X86_INTEL_TSX_MODE_AUTO is not set

 b. Ubuntu's libvirt packages from Noble (10.0.0-2ubuntu8.6) onwards, and even upstream, do not include any noTSX profile for the Rapids CPUs.
 --> This means that even if the current cpu_maps are pulled from upstream, Sapphire/Emerald/Granite Rapids CPUs will not be recognised properly by libvirt on Ubuntu, now or in future releases, as things stand.

2. Add dedicated noTSX profiles for Sapphire Rapids and newer, both in the Ubuntu packages and upstream
 a. If noTSX profiles are created for Sapphire Rapids and newer, the feature "taa-no" should be removed as well, since it is not recognised with tsx=off on Ubuntu's kernels.
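As an illustration of what such a profile could look like, the sketch below derives a noTSX variant from an existing profile by dropping the hle, rtm and taa-no features and renaming the model. `make_notsx_profile` is a hypothetical helper, and it assumes the profile lists one `<feature name='...'/>` element per line. It does not cover registering the new file in the cpu_map index or in the packaging, which a real profile would presumably also need.

```shell
#!/bin/sh
# Sketch only: make_notsx_profile is an illustrative helper, not a libvirt tool.
# Usage: make_notsx_profile <profile.xml> <model-name>
# Writes the derived profile to stdout; the input file is left untouched.
make_notsx_profile() {
    profile="$1"
    model="$2"
    sed -E \
        -e "/<feature name=['\"](hle|rtm|taa-no)['\"][[:space:]]*\/>/d" \
        -e "s/(<model name=['\"])${model}(['\"])/\1${model}-noTSX\2/" \
        "$profile"
}

# Example invocation (the cpu_map directory is an assumption about the
# installed layout):
#   make_notsx_profile /usr/share/libvirt/cpu_map/x86_SapphireRapids.xml \
#       SapphireRapids > x86_SapphireRapids-noTSX.xml
```

Comparing the result with the installed x86_SapphireRapids.xml using `diff` is a quick way to confirm that only the three features and the model name changed.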

description: updated
summary: - Support for Emerald Rapids is missing, related to TSX
+ Emerald Rapids cannot be recognised as Sapphire Rapids due to TSX features
summary: - Emerald Rapids cannot be recognised as Sapphire Rapids due to TSX features
+ Emerald Rapids cannot be used as Sapphire Rapids on Ubuntu due to TSX features
description: updated
tags: added: server-triage-discuss
Christian Ehrhardt (paelzer) wrote :

Hey Hector, I subscribed you as this is all related to the topic you are already working on.

John Chittum (jchittum)
tags: removed: server-triage-discuss
Hector CAO (hectorcao) wrote :

We need to understand why SapphireRapids-noTSX (which we backported to Jammy) does not work on Jammy. It is probably because of the kernel being used (HWE 6.8) instead of the generic Jammy 5.15 kernel.

description: updated
Hector CAO (hectorcao)
description: updated