SMT not supported by QEMU on AMD Ryzen CPU

Bug #1703506 reported by A S
46
This bug affects 8 people
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

HyperThreading/SMT is supported by AMD Ryzen CPUs but results in this message when setting the topology to threads=2:

qemu-system-x86_64: AMD CPU doesn't support hyperthreading. Please configure -smp options properly.

Checking in a Windows 10 guest reveals that SMT is not enabled, and from what I understand, QEMU converts the topology from threads to cores internally on AMD CPUs. This appears to cause performance problems in the guest perhaps because programs are assuming that these threads are actual cores.

Software: Linux 4.12, qemu 2.9.0 host with KVM enabled, Windows 10 pro guest

Revision history for this message
Databay (rs-databay) wrote :

I can confirm this problem as it affects me, too on Ubuntu XENIAL, Kernel 4.10.0-26-generic

Revision history for this message
Eduardo Habkost (ehabkost) wrote :

The warning doesn't make QEMU disable anything, it just warns the user that guests are likely to ignore the HT info on CPUID if CPU vendor is AMD.

Please confirm what's the QEMU command-line being used (especially the -smp and -cpu options), and check if the bug persists if using "-cpu host".

To help find out what's wrong, I'd like to see /proc/cpuinfo, "lscpu -e" output and "x86info -v -a" output from both the host system and the guest system.

Changed in qemu:
assignee: nobody → Eduardo Habkost (ehabkost)
Revision history for this message
A S (scix) wrote :

>Please confirm what's the QEMU command-line being used (especially the -smp and -cpu options), and check if the bug persists if using "-cpu host".

I'm using -cpu host already, here's just the cpu and smp commands:

-cpu host,hv_vendor_id=whatever,kvm=off,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,smep=off
-smp 12,sockets=1,cores=6,threads=2

The extra commands are just for VGA passthrough, but the problem still occurs with just -cpu host (plus smep=off, problems with booting with it enabled) and the above smp setting.

I've attached host output; I'm using a Windows guest and running msinfo32 indicates:

AMD Ryzen 1600 Six-Core Processor, 3693 Mhz, 12 Core(s), 12 Logical Processors(s)

Suggesting that the guest is seeing the host as 12 cores, 1 thread each, rather than 6 cores, 2 threads each.

If Linux guest information would be more helpful, I'll set up a Linux guest as well.

Revision history for this message
Imatimba (imatimba) wrote :

I can confirm the same behavior with a Ryzen 7 1700.
Host Arch Linux x64 Kernel 4.11.9, Guest Windows 10 Pro.
Running with -cpu host and -smp 8,sockets=1,cores=4,threads=2.
Attached the logs of the host and results of the output of msinfo32 and "WMIC CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List" in the guest.
Same results as scix, in my case 8 Core(s), 8 Logical Processors(s).

This seems relevant: https://bugzilla.redhat.com/show_bug.cgi?id=1135772
And a few extra reports on reddit:
https://www.reddit.com/r/VFIO/comments/6nuhb5/big_problem_with_my_ryzen_1700x/
https://www.reddit.com/r/VFIO/comments/6m6kry/smthyperthreading_support_with_ryzen_cpu_and/

I'll test with a Linux guest later.

Revision history for this message
Eduardo Habkost (ehabkost) wrote :

Thank you for the info. Having info on Linux guests' behavior would be nice to have, but it's possible to extract the raw CPUID data seen by the Windows guest using an equivalent Windows tool (suggestions of tools are welcome).

Also, can somebody confirm if the same Windows version works as expected on bare metal?

Revision history for this message
Imatimba (imatimba) wrote :

Attached Ubuntu 17.04 guest logs.
I wasn't able to run x86info as root. Only as regular user.
Error shown:
readEntry: Operation not permitted
error reading 1KB from 0x3fffc00

There are a few bug reports about it but no workarounds. Seems to happen on vm's.
So the output is missing a few sections.

>Also, can somebody confirm if the same Windows version works as expected on bare metal?

Yes, same Windows version on bare metal works as expected. In my case showing 8 cores and 16 threads/logical processors.
I'm trying to use 4 cores 8 threads in the VMs. Both Windows and Ubuntu are showing 8 physical cores.

Revision history for this message
Imatimba (imatimba) wrote :

I tried disabling the CmpLegacy bit directly on /target/i386/cpu.c deleting the If statement on "case 0x80000001:" or changing "*ecx |= 1 << 1;" to "*ecx |= 0 << 1;"
But it didn't work, the VM still sees 8 physical cores.
I believe the HTT bit should be enabled by default
I tried changing it to "*edx |= 1 << 28;" in the If statement of "case 1:" just in case but it didn't matter.
Anything else that I could try to hard-code for testing?

Revision history for this message
Eduardo Habkost (ehabkost) wrote :
Download full text (4.8 KiB)

I am looking at the diff between the host and guest CPUID, and we have a few candidates: CPUID[4] is all zeroes on the host, and the host has CPUID leaves up to 0x8000001f available, including CPUID[0x8000001d] (which contains cache topology information). Probably we need to implement CPUID[0x8000001d], I will take a look at Linux code to find out if that's all we need.

Full CPUID diff below:

--- /tmp/host-x86info.txt 2017-07-25 15:01:26.753304233 -0300
+++ /tmp/guest-x86info.txt 2017-07-25 15:01:33.563335744 -0300
@@ -1,29 +1,29 @@
 eax in: 0x00000000, eax = 0000000d ebx = 68747541 ecx = 444d4163 edx = 69746e65
-eax in: 0x00000001, eax = 00800f11 ebx = 00100800 ecx = 7ed8320b edx = 178bfbff
-eax in: 0x00000002, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x00000001, eax = 00800f11 ebx = 00080800 ecx = ffd83203 edx = 178bfbff
+eax in: 0x00000002, eax = 00000001 ebx = 00000000 ecx = 0000004d edx = 002c307d
 eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x00000004, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x00000005, eax = 00000040 ebx = 00000040 ecx = 00000003 edx = 00000000
-eax in: 0x00000006, eax = 00000004 ebx = 00000000 ecx = 00000001 edx = 00000000
-eax in: 0x00000007, eax = 00000000 ebx = 209c01a9 ecx = 00000000 edx = 00000000
+eax in: 0x00000004, eax = 0c000121 ebx = 01c0003f ecx = 0000003f edx = 00000001
+eax in: 0x00000005, eax = 00000000 ebx = 00000000 ecx = 00000003 edx = 00000000
+eax in: 0x00000006, eax = 00000004 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x00000007, eax = 00000000 ebx = 209c01ab ecx = 00000000 edx = 00000000
 eax in: 0x00000008, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x00000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x0000000a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x0000000b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x0000000b, eax = 00000001 ebx = 00000002 ecx = 00000100 edx = 00000000
 eax in: 0x0000000c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x0000000d, eax = 00000007 ebx = 00000340 ecx = 00000340 edx = 00000000

-eax in: 0x80000000, eax = 8000001f ebx = 68747541 ecx = 444d4163 edx = 69746e65
-eax in: 0x80000001, eax = 00800f11 ebx = 20000000 ecx = 35c233ff edx = 2fd3fbff
+eax in: 0x80000000, eax = 8000001a ebx = 68747541 ecx = 444d4163 edx = 69746e65
+eax in: 0x80000001, eax = 00800f11 ebx = 00000000 ecx = 000003f3 edx = 2fd3fbff
 eax in: 0x80000002, eax = 20444d41 ebx = 657a7952 ecx = 2037206e edx = 30303731
 eax in: 0x80000003, eax = 67694520 ebx = 432d7468 ecx = 2065726f edx = 636f7250
 eax in: 0x80000004, eax = 6f737365 ebx = 20202072 ecx = 20202020 edx = 00202020
-eax in: 0x80000005, eax = ff40ff40 ebx = ff40ff40 ecx = 20080140 edx = 40040140
-eax in: 0x80000006, eax = 26006400 ebx = 66006400 ecx = 02006140 edx = 00808140
-eax in: 0x80000007, eax = 00000000 ebx = 0000001b ecx = 00000000 edx = 00006599
-eax in: 0x80000008, eax = 00003030 ebx = 00000007 ecx = 0000400f edx = 00000000
+eax in: 0x80000005, eax = 01ff01ff ebx = 01ff01ff ec...

Read more...

Revision history for this message
Eduardo Habkost (ehabkost) wrote :

The core topology info used by Linux (see linux/arch/x86/kernel/cpu/amd.c:amd_get_topology()) is actually at CPUID[0x8000001e].

AMD's documentation is a bit confusing, as the Architecture Programmer's Manual still refers to CPUID[0x8000001e].EBX[bits 7:0] as "compute unit ID", but the Processor Programming Reference for AMD Family 17h documents the same bits as "Core ID". We can implement CPUID[0x8000001e] and print the existing warning only if CPU vendor is AMD and cpuid_family != 0x17.

Revision history for this message
Babu Moger (babumoger) wrote :

Posted few patches to support this feature on AMD EPYC processors. Feel free to test and review.
1. Kernel kvm patch
   https://patchwork.kernel.org/patch/10190107/
2. qemu patches
   https://patchwork.kernel.org/project/qemu-devel/list/?submitter=178527
Thanks

Revision history for this message
Babu Moger (babumoger) wrote :

just to be clear.. The kernel kvm patch is rebased on linux-next. If you are on older kernel then try this kernel patch. https://patchwork.kernel.org/patch/10031775/ plus qemu patch.

Revision history for this message
pseudoterminal X0 (ptx0) wrote :

I tried the above patches on a TR 1900X and they do not help to enable SMT. I still receive the warning.

Also, with the patches applied, the CPU now identifies as EPYC in the guest.

Revision history for this message
asd fghjkl (ryzen27) wrote :

Error I see in terminal:
AMD CPU doesn't support hyperthreading. Please configure -smp options properly.

Error I see in my windows 10 vm:
SYSTEM THREAD EXCEPTION NOT HANDLED

I am unable to use Qemu at all. Serious problem.

CPU: AMD Ryzen 5 1600X Six-Core Processor × 6

Revision history for this message
Eduardo Habkost (ehabkost) wrote :

QEMU 3.0 has limited TOPOEXT support. You can try using `-cpu EPYC`, and the `threads` option is supposed to work.

Revision history for this message
asd fghjkl (ryzen27) wrote :

I got it to work:
sudo nano /etc/modprobe.d/kvm.conf
add "options kvm ignore_msrs=1" (without quotes)
reboot

Then changing "-machine q35" to "-machine pc" kept it from crashing randomly.

Revision history for this message
Eduardo Habkost (ehabkost) wrote :

@ryzen27: do you have dmesg logs showing the MSRs being written by the guest? You may be hitting the bug described at https://bugzilla.redhat.com/show_bug.cgi?id=1592276

Revision history for this message
Thomas Huth (th-huth) wrote :

The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.
If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience.

Changed in qemu:
status: New → Incomplete
Thomas Huth (th-huth)
Changed in qemu:
assignee: Eduardo Habkost (ehabkost) → nobody
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
Revision history for this message
Andrii (amnevar) wrote (last edit ):

Found the same problem using Gnome boxes, as I understand it uses QEMU.

Error I see in gnome boxes when I'm trying to install windows 10 vm:
SYSTEM THREAD EXCEPTION NOT HANDLED

Fresh install of Ubuntu 22.04
CPU: AMD Ryzen 7 1700

The solution posted by asd fghjkl (ryzen27) worked for me too:

sudo nano /etc/modprobe.d/kvm.conf
add "options kvm ignore_msrs=1" (without quotes)
reboot

Revision history for this message
Anthony Kamau (ak-launchpad) wrote (last edit ):

I was able to avoid rebooting after following @Andrii's instructions - https://bugs.launchpad.net/qemu/+bug/1703506/comments/19 - above:

systemctl stop libvirtd libvirtd-admin.socket libvirtd-ro.socket libvirtd.socket
sudo modprobe -r kvm_intel kvm
systemctl start libvirtd libvirtd-admin.socket libvirtd-ro.socket libvirtd.socket

These instructions to avoid rebooting might not work for those using a non-Intel CPU as you'll have a different kernel module. You can check by running `lsmod | grep kvm`.

Cheers,
ak.

System info:
# inxi -CMz
Machine:
  Type: Laptop System: Dell product: Precision M6700 v: 01 serial: <filter>
  Mobo: Dell model: 0JWMFY v: A00 serial: <filter> UEFI: Dell v: A20 date: 11/30/2018
CPU:
  Info: quad core model: Intel Core i7-3840QM bits: 64 type: MT MCP cache: L2: 1024 KiB
  Speed (MHz): avg: 3607 min/max: 1200/3800 cores: 1: 3588 2: 3615 3: 3638 4: 3588 5: 3588
    6: 3638 7: 3617 8: 3588

Revision history for this message
Nelo (nelo390) wrote :

This affected me. Took me several days.

The solution posted by asd fghjkl (ryzen27) worked for me as well:

sudo nano /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1
and then rebooted

I'm very glad i found this thread. Don't know where to report this or if it's even a bug, But hope it gets fixed!

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.