SMT not supported by QEMU on AMD Ryzen CPU

Bug #1703506 reported by A S on 2017-07-11
This bug affects 7 people
Affects: QEMU
Importance: Undecided
Assigned to: Eduardo Habkost

Bug Description

AMD Ryzen CPUs support SMT (hyperthreading), but setting the topology to threads=2 produces this message:

qemu-system-x86_64: AMD CPU doesn't support hyperthreading. Please configure -smp options properly.

Checking in a Windows 10 guest reveals that SMT is not enabled, and from what I understand, QEMU converts the topology from threads to cores internally on AMD CPUs. This appears to cause performance problems in the guest, perhaps because programs assume that these threads are actual cores.

Software: Linux 4.12, qemu 2.9.0 host with KVM enabled, Windows 10 pro guest

Databay (rs-databay) wrote :

I can confirm this problem; it affects me too on Ubuntu Xenial, kernel 4.10.0-26-generic.

Eduardo Habkost (ehabkost) wrote :

The warning doesn't make QEMU disable anything, it just warns the user that guests are likely to ignore the HT info on CPUID if CPU vendor is AMD.

Please confirm what's the QEMU command-line being used (especially the -smp and -cpu options), and check if the bug persists if using "-cpu host".

To help find out what's wrong, I'd like to see /proc/cpuinfo, "lscpu -e" output and "x86info -v -a" output from both the host system and the guest system.

Changed in qemu:
assignee: nobody → Eduardo Habkost (ehabkost)

A S (scix) wrote :

>Please confirm what's the QEMU command-line being used (especially the -smp and -cpu options), and check if the bug persists if using "-cpu host".

I'm using -cpu host already; here are just the -cpu and -smp options:

-cpu host,hv_vendor_id=whatever,kvm=off,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,smep=off
-smp 12,sockets=1,cores=6,threads=2

The extra options are just for VGA passthrough, but the problem still occurs with just -cpu host (plus smep=off, because the guest has problems booting with it enabled) and the above -smp setting.

I've attached host output; I'm using a Windows guest and running msinfo32 indicates:

AMD Ryzen 1600 Six-Core Processor, 3693 Mhz, 12 Core(s), 12 Logical Processors(s)

This suggests that the guest sees the CPU as 12 cores with 1 thread each, rather than 6 cores with 2 threads each.

If Linux guest information would be more helpful, I'll set up a Linux guest as well.
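
The mismatch described above can be checked with a little arithmetic. The following is a minimal sketch, not QEMU code: the helper names `parse_smp` and `expected_guest_view` are made up for illustration, and it assumes the guest should report sockets × cores physical cores and sockets × cores × threads logical processors.

```python
def parse_smp(arg):
    """Parse a QEMU-style -smp value such as '12,sockets=1,cores=6,threads=2'."""
    topology = {}
    for field in arg.split(","):
        if "=" in field:
            key, value = field.split("=", 1)
        else:
            # A bare leading number is the total vCPU count.
            key, value = "cpus", field
        topology[key] = int(value)
    return topology


def expected_guest_view(topology):
    """Return (physical cores, logical processors) implied by the topology."""
    cores = topology["sockets"] * topology["cores"]
    logical = cores * topology["threads"]
    return cores, logical


smp = parse_smp("12,sockets=1,cores=6,threads=2")
cores, logical = expected_guest_view(smp)
print(f"expected in guest: {cores} cores, {logical} logical processors")
print("reported by msinfo32: 12 cores, 12 logical processors")
```

With the -smp line from this comment, the sketch expects 6 cores and 12 logical processors, whereas msinfo32 reports 12 and 12.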

Imatimba (imatimba) wrote :

I can confirm the same behavior with a Ryzen 7 1700.
Host Arch Linux x64 Kernel 4.11.9, Guest Windows 10 Pro.
Running with -cpu host and -smp 8,sockets=1,cores=4,threads=2.
Attached the logs of the host and results of the output of msinfo32 and "WMIC CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List" in the guest.
Same results as scix, in my case 8 Core(s), 8 Logical Processors(s).

This seems relevant: https://bugzilla.redhat.com/show_bug.cgi?id=1135772
And a few extra reports on reddit:
https://www.reddit.com/r/VFIO/comments/6nuhb5/big_problem_with_my_ryzen_1700x/
https://www.reddit.com/r/VFIO/comments/6m6kry/smthyperthreading_support_with_ryzen_cpu_and/

I'll test with a Linux guest later.

Eduardo Habkost (ehabkost) wrote :

Thank you for the info. Having info on Linux guests' behavior would be nice to have, but it's possible to extract the raw CPUID data seen by the Windows guest using an equivalent Windows tool (suggestions of tools are welcome).

Also, can somebody confirm if the same Windows version works as expected on bare metal?

Imatimba (imatimba) wrote :

Attached Ubuntu 17.04 guest logs.
I wasn't able to run x86info as root. Only as regular user.
Error shown:
readEntry: Operation not permitted
error reading 1KB from 0x3fffc00

There are a few bug reports about it but no workarounds. It seems to happen on VMs.
So the output is missing a few sections.

>Also, can somebody confirm if the same Windows version works as expected on bare metal?

Yes, same Windows version on bare metal works as expected. In my case showing 8 cores and 16 threads/logical processors.
I'm trying to use 4 cores 8 threads in the VMs. Both Windows and Ubuntu are showing 8 physical cores.

Imatimba (imatimba) wrote :

I tried disabling the CmpLegacy bit directly in target/i386/cpu.c, either by deleting the if statement under "case 0x80000001:" or by changing "*ecx |= 1 << 1;" to "*ecx |= 0 << 1;".
But it didn't work; the VM still sees 8 physical cores.
I believe the HTT bit should be enabled by default.
I tried changing it to "*edx |= 1 << 28;" in the if statement of "case 1:" just in case, but it didn't matter.
Anything else that I could try to hard-code for testing?
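
A note on the bit arithmetic in the experiment above, as a minimal Python sketch rather than a QEMU patch. The register values are the guest values from the CPUID diff posted later in this thread; the constant names are made up for illustration. ORing with `0 << 1` leaves a register unchanged, so it stops that statement from setting the bit but cannot clear a bit that was already set. Whether CmpLegacy is already set before that statement runs in QEMU is an assumption here, not something this thread confirms.

```python
CMP_LEGACY = 1 << 1   # CPUID[0x80000001].ECX bit 1
HTT = 1 << 28         # CPUID[1].EDX bit 28

guest_ext1_ecx = 0x000003f3    # guest CPUID[0x80000001].ECX from the diff
guest_leaf1_edx = 0x178bfbff   # guest CPUID[1].EDX from the diff

# "|= 0 << 1" is a no-op: the value is unchanged and bit 1 stays set.
after_or_zero = guest_ext1_ecx | (0 << 1)

# Clearing a bit needs AND with the inverted mask.
after_mask = guest_ext1_ecx & ~CMP_LEGACY

# HTT is already set in the guest value, so ORing it in changes nothing.
after_htt = guest_leaf1_edx | HTT

print(f"or-zero: {after_or_zero:#010x}")
print(f"masked:  {after_mask:#010x}")
print(f"HTT already set: {bool(guest_leaf1_edx & HTT)}")
```

This is consistent with the report that neither change made a visible difference, although it does not by itself explain why the guest counts 8 cores.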

Eduardo Habkost (ehabkost) wrote :

I am looking at the diff between the host and guest CPUID, and we have a few candidates: CPUID[4] is all zeroes on the host, and the host has CPUID leaves up to 0x8000001f available, including CPUID[0x8000001d] (which contains cache topology information). Probably we need to implement CPUID[0x8000001d], I will take a look at Linux code to find out if that's all we need.

Full CPUID diff below:

--- /tmp/host-x86info.txt 2017-07-25 15:01:26.753304233 -0300
+++ /tmp/guest-x86info.txt 2017-07-25 15:01:33.563335744 -0300
@@ -1,29 +1,29 @@
 eax in: 0x00000000, eax = 0000000d ebx = 68747541 ecx = 444d4163 edx = 69746e65
-eax in: 0x00000001, eax = 00800f11 ebx = 00100800 ecx = 7ed8320b edx = 178bfbff
-eax in: 0x00000002, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x00000001, eax = 00800f11 ebx = 00080800 ecx = ffd83203 edx = 178bfbff
+eax in: 0x00000002, eax = 00000001 ebx = 00000000 ecx = 0000004d edx = 002c307d
 eax in: 0x00000003, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x00000004, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x00000005, eax = 00000040 ebx = 00000040 ecx = 00000003 edx = 00000000
-eax in: 0x00000006, eax = 00000004 ebx = 00000000 ecx = 00000001 edx = 00000000
-eax in: 0x00000007, eax = 00000000 ebx = 209c01a9 ecx = 00000000 edx = 00000000
+eax in: 0x00000004, eax = 0c000121 ebx = 01c0003f ecx = 0000003f edx = 00000001
+eax in: 0x00000005, eax = 00000000 ebx = 00000000 ecx = 00000003 edx = 00000000
+eax in: 0x00000006, eax = 00000004 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x00000007, eax = 00000000 ebx = 209c01ab ecx = 00000000 edx = 00000000
 eax in: 0x00000008, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x00000009, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x0000000a, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
-eax in: 0x0000000b, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
+eax in: 0x0000000b, eax = 00000001 ebx = 00000002 ecx = 00000100 edx = 00000000
 eax in: 0x0000000c, eax = 00000000 ebx = 00000000 ecx = 00000000 edx = 00000000
 eax in: 0x0000000d, eax = 00000007 ebx = 00000340 ecx = 00000340 edx = 00000000

-eax in: 0x80000000, eax = 8000001f ebx = 68747541 ecx = 444d4163 edx = 69746e65
-eax in: 0x80000001, eax = 00800f11 ebx = 20000000 ecx = 35c233ff edx = 2fd3fbff
+eax in: 0x80000000, eax = 8000001a ebx = 68747541 ecx = 444d4163 edx = 69746e65
+eax in: 0x80000001, eax = 00800f11 ebx = 00000000 ecx = 000003f3 edx = 2fd3fbff
 eax in: 0x80000002, eax = 20444d41 ebx = 657a7952 ecx = 2037206e edx = 30303731
 eax in: 0x80000003, eax = 67694520 ebx = 432d7468 ecx = 2065726f edx = 636f7250
 eax in: 0x80000004, eax = 6f737365 ebx = 20202072 ecx = 20202020 edx = 00202020
-eax in: 0x80000005, eax = ff40ff40 ebx = ff40ff40 ecx = 20080140 edx = 40040140
-eax in: 0x80000006, eax = 26006400 ebx = 66006400 ecx = 02006140 edx = 00808140
-eax in: 0x80000007, eax = 00000000 ebx = 0000001b ecx = 00000000 edx = 00006599
-eax in: 0x80000008, eax = 00003030 ebx = 00000007 ecx = 0000400f edx = 00000000
+eax in: 0x80000005, eax = 01ff01ff ebx = 01ff01ff ec...
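
A few fields from the diff above can be decoded by hand. The sketch below is illustrative only: the helper names are invented, the bit positions are the standard CPUID definitions as I understand them (CPUID[1].EBX bits 23:16 for the logical processor count, CPUID[0x80000001].ECX bit 22 for TOPOEXT), and the input values are copied from the diff.

```python
# Values copied from the host (-) and guest (+) lines of the diff.
host = {"leaf1_ebx": 0x00100800, "max_ext_leaf": 0x8000001f, "ext1_ecx": 0x35c233ff}
guest = {"leaf1_ebx": 0x00080800, "max_ext_leaf": 0x8000001a, "ext1_ecx": 0x000003f3}


def logical_processor_count(leaf1_ebx):
    """CPUID[1].EBX bits 23:16."""
    return (leaf1_ebx >> 16) & 0xff


def has_topoext(ext1_ecx):
    """CPUID[0x80000001].ECX bit 22 (TOPOEXT)."""
    return bool(ext1_ecx & (1 << 22))


def has_leaf(max_ext_leaf, leaf):
    """A leaf is available if it does not exceed CPUID[0x80000000].EAX."""
    return leaf <= max_ext_leaf


for name, regs in (("host", host), ("guest", guest)):
    print(
        name,
        "logical processors:", logical_processor_count(regs["leaf1_ebx"]),
        "TOPOEXT:", has_topoext(regs["ext1_ecx"]),
        "leaf 0x8000001e:", has_leaf(regs["max_ext_leaf"], 0x8000001e),
    )
```

Decoded this way, the host advertises 16 logical processors, TOPOEXT, and leaf 0x8000001e, while the guest advertises 8 logical processors and neither TOPOEXT nor that leaf.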


Eduardo Habkost (ehabkost) wrote :

The core topology info used by Linux (see linux/arch/x86/kernel/cpu/amd.c:amd_get_topology()) is actually at CPUID[0x8000001e].

AMD's documentation is a bit confusing, as the Architecture Programmer's Manual still refers to CPUID[0x8000001e].EBX[bits 7:0] as "compute unit ID", but the Processor Programming Reference for AMD Family 17h documents the same bits as "Core ID". We can implement CPUID[0x8000001e] and print the existing warning only if CPU vendor is AMD and cpuid_family != 0x17.
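
The proposal above can be sketched in Python for readers who want to follow the decoding. This is not the QEMU implementation: the function names are hypothetical, the `threads > 1` condition is an assumption about when the existing warning fires, and the ThreadsPerCore field (EBX bits 15:8, stored as the count minus one) is my reading of the Family 17h Processor Programming Reference. The CPUID[0x8000001e].EBX value at the end is a made-up example, since the diff in this thread is truncated before that leaf.

```python
import struct


def vendor_string(ebx, ecx, edx):
    """CPUID[0] vendor string: bytes of EBX, then EDX, then ECX."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def cpu_family(leaf1_eax):
    """Base family (bits 11:8) plus extended family (bits 27:20) when base is 0xf."""
    base = (leaf1_eax >> 8) & 0xf
    extended = (leaf1_eax >> 20) & 0xff
    return base + extended if base == 0xf else base


def should_warn(vendor, family, threads):
    """Warn only for AMD CPUs outside family 0x17 (assumed condition)."""
    return threads > 1 and vendor == "AuthenticAMD" and family != 0x17


def decode_topoext_ebx(ebx):
    """Return (core_id, threads_per_core) from CPUID[0x8000001e].EBX."""
    core_id = ebx & 0xff
    threads_per_core = ((ebx >> 8) & 0xff) + 1
    return core_id, threads_per_core


# Register values from the CPUID dump earlier in this thread.
vendor = vendor_string(0x68747541, 0x444d4163, 0x69746e65)
family = cpu_family(0x00800f11)
print(vendor, hex(family), should_warn(vendor, family, threads=2))

# Hypothetical EBX value for illustration only.
print(decode_topoext_ebx(0x00000100))
```

For the Ryzen values in this thread, the vendor decodes to AuthenticAMD and the family to 0x17, so under this rule the warning would be suppressed.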

Babu Moger (babumoger) wrote :

I posted a few patches to support this feature on AMD EPYC processors. Feel free to test and review.
1. Kernel kvm patch
   https://patchwork.kernel.org/patch/10190107/
2. qemu patches
   https://patchwork.kernel.org/project/qemu-devel/list/?submitter=178527
Thanks

Babu Moger (babumoger) wrote :

Just to be clear: the kernel KVM patch is rebased on linux-next. If you are on an older kernel, try this kernel patch instead: https://patchwork.kernel.org/patch/10031775/ plus the QEMU patches.

pseudoterminal X0 (ptx0) wrote :

I tried the above patches on a TR 1900X and they do not help to enable SMT. I still receive the warning.

Also, with the patches applied, the CPU now identifies as EPYC in the guest.

asd fghjkl (ryzen27) wrote :

Error I see in terminal:
AMD CPU doesn't support hyperthreading. Please configure -smp options properly.

Error I see in my windows 10 vm:
SYSTEM THREAD EXCEPTION NOT HANDLED

I am unable to use Qemu at all. Serious problem.

CPU: AMD Ryzen 5 1600X Six-Core Processor × 6

Eduardo Habkost (ehabkost) wrote :

QEMU 3.0 has limited TOPOEXT support. You can try using `-cpu EPYC`, and the `threads` option is supposed to work.

asd fghjkl (ryzen27) wrote :

I got it to work:
sudo nano /etc/modprobe.d/kvm.conf
add "options kvm ignore_msrs=1" (without quotes)
reboot

Then changing "-machine q35" to "-machine pc" kept it from crashing randomly.

Eduardo Habkost (ehabkost) wrote :

@ryzen27: do you have dmesg logs showing the MSRs being written by the guest? You may be hitting the bug described at https://bugzilla.redhat.com/show_bug.cgi?id=1592276
