Windows guest hangs after reboot from the guest OS

Bug #2064914 reported by Björn Hinz
This bug affects 2 people
Affects          Status      Importance  Assigned to            Milestone
qemu (Fedora)    Unknown     Unknown
qemu (Ubuntu)    Incomplete  Undecided   Unassigned
  Jammy          Incomplete  Undecided   Sergio Durigan Junior

Bug Description

[ Impact ]

Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The calibration of the Hyper-V reference time overflows
and fails; as a result the processors' clock sources are out of sync.

[ Test Plan ]

TBD.

[ Where problems could occur ]

TBD.

[ Original Description ]

Description:
Some versions of Windows hang on reboot if their TSC value is greater
than 2^54. The calibration of the Hyper-V reference time overflows
and fails; as a result the processors' clock sources are out of sync.
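
To give a feel for the 2^54 threshold, here is a rough back-of-the-envelope sketch. It assumes the guest TSC starts near 0 and ticks at a constant rate, and that it keeps counting across guest reboots (which is the behaviour this bug is about). The frequencies are illustrative assumptions, not values taken from this report.

```python
# Rough estimate of how long a guest must run before its TSC exceeds 2^54.
# Assumption: constant TSC frequency, counter starting near zero.

TSC_THRESHOLD = 2 ** 54  # threshold named in the bug description

SECONDS_PER_DAY = 86400


def seconds_to_threshold(tsc_hz: float, threshold: int = TSC_THRESHOLD) -> float:
    """Seconds of run time needed for the TSC to reach `threshold`."""
    return threshold / tsc_hz


def days_to_threshold(tsc_hz: float) -> float:
    """Same as seconds_to_threshold(), expressed in days."""
    return seconds_to_threshold(tsc_hz) / SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative frequencies only.
    for ghz in (2.0, 2.5, 3.0):
        print(f"{ghz:.1f} GHz: ~{days_to_threshold(ghz * 1e9):.1f} days")
```

At an assumed 3 GHz this comes to about 69.5 days, so a long-running production VM could plausibly cross the threshold between reboots.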

The issue is that the TSC _should_ be reset to 0 on CPU reset and
QEMU tries to do that. However, KVM special cases writing 0 to the
TSC and thinks that QEMU is trying to hot-plug a CPU, which is
correct the first time through but not later. Thwart this valiant
effort and reset the TSC to 1 instead, but only if the CPU has been
run once.

For this to work, env->tsc has to be moved to the part of CPUArchState
that is not zeroed at the beginning of x86_cpu_reset.
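
The reasoning above can be illustrated with a toy model. This is not QEMU or KVM code: `toy_kvm_set_tsc` and `reset_value` are hypothetical names, and the "write of 0 keeps the running value" behaviour is a deliberate simplification of the special case described in the text.

```python
# Toy model of the reset behaviour described above (NOT the real QEMU/KVM code).

def toy_kvm_set_tsc(current: int, requested: int) -> int:
    """Hypothetical stand-in for how a TSC write is handled.

    Assumed simplification: a write of 0 is interpreted as "CPU hot-plug,
    synchronize with the others", so the large running value survives.
    Any other value is taken literally.
    """
    if requested == 0:
        return current
    return requested


def reset_value(has_run: bool) -> int:
    """Value to write on CPU reset, per the description:
    0 the first time through, 1 once the CPU has been run."""
    return 1 if has_run else 0


if __name__ == "__main__":
    big = 2 ** 54 + 12345  # a guest TSC past the problematic threshold

    # Old behaviour: always write 0 on reset, so the TSC stays large.
    print("reset writing 0:", toy_kvm_set_tsc(big, 0))

    # Patched behaviour: write 1 once the CPU has run, so the TSC really restarts.
    print("reset writing 1:", toy_kvm_set_tsc(big, reset_value(has_run=True)))
```

The flag has to survive the reset for this to work, which is why the description says env->tsc must live outside the zeroed part of CPUArchState.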

Solution: [PATCH] target/i386: properly reset TSC on reset

I have already built and tested an Ubuntu PPA package. The patch fixes this issue.
Link to ppa: https://launchpad.net/~bhinz83/+archive/ubuntu/openstack-rds/+packages

It affects only the Jammy (22.04) package. The newest version there is qemu 1:6.2+dfsg-2ubuntu6.19.

Tags: jammy patch

Revision history for this message
Björn Hinz (bhinz83) wrote :
description: updated
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Patch imported from RHEL 8" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Björn Hinz (bhinz83) wrote :
Paride Legovini (paride)
Changed in qemu (Ubuntu Jammy):
status: New → Triaged
Changed in qemu (Ubuntu):
status: New → Incomplete
Paride Legovini (paride) wrote :

Hello and thanks for this bug report, for attaching the patch and for the PPA.

I know you wrote that this bug "affects only jammy 22.04 package" but let me ask explicitly: is this fixed in Ubuntu 24.04 LTS? This is important for us to know from a process point of view.

Also: landing the fix in Jammy will require verification from a user affected by the bug. This is likely you, given that we are unlikely to have the required Windows version at hand. The process consists of installing the affected package from the -proposed pocket and verifying that it works as expected (more on this process: [1]). Are you willing to help us with this verification?

Thanks!

[1] https://wiki.ubuntu.com/StableReleaseUpdates

Sergio Durigan Junior (sergiodj) wrote :

Thank you Björn for reporting the bug, and Paride for the initial triage.

As Paride said, we have to make sure this is not affecting other versions of QEMU as well. If the patch mentioned in the description is indeed the only one needed to fix the bug, then I think we're good:

$ git tag --contains 5286c3662294119dc2dd1e9296757337211451f6
v7.0.0
v7.0.0-rc2
v7.0.0-rc3
v7.0.0-rc4
v7.1.0
v7.1.0-rc0
v7.1.0-rc1
v7.1.0-rc2
v7.1.0-rc3
v7.1.0-rc4
v7.2.0
v7.2.0-rc0
v7.2.0-rc1
v7.2.0-rc2
v7.2.0-rc3
v7.2.0-rc4
v7.2.1
v7.2.2
v7.2.3
v7.2.4
v7.2.5
v7.2.6
v7.2.7
v7.2.8
v7.2.9
v8.0.0
v8.0.0-rc0
v8.0.0-rc1
v8.0.0-rc2
v8.0.0-rc3
v8.0.0-rc4
v8.0.1
v8.0.2
v8.0.3
v8.0.4
v8.0.5
v8.1.0
v8.1.0-rc0
v8.1.0-rc1
v8.1.0-rc2
v8.1.0-rc3
v8.1.0-rc4
v8.1.1
v8.1.2
v8.1.3
v8.1.4
v8.1.5
v8.2.0
v8.2.0-rc0
v8.2.0-rc1
v8.2.0-rc2
v8.2.0-rc3
v8.2.0-rc4
v8.2.1

We have QEMU 8.0.4 on Mantic and 8.2.2 on Noble.
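
As a quick illustration of the comparison above, the sketch below extracts the upstream version from a package version string and compares it with 7.0.0, the first release tag listed as containing the commit. The parsing is a simplified assumption (drop the epoch, take the leading dotted number) and is not a full Debian version parser. It also only tells you about the upstream base: a package built on an older upstream can still carry the fix as a distro patch.

```python
import re

# First upstream release tag shown above as containing the fix commit.
FIXED_IN = (7, 0, 0)


def upstream_version(pkg_version: str) -> tuple:
    """Extract the upstream version as a 3-tuple of ints.

    Simplified assumption: strip an optional "epoch:" prefix, then read
    the leading dotted number (e.g. "6.2" in "1:6.2+dfsg-2ubuntu6.19").
    """
    no_epoch = pkg_version.split(":", 1)[-1]
    match = re.match(r"\d+(?:\.\d+)*", no_epoch)
    if match is None:
        raise ValueError(f"cannot parse version: {pkg_version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad "6.2" to (6, 2, 0)
    return tuple(parts)


def upstream_has_fix(pkg_version: str) -> bool:
    """True if the upstream base is at or after the first fixed release."""
    return upstream_version(pkg_version) >= FIXED_IN


if __name__ == "__main__":
    for name, version in [
        ("Jammy", "1:6.2+dfsg-2ubuntu6.19"),
        ("Mantic", "8.0.4"),
        ("Noble", "8.2.2"),
    ]:
        print(f"{name}: {version} -> upstream has fix: {upstream_has_fix(version)}")
```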

@Björn, while it would be good to have reproduction steps for the bug, we can also rely on your help to verify the correctness of the fix (as Paride explained).

I'll work on preparing an upload for Jammy meanwhile.

Thanks.

Changed in qemu (Ubuntu Jammy):
assignee: nobody → Sergio Durigan Junior (sergiodj)
description: updated
Sergio Durigan Junior (sergiodj) wrote :

Hi again, Björn,

Could you please give the following PPA a try and tell me if the package works?

https://launchpad.net/~sergiodj/+archive/ubuntu/qemu

The QEMU version there is 1:6.2+dfsg-2ubuntu6.20~ppa1.

Thanks a lot.

Björn Hinz (bhinz83) wrote :

Hi,
thanks for your investigation.

@Sergio The RHEL bug description at https://bugzilla.redhat.com/show_bug.cgi?id=2074737 describes a test.
I also found the patch in QEMU: https://github.com/qemu/qemu/commit/5286c3662294119dc2dd1e9296757337211451f6
It seems this issue has been patched since QEMU 7.0.0.

@Paride I saw this issue in our production environment with "Windows Server 2012 R2", "Windows Server 2019" and "Windows Server 2022", all with all available updates installed. We have had this problem for the last few years but only found this patch last month.

@Sergio I will try and test your ppa package.

Mauricio Faria de Oliveira (mfo) wrote :

Hi Sergio,

Thanks for the upload!

The bug and fix sound good and relatively simple to verify.

The SRU bug template is still missing Test Plan / Regression Potential, so I'll mark it as Incomplete for the time being (I see it may be coming soon, per "TBD").

I looked at the test case in the RH BZ that Björn mentioned. It uses the `rdtsc.flat` kernel, which I couldn't find anywhere else. Do you know about it?
It's apparently an `rdtsc`/print loop that we could replicate, if needed.

If I may, I'd suggest testing this in three ways:

1) unit test, with such rdtsc/print loop (and confirm the tsc value decreases after system_reset)
2) functional test, booting Windows (e.g., downloaded from MSFT Evaluation Center) and changing TSC manually to a problematic value (> 2^54) before reset, with the QEMU monitor or GDB, if possible?
3) regression test, booting Ubuntu kernel/initrd pairs (installer's should be enough) from supported releases, and checking they boot/reach a prompt.
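
For the first check, a small helper could evaluate the output of such an rdtsc/print loop. This is only a sketch: it assumes a hypothetical log format of one decimal TSC value per line, and it treats any decrease between consecutive samples as a reset.

```python
# Sketch of a checker for rdtsc/print loop output captured across a system_reset.
# Assumed (hypothetical) log format: one decimal TSC value per line.

TSC_LIMIT = 2 ** 54  # problematic threshold from the bug description


def parse_samples(text: str) -> list:
    """Parse one integer per non-empty line."""
    return [int(line) for line in text.splitlines() if line.strip()]


def find_resets(samples: list) -> list:
    """Indices where a sample is lower than the one before it."""
    return [i for i in range(1, len(samples)) if samples[i] < samples[i - 1]]


def reset_looks_fixed(samples: list, limit: int = TSC_LIMIT) -> bool:
    """True if at least one decrease was seen and every post-reset
    sample at a decrease point is below the problematic limit."""
    resets = find_resets(samples)
    if not resets:
        return False  # TSC never went backwards: reset did not take effect
    return all(samples[i] < limit for i in resets)


if __name__ == "__main__":
    # Made-up sample logs, for illustration only.
    fixed_log = f"{2**54 + 100}\n{2**54 + 900}\n3\n4000\n"
    buggy_log = f"{2**54 + 100}\n{2**54 + 900}\n{2**54 + 5000}\n"
    print("fixed:", reset_looks_fixed(parse_samples(fixed_log)))
    print("buggy:", reset_looks_fixed(parse_samples(buggy_log)))
```

With an unpatched QEMU the values would be expected to keep increasing across system_reset, so no decrease is found and the check reports failure.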

I realize that looks like too much for a simple fix, but this is QEMU on amd64.
I'd be quite willing to help with that if needed. :)

Thanks again!

Changed in qemu (Ubuntu Jammy):
status: Triaged → Incomplete