Windows 2008R2 very slow cold boot when >4GB memory

Bug #992067 reported by Matthew Anderson on 2012-04-30
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Undecided
Unassigned

Bug Description

I've been having a consistent problem booting 2008R2 guests with 4096MB of RAM or greater. On the initial boot the KVM process starts out with a ~200MB memory allocation and will use 100% of all CPU allocated to it. The RES memory of the KVM process slowly rises by around 200mb every few minutes until it reaches it's memory allocation (several hours in some cases). Whilst this is happening the guest will usually blue screen with the message of -

A clock interrupt was not received on a secondary processor within the allocated time interval

If I let the KVM process continue to run it will eventually allocate the required memory the guest will run at full speed, usually restarting after the blue screen and booting into startup repair. From here you can restart it and it will boot perfectly. Once booted the guest has no performance issues at all.

I've tried everything I could think of. Removing PAE, playing with huge pages, different kernels, different userspaces, different systems, different backing file systems, different processor feature set, with or without Virtio etc. My best theory is that the problem is caused by Windows 2008 zeroing out all the memory on boot and something is causing this to be held up or slowed to a crawl. The hosts always have memory free to boot the guest and are not using swap at all.

Nothing so far has solved the issue. A few observations I've made about the issue are -
Large memory 2008R2 guests seem to boot fine (or with a small delay) when they are the first to boot on the host after a reboot
Sometimes dropping the disk cache (echo 1 > /proc/sys/vm/drop_caches) will cause them to boot faster

The hosts I've tried are -
All Nehalem based (5540, 5620 and 5660)
Host ram of 48GB, 96GB and 192GB
Storage on NFS, Gluster and local (ext4, xfs and zfs)
QED, QCOW and RAW formats
Scientific Linux 6.1 with the standard kernel 2.6.32, 2.6.38 and 3.3.1
KVM userspaces 0.12, 0.14 and (currently) 0.15.1

This should be resolved by using Hyper-V relaxed timers which is in the latest development version of QEMU. You would need to add -cpu host,+hv_relaxed to the command line to verify this.

Changed in qemu:
status: New → Fix Committed
Matthew Anderson (matthewa) wrote :

Thanks for the quick reply,

I pulled the latest version from Git and on first attempt it said the hv_relaxed feature was not present. I checked the source and the 'hv_relaxed' feature was not included in a 'feature_name' array so the flag was being discarded before it could be enabled.

Once added in to the 'feature_name' array it was enabled but the VM crashes on boot with a blue screen and the error message "Phase0_exception" followed by a reboot.

Thomas Huth (th-huth) wrote :

Triaging old bug tickets... QEMU 0.12/0.14/0.15 is pretty outdated nowadays. Can you still reproduce this behavior with the latest version of QEMU? If not, I think we should close this bug...

Thomas Huth (th-huth) on 2017-07-21
Changed in qemu:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers