[Lenovo ThinkPad W520] KVM Guests only use one CPU after host wakes up from sleep

Bug #1042612 reported by Michael Cook
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Incomplete
Importance: Medium
Assigned to: Unassigned

Bug Description

ThinkPad W520 running U12.04 with the latest updates applied.
VT-d enabled in BIOS
NVIDIA driver disabled (because it simply doesn't boot with VT-d enabled)
KVM Guests of Windows7, RHEL5.3

i) Start a KVM guest. All host CPUs are timesliced, and the guest is fast and responsive.
ii) Shut down the KVM guest.
iii) Put the host laptop into sleep mode.
iv) Wake up the host and restart the KVM guest.
v) The KVM guest's CPU graph sits at 50% and never fluctuates, and the guest behaves as if it were single-core. All host CPUs are idle except for one at 100%.

This is quite reproducible. Performance and KVM guest behaviour are identical if you disable VT-d in the BIOS (but leave VT enabled).
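
For anyone trying to confirm the symptom from the host side, here is a minimal Python sketch that reports which CPUs each thread of a guest's qemu/kvm process is allowed to run on. It reads the `Cpus_allowed_list` field from `/proc/<pid>/task/<tid>/status`. The function names are made up for illustration, and the idea that a narrowed CPU mask is behind the single-core behaviour is only a hypothesis at this point in the report.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,5' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_allowed_from_status(status_text):
    """Extract Cpus_allowed_list from the text of a /proc status file."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    return set()


def guest_thread_cpus(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to the set of CPUs it may run on."""
    result = {}
    task_dir = os.path.join(proc_root, str(pid), "task")
    for tid in os.listdir(task_dir):
        with open(os.path.join(task_dir, tid, "status")) as handle:
            result[int(tid)] = cpus_allowed_from_status(handle.read())
    return result
```

Calling `guest_thread_cpus(pid)` with the guest's process ID once after a clean boot and once after a suspend/resume cycle, then comparing the two results, would show whether the allowed CPU set has shrunk.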
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: kvm 2293 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xf3b20000 irq 54'
   Mixer name : 'Conexant CX20590'
   Components : 'HDA:14f1506e,17aa21cf,00100000 HDA:14f12c06,17aa2122,00100000'
   Controls : 20
   Simple ctrls : 10
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw unknown'
   Mixer name : 'ThinkPad EC (unknown)'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=ef35a2f8-8839-44d5-ac77-fcc3fd10dfa2
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: LENOVO 427638U
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_CA:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-29-generic root=UUID=6ddbc439-93f6-4c59-a307-902d009a430d ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-29-generic N/A
 linux-backports-modules-3.2.0-29-generic N/A
 linux-firmware 1.79
StagingDrivers: mei
Tags: precise running-unity staging
Uname: Linux 3.2.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip libvirtd lpadmin plugdev sambashare sudo
dmi.bios.date: 03/02/2011
dmi.bios.vendor: LENOVO
dmi.bios.version: 8BET30WW (1.06 )
dmi.board.asset.tag: Not Available
dmi.board.name: 427638U
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr8BET30WW(1.06):bd03/02/2011:svnLENOVO:pn427638U:pvrThinkPadW520:rvnLENOVO:rn427638U:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 427638U
dmi.product.version: ThinkPad W520
dmi.sys.vendor: LENOVO

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1042612

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Michael Cook (michaelcook-mjc) wrote : apport information (one comment per file)

AcpiTables.txt
AlsaDevices.txt
BootDmesg.txt
CRDA.txt
Card0.Amixer.values.txt
Card0.Codecs.codec.0.txt
Card0.Codecs.codec.1.txt
CurrentDmesg.txt
IwConfig.txt
Lspci.txt
Lsusb.txt
PciMultimedia.txt
ProcCpuinfo.txt
ProcInterrupts.txt
ProcModules.txt
PulseList.txt
RfKill.txt
UdevDb.txt
UdevLog.txt
WifiSyslog.txt

tags: added: apport-collected precise running-unity staging
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote : Re: KVM Guests only use one CPU after host wakes up from sleep

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.6 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. Please only remove that one tag and leave the other tags. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-rc3-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
tags: added: kernel-da-key needs-upstream-testing
Michael Cook (michaelcook-mjc) wrote :

Not sure how to 'tag' a bug report but I cannot check the upstream kernel. I have to keep this laptop running U12.04 as-is.
kernel-unable-to-test-upstream

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Gerben Welter (gwelter) wrote :

I'm seeing this exact behaviour on my laptop. I have been running the Xorg-edgers PPA for a couple of months; that PPA also provides the LTS-backport kernel (the quantal kernel?), so the kernel is quite recent (3.5.x with backports). If needed, I can run the latest 3.6-rc from the mainline kernel PPA.

Do you want me to provide the logs with the backport or latest rc kernel?

Michael Cook (michaelcook-mjc) wrote :

It is still happening as of today.

Joseph Salisbury (jsalisbury) wrote :

@Gerben,

It would be great if you could test the latest mainline kernel:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/

Martin Dummer (martin-dummer) wrote :

Hi @all,

I am also seeing exactly this problem, and I do *not* use Ubuntu on my laptop; I use Gentoo Linux with kernel 3.5.4.
I have had this across several kernel versions and several qemu versions, so I assume
- the problem is not Ubuntu-specific
- there may be a dependency on the hardware and/or its ACPI implementation
- the problem is not very widespread, because this thread is the only one I found with Google's help that deals with this effect.

Michael Cook (michaelcook-mjc) wrote :

Thanks, Martin, for the confirmation on 3.5.4. I think the reason the problem exists, from a functional perspective, is that few people are probably running KVM guests on ThinkPad laptops, and those who do probably don't put their host computer into suspend/hibernate. Technically, I have no insight yet into why the KVM architecture sticks to using only one CPU after coming out of suspend. I did notice that trying to automatically pin CPUs for a guest failed with a popup dialog box complaining that there was no NUMA support.

Martin Dummer (martin-dummer) wrote :

Update:

I have now kernel 3.6.2 and qemu 1.1.2 in use and the problem is gone.

Lars Hagström (donoregano) wrote :

I've got this problem on my Dell Latitude E4200 running Precise.

Lars Hagström (donoregano) wrote :

Found the bug in Red Hat's Bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=714271

Any chance of a backport for Precise? It is an LTS release, after all...

Michael Cook (michaelcook-mjc) wrote :

Thanks! Promising that others have found/fixed/looked at this problem.

In the interim, there seems to be a 'script' which can restore cpusets after suspend/resume. I don't plan to try it, but thought I'd mention it in case Ubuntu users are looking for a workaround. https://www.redhat.com/archives/libvir-list/2012-April/msg00777.html
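
To illustrate the idea of restoring cpusets after resume, here is a minimal Python sketch. It is not the script from the linked mailing-list post, whose contents are not reproduced here. It compares a cpuset's `cpuset.cpus` file against the host's online CPUs (`/sys/devices/system/cpu/online`) and widens the cpuset if CPUs are missing. The function names are hypothetical, and the cpuset path you pass in (for example something under `/sys/fs/cgroup/cpuset/libvirt/`) is an assumption that depends on how cgroups are mounted on your host.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,5' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus):
    """Collapse CPU numbers back into kernel list syntax, e.g. '0-3,5'."""
    ranges = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def missing_cpus(online_text, cpuset_text):
    """Return the online CPUs that the cpuset does not currently include."""
    return parse_cpu_list(online_text) - parse_cpu_list(cpuset_text)


def restore_cpuset(cpuset_file, online_file="/sys/devices/system/cpu/online"):
    """Widen one cpuset.cpus file to all online CPUs; return True if changed.

    `cpuset_file` is supplied by the caller; its location is an assumption
    about the host's cgroup layout. Writing to it requires root.
    """
    with open(online_file) as handle:
        online = handle.read()
    with open(cpuset_file) as handle:
        current = handle.read()
    if not missing_cpus(online, current):
        return False
    with open(cpuset_file, "w") as handle:
        handle.write(format_cpu_list(parse_cpu_list(online)))
    return True
```

With nested cpusets, a child cannot contain CPUs its parent lacks, so the parent directory would need to be widened before the per-guest directories below it.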

Michael Cook (michaelcook-mjc) wrote :

Any update on when, and by whom, the solution can be ported to a Debian/Ubuntu release? The ability to pause and resume guests successfully on all cores would be very useful, since I run several KVM guests on a ThinkPad W520 and often need to suspend/hibernate.

penalvch (penalvch) wrote :

Michael Cook, as per http://download.lenovo.com/express/ddfm.html an update is available for your BIOS (1.41). If you update to this, does it change anything?

If not, could you please both specify what happened, and provide the output of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date

Thank you for your understanding.

tags: added: bios-outdated-1.41 regression-potential
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
summary: - KVM Guests only use one CPU after host wakes up from sleep
+ [Lenovo ThinkPad W520] KVM Guests only use one CPU after host wakes up
+ from sleep
tags: added: resume suspend
tags: added: needs-suspend-log
Michael Cook (michaelcook-mjc) wrote :

Thanks for the suggestion. I will be able to try this out by end of Nov 2013.

Michael Cook (michaelcook-mjc) wrote :

Retesting:
I tested suspending the host but this time with the latest updates to U12.04 applied to the host & guest.

HOST: Linux thinkpad-w520 3.2.0-55-generic #85-Ubuntu SMP Wed Oct 2 12:29:27 UTC 2013 x86_64 x86_64 x86_64
GUEST: Linux tpa 3.2.0-41-generic #66-Ubuntu SMP Thu Apr 25 03:27:11 UTC 2013 x86_64 x86_64 x86_64

1) I stressed the guest to see which CPUs were in use, using "stress --cpu 2".
2) I saw the VM Manager's CPU graph for this guest go solidly to 100% and CPUs 5 & 6 sit at a roughly constant 98%.
3) On resuming the host, "stress" continued to run, but the VM Manager CPU graph had spikes every ~2-3 seconds.
4) The host's system monitor showed the CPUs for this guest at a roughly constant 94%.

But two CPUs were being used by the guest instead of just one, which differs from my original issue, where all the load went to just one host CPU. I repeated the suspend several times and the guest CPU spiking did not occur again. So it looks like an improvement.
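
The observation above was made by eye in the system monitor. A minimal Python sketch of how the same check could be done from two snapshots of `/proc/stat`, taken a few seconds apart while the guest is under load, follows. The function names and the 90% threshold are arbitrary choices for illustration, not anything used in this report.

```python
def parse_proc_stat(text):
    """Map 'cpuN' to (busy_ticks, total_ticks) from /proc/stat contents."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line and non-CPU lines such as "ctxt".
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal.
        # Guest time is already counted inside user time, so it is left out.
        ticks = [int(value) for value in fields[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks)
        result[fields[0]] = (total - idle, total)
    return result


def busy_percent(before_text, after_text):
    """Per-CPU busy percentage between two /proc/stat snapshots."""
    before = parse_proc_stat(before_text)
    after = parse_proc_stat(after_text)
    usage = {}
    for cpu, (busy_after, total_after) in after.items():
        busy_before, total_before = before.get(cpu, (0, 0))
        elapsed = total_after - total_before
        usage[cpu] = 100.0 * (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return usage


def busy_cpus(before_text, after_text, threshold=90.0):
    """List the CPUs whose busy percentage is at or above `threshold`."""
    usage = busy_percent(before_text, after_text)
    return sorted((cpu for cpu, pct in usage.items() if pct >= threshold),
                  key=lambda name: int(name[3:]))
```

Reading `/proc/stat` twice with a short sleep in between and passing both texts to `busy_cpus` gives the list of host CPUs that were close to fully loaded during that interval.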

BIOS Update:
Next, as requested above, I tried to update my W520 BIOS from 1.06 (2011) to 1.42 (the latest from Lenovo, 2013).

I downloaded the ISO from Lenovo, and it booted to "Starting PC DOS", at which point it seems to have hung.
So I quickly looked into extracting and applying the BIOS image from within Ubuntu. This does not seem very straightforward either. I found one guide on ThinkWiki with a step-by-step process (and plenty of warnings) for building a bootable ISO from the "old style" BIOS image .exe. I extracted the "new" style image using innoextract, but I have not found or understood from the ThinkWiki page what to do with these files.

Any help here on updating W520 with the newer 1.42 bios ISO/.exe?

Michael Cook (michaelcook-mjc) wrote :

Further to my last comment, some real-world network testing of the applications running on the guests has shown that although the guests resume to use all CPUs the overall performance is sub-standard compared to a clean startup of the host (no suspend).
I don't have any firm numbers to share, just hands-on experience with video streams becoming very unstable after a resume, even with a VM guest reboot. A fresh reboot of the host fixed this substandard behaviour.

penalvch (penalvch) wrote :

Michael Cook, regarding the BIOS update, many have found success consulting https://help.ubuntu.com/community/BiosUpdate .
