kvm brings Oneiric host to a grinding halt

Bug #973536 reported by Juerg Haefliger
This bug affects 1 person
Affects            Status     Importance  Assigned to  Milestone
linux (Ubuntu)     Invalid    Undecided   Unassigned
qemu-kvm (Ubuntu)  Confirmed  Medium      Unassigned

Bug Description

I'm running a Windows 7 guest in KVM via libvirt/virt-manager on a 64-bit Oneiric desktop. Every once in a while (2-3 times a week, sometimes daily) the CPU usage goes berserk and the host becomes unresponsive to the point of being unusable. This also happened once with a Breezy guest, so it's not strictly a Windows issue. I'm just seeing it more often with Windows because that's what I run daily.

More information is attached. Let me know what other info I should collect the next time the machine is in that state.
---
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
DistroRelease: Ubuntu 11.10
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
KvmCmdLine:
 UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
 116 2892 1 13 327807 1062024 0 07:34 ? 00:49:31 /usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name Windows7 -uuid 63a736b8-2d14-4354-8974-83c3f0d71ea1 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/Windows7.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=readline -rtc base=localtime -boot c -drive file=/home/juergh/VirtualMachines/Windows7.img,if=none,id=drive-virtio-disk0,boot=on,format=raw,cache=none,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=19,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:56:6f:49,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -usb -vnc 127.0.0.1:0 -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
MachineType: Hewlett-Packard HP EliteBook 8530w
NonfreeKernelModules: nvidia
Package: qemu-kvm 0.14.1+noroms-0ubuntu6.2
PackageArchitecture: amd64
PccardctlIdent:
 Socket 0:
   product info: "RICOH", "Bay8Controller", "", ""
   manfid: 0x0000, 0x0000
   function: 254 (unknown)
PccardctlStatus:
 Socket 0:
   3.3V 16-bit PC Card
   Subdevice 0 (function 0) bound to driver "pata_pcmcia"
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-17-generic root=UUID=076a3371-4882-4e13-a456-9855c6fb4caf ro
ProcVersionSignature: Ubuntu 3.0.0-17.30-generic 3.0.22
Tags: oneiric running-unity
Uname: Linux 3.0.0-17-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm admin cdrom dialout libvirtd lpadmin mock plugdev sambashare
dmi.bios.date: 12/08/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68PDV Ver. F.11
dmi.board.name: 30E7
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 90.26
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr68PDVVer.F.11:bd12/08/2009:svnHewlett-Packard:pnHPEliteBook8530w:pvrF.11:rvnHewlett-Packard:rn30E7:rvrKBCVersion90.26:cvnHewlett-Packard:ct10:cvr:
dmi.product.name: HP EliteBook 8530w
dmi.product.version: F.11
dmi.sys.vendor: Hewlett-Packard

Revision history for this message
Juerg Haefliger (juergh) wrote :
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for reporting this bug.

After this happens, do you have to reset the system, or does it get better after a while?

Is it kvm itself, or libvirt, which is taking the cpu time?

Do you usually have only one vm up when this is happening?

Is there anything interesting in syslog?

Changed in qemu-kvm (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Please also run 'apport-collect 973536'.

Revision history for this message
Juerg Haefliger (juergh) wrote :

Once in this state, the system doesn't recover by itself. When I kill kvm, things seem to go back to normal, but if I start it up again the CPU usage of the same couple of processes goes up again, almost immediately. Only a reboot fixes it so that it stays stable for a longer amount of time.

Only one VM at a time.

Syslog is clean.

I'll run apport-collect next time I hit it.

Revision history for this message
Juerg Haefliger (juergh) wrote : BootDmesg.txt

apport information

tags: added: apport-collected oneiric running-unity
description: updated
Revision history for this message
Juerg Haefliger (juergh) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : Dependencies.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : Lspci.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : Lsusb.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : ProcEnviron.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : ProcModules.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : RelatedPackageVersions.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : UdevDb.txt

apport information

Revision history for this message
Juerg Haefliger (juergh) wrote : UdevLog.txt

apport information

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for the info.

dmesg shows

[22149.060532] EXT4-fs (md0): Unaligned AIO/DIO on inode 13632016 by kvm; performance will be poor.
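
For anyone who wants to check their own dmesg output for the same warning, here is a minimal sketch that pulls the device, inode and process name out of a line like the one above. The regular expression is written against this one quoted message only, and the function and field names are made up for illustration.

```python
import re

# Pattern modelled on the single warning line quoted in this report.
# The group names are illustrative, not anything the kernel defines.
UNALIGNED_RE = re.compile(
    r"\[\s*(?P<timestamp>\d+\.\d+)\]\s+EXT4-fs \((?P<device>[^)]+)\): "
    r"Unaligned AIO/DIO on inode (?P<inode>\d+) by (?P<process>[^;]+);"
)


def parse_unaligned_warning(line):
    """Return the fields of an ext4 'Unaligned AIO/DIO' line, or None."""
    match = UNALIGNED_RE.search(line)
    if match is None:
        return None
    return {
        "timestamp": float(match.group("timestamp")),
        "device": match.group("device"),
        "inode": int(match.group("inode")),
        "process": match.group("process"),
    }


if __name__ == "__main__":
    sample = (
        "[22149.060532] EXT4-fs (md0): Unaligned AIO/DIO on inode "
        "13632016 by kvm; performance will be poor."
    )
    print(parse_unaligned_warning(sample))
```

Feeding it each line of saved dmesg output and counting the non-None results would show how often the warning fires and for which process.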

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Marked this as affecting linux in the hopes that kernel folks might know what

[22149.060532] EXT4-fs (md0): Unaligned AIO/DIO on inode 13632016 by kvm; performance will be poor.

means
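
As far as I understand it, ext4 prints this warning when an asynchronous direct I/O request does not start and end on a filesystem block boundary, and it then serializes such requests. The sketch below only illustrates that alignment test. The 4096-byte block size and the example offsets are assumptions, not values taken from this report, and the function name is made up.

```python
def is_block_aligned(offset, length, block_size=4096):
    """Return True if a request starts and ends on a block boundary.

    block_size must be a power of two; 4096 is an assumed ext4 block
    size, not something confirmed for the filesystem in this report.
    """
    mask = block_size - 1
    return (offset & mask) == 0 and ((offset + length) & mask) == 0


if __name__ == "__main__":
    # Hypothetical guest write starting at 512-byte sector 63:
    # aligned for a 512-byte-sector disk, but not for 4096-byte blocks.
    print(is_block_aligned(63 * 512, 4096))
    # Hypothetical guest write starting at sector 2048 (1 MiB).
    print(is_block_aligned(2048 * 512, 8192))
```

A guest that works in 512-byte sectors can easily produce requests of the first kind against a raw image file, which would be one plausible source of the message.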

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Note: as a workaround (assuming the error message explains your problem), I believe you should be able to run the Windows guest from a virtual drive that is not on a RAID disk, assuming you have one.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Juerg Haefliger (juergh) wrote :

Good catch. I missed the 'Unaligned AIO/DIO' message. However, why would this cause other processes to start hogging the CPU? Also, I created a new Windows image and I still get the 'Unaligned AIO/DIO' message, but the system does not get into the same state where multiple processes consume the CPU.

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I'm really not certain, but I would guess other processes would start hogging the cpu because their writes are being serialized with bulk writes by the guest?

It seems some filesystems (XFS, ext3?) should be immune to this, so if you have a spare partition you could try using a different underlying filesystem for the guest backing file. An LVM logical volume would also work.

penalvch (penalvch)
tags: added: bios-outdated-f.20 needs-upstream-testing
Revision history for this message
penalvch (penalvch) wrote :

Juerg Haefliger, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.13-rc3

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Juerg Haefliger (juergh)
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Brad Figg (brad-figg)
tags: added: cscc