qemu sometimes hangs on shutdown in GRUB tests

Bug #947597 reported by Colin Watson
This bug affects 1 person
Affects: qemu-kvm (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

The GRUB test suite uses qemu to test GRUB's scripting facilities: it builds a small ISO9660 image, boots it in qemu, and checks the output. This currently fails for me in precise because qemu sometimes hangs on shutdown. Here's a test image:

  http://people.canonical.com/~cjwatson/tmp/grub-breaks-qemu.iso

Run it like this:

  qemu-system-i386 -nographic -serial file:/dev/stdout -monitor file:/dev/null -hda grub-breaks-qemu.iso -boot c

It should print a load of output and exit; instead, it sometimes (on the order of 10%-50% of the time) prints the same load of output and hangs. The current version of qemu-system-i386 in Debian unstable reliably succeeds. Also, Debian's qemu with Ubuntu's seabios succeeds, and Ubuntu's qemu with Debian's seabios sometimes fails, which I think rules out a problem in seabios.
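
One rough way to put a number on "10%-50% of the time" is to run the command repeatedly under a timeout and count the runs that never exit. The following is a minimal sketch and not part of the original report: the helper name `count_hangs`, the run count, and the 60-second timeout are assumptions chosen for illustration; only the qemu command line is taken from the report.

```python
import subprocess

# Command line from the report; the ISO is assumed to be in the current directory.
QEMU_CMD = [
    "qemu-system-i386", "-nographic",
    "-serial", "file:/dev/stdout",
    "-monitor", "file:/dev/null",
    "-hda", "grub-breaks-qemu.iso",
    "-boot", "c",
]


def count_hangs(cmd, runs=20, timeout=60.0):
    """Run cmd `runs` times and return how many runs exceeded `timeout` seconds.

    A run that times out is counted as a hang. subprocess.run() kills the
    child before raising TimeoutExpired, so no stray process is left behind.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs
```

Calling `count_hangs(QEMU_CMD)` on an affected machine would then give a hang count out of 20 runs; the timeout has to be comfortably longer than a successful run for the count to mean anything.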

My system has a 64-bit kernel (3.2.0-17-generic #27) and 32-bit userspace. qemu-system-x86_64 fails intermittently in much the same way. I have an Intel Core 2 Duo, with KVM enabled in the BIOS.

gdb shows the following:

Program received signal SIGINT, Interrupt.
[Switching to Thread 0xf6c6c740 (LWP 25225)]
0xf7fdb430 in __kernel_vsyscall ()
(gdb) thread apply all bt

Thread 8 (Thread 0xda3feb40 (LWP 25237)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xda3fe2c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 7 (Thread 0xdabffb40 (LWP 25236)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xdabff2c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 6 (Thread 0xdb5ffb40 (LWP 25235)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xdb5ff2c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 5 (Thread 0xdbf69b40 (LWP 25234)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xdbf692c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 4 (Thread 0xdc76ab40 (LWP 25233)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xdc76a2c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 3 (Thread 0xf6369b40 (LWP 25232)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf78eb509 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#2 0x566e6241 in kvm_vcpu_ioctl (env=0x56ed2ee8, type=44672)
    at /build/buildd/qemu-kvm-1.0+noroms/kvm-all.c:1101
#3 0x566e6332 in kvm_cpu_exec (env=0x56ed2ee8)
    at /build/buildd/qemu-kvm-1.0+noroms/kvm-all.c:987
#4 0x566b688a in qemu_kvm_cpu_thread_fn (arg=0x56ed2ee8)
    at /build/buildd/qemu-kvm-1.0+noroms/cpus.c:740
#5 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 2 (Thread 0xf6b6ab40 (LWP 25231)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b8d13 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x5663dd59 in cond_timedwait (ts=0xf6b6a2c4, mutex=0x56cc04e4,
    cond=0x56cc0520) at posix-aio-compat.c:104
#3 aio_thread (unused=0x0) at posix-aio-compat.c:334
#4 0xf79b4d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#5 0xf78f376e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

Thread 1 (Thread 0xf6c6c740 (LWP 25225)):
#0 0xf7fdb430 in __kernel_vsyscall ()
#1 0xf79b896b in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/i386-linux-gnu/libpthread.so.0
#2 0x56654343 in qemu_cond_wait (cond=0x56cd0e80, mutex=0x56e95920)
    at qemu-thread-posix.c:113
#3 0x566b7dce in pause_all_vcpus ()
    at /build/buildd/qemu-kvm-1.0+noroms/cpus.c:881
#4 0x565859c3 in main (argc=10, argv=0xffffd704, envp=0xffffd730)
    at /build/buildd/qemu-kvm-1.0+noroms/vl.c:3525
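
In the trace above, Thread 1 is blocked in `pause_all_vcpus` while Thread 3 sits in `kvm_vcpu_ioctl` with `type=44672`. Assuming the generic Linux ioctl number layout used on x86 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction), that value can be decoded as sketched below; the helper name `decode_ioctl` is made up for illustration. Type 0xAE with number 0x80 corresponds to `KVM_RUN` in the kernel's `linux/kvm.h`, which suggests the vCPU thread was still inside the guest-run ioctl when the main thread was waiting for it to pause.

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number into its fields.

    Assumes the generic layout used on x86: bits 0-7 command number,
    bits 8-15 type ("magic"), bits 16-29 argument size, bits 30-31 direction.
    """
    return {
        "nr": request & 0xFF,
        "type": (request >> 8) & 0xFF,
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,
    }


fields = decode_ioctl(44672)  # value of `type` in frame #2 of Thread 3
print({name: hex(value) for name, value in fields.items()})
# prints {'nr': '0x80', 'type': '0xae', 'size': '0x0', 'dir': '0x0'}
```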

Apparently this is not reproducible on all systems; Serge Hallyn wasn't able to reproduce this when I asked him about it on IRC this evening.
---
ApportVersion: 1.94-0ubuntu1
Architecture: i386
DistroRelease: Ubuntu 12.04
EcryptfsInUse: Yes
KvmCmdLine: Error: command ['ps', '-C', 'kvm', '-F'] failed with exit code 1: UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
MachineType: Dell Inc. Latitude D830
Package: qemu-kvm 1.0+noroms-0ubuntu6
PackageArchitecture: i386
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-17-generic root=UUID=45b8cfe4-c971-4514-9e41-08f0592e8bf2 ro bootchart=disable quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-17.27-generic 3.2.6
Tags: precise
Uname: Linux 3.2.0-17-generic x86_64
UpgradeStatus: Upgraded to precise on 2009-12-20 (806 days ago)
UserGroups: adm admin audio cdrom dialout dip floppy fuse games kvm libvirtd lpadmin netdev plugdev powerdev sambashare sbuild scanner video
dmi.bios.date: 06/07/2007
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A02
dmi.board.name: 0HN341
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA02:bd06/07/2007:svnDellInc.:pnLatitudeD830:pvr:rvnDellInc.:rn0HN341:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude D830
dmi.sys.vendor: Dell Inc.

Colin Watson (cjwatson) wrote : BootDmesg.txt

apport information

tags: added: apport-collected precise
description: updated
Colin Watson (cjwatson) wrote : CurrentDmesg.txt

apport information

Colin Watson (cjwatson) wrote : Dependencies.txt

apport information

Colin Watson (cjwatson) wrote : Lspci.txt

apport information

Colin Watson (cjwatson) wrote : Lsusb.txt

apport information

description: updated
Colin Watson (cjwatson) wrote : ProcCpuinfo.txt

apport information

Colin Watson (cjwatson) wrote : ProcEnviron.txt

apport information

Colin Watson (cjwatson) wrote : ProcInterrupts.txt

apport information

Colin Watson (cjwatson) wrote : ProcModules.txt

apport information

Colin Watson (cjwatson) wrote : RelatedPackageVersions.txt

apport information

Colin Watson (cjwatson) wrote : UdevDb.txt

apport information

Colin Watson (cjwatson) wrote : UdevLog.txt

apport information

description: updated
Serge Hallyn (serge-hallyn) wrote :

Thanks for filing this bug. I wasn't able to reproduce this in 100+ tests with the same kernel. I will try a machine in a lab tonight, but will be offline the rest of the week.

The Debian unstable kvm has the qemu-1.0.-1 patch. I couldn't find a change to any files in the stack trace in that patch.

Changed in qemu-kvm (Ubuntu):
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote :

Tried to reproduce this on two completely different machines, with no luck.

Colin Watson (cjwatson) wrote :

A workaround appears to be to use -no-kvm, so I'm going to go ahead with that in grub2 for now. As such this means there's no rush: it can certainly wait until you're back next week.
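
The workaround amounts to adding one flag to the command line from the bug description. A small sketch of that follows; the helper `qemu_argv` is hypothetical and is not how GRUB's test suite builds its command, and only the flags themselves come from this report.

```python
def qemu_argv(image, use_kvm=True):
    """Build the qemu command line from the bug description.

    With use_kvm=False, the -no-kvm flag named in the comment above is
    appended so that qemu does not use KVM acceleration.
    """
    argv = [
        "qemu-system-i386", "-nographic",
        "-serial", "file:/dev/stdout",
        "-monitor", "file:/dev/null",
        "-hda", image,
        "-boot", "c",
    ]
    if not use_kvm:
        argv.append("-no-kvm")
    return argv


print(" ".join(qemu_argv("grub-breaks-qemu.iso", use_kvm=False)))
```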

Would it help if you had remote access to my laptop? (This will be easier if you have IPv6.)

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 947597] Re: qemu sometimes hangs on shutdown in GRUB tests

Quoting Colin Watson (<email address hidden>):
> A workaround appears to be to use -no-kvm, so I'm going to go ahead with
> that in grub2 for now. As such this means there's no rush: it can
> certainly wait until you're back next week.
>
> Would it help if you had remote access to my laptop?

Yes, that would help.

Just to be sure, have you tried debootstrapping a new precise rootfs
and chrooting to it to run the test? I'm wondering whether you might
have some leftover toolchain changes from some other work...

> (This will be easier if you have IPv6.)

I can try setting up ipv6... So long as it doesn't require any changes
to infrastructure, as I'm at a location (for a few weeks) where I don't
control the infrastructure.

Serge Hallyn (serge-hallyn) wrote :

Hi Colin,

is it still possible for me to get access to your laptop?

Serge Hallyn (serge-hallyn) wrote :

Colin,

do you still see this happen? Did you ever reproduce it in a clean chroot?

Changed in qemu-kvm (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for qemu-kvm (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Expired
Colin Watson (cjwatson) wrote :

Sorry for failing to respond. I can no longer reproduce this with my original test image, so I'll assume that it's been fixed somewhere along the way. I'll drop the patch from GRUB and see if the buildds concur.
