KVM guest is always terminated due to unknown reason

Bug #1126539 reported by GONG-YI LIAO
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The latest 32-bit kernel (at the moment, 3.8.0-6-generic) always terminates the QEMU-KVM guest.

I am running a qemu-kvm guest on a 32-bit laptop (CPU: T2300, Centrino Duo), and the guest always crashes (virsh shows it as "paused") with the following log:

------------------------
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/2
KVM: entry failed, hardware error 0x80000021

If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.

EAX=00000000 EBX=00195e13 ECX=fffff000 EDX=fffff000
ESI=00000000 EDI=00000000 EBP=f71eaf44 ESP=f6c31f90
EIP=c1022487 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =00d8 35d5b000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =00e0 f71ef980 00000018 00409100 DPL=0 DS [--A]
LDT=0000 ffff0000 f0000fff 00f0ff00 DPL=3 CS64 [CRA]
TR =0080 f71ed7c0 0000206b 00008b00 DPL=0 TSS32-busy
GDT= f71e8000 000000ff
IDT= c13ec000 000007ff
CR0=8005003b CR2=ffffffff CR3=0149d000 CR4=000006f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000700000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000800
Code=ff ff 89 10 c3 8b 15 c8 42 3f c1 8d 84 10 00 c0 ff ff 8b 00 <c3> 8b 15 5c 3c 3f c1 53 89 c3 b8 30 00 00 00 ff 92 9c 00 00 00 3c 13 77 0c a1 e4 3e 42 c1
qemu: terminating on signal 15 from pid 2825
------------------------------------

However, with exactly the same settings on another laptop with a 64-bit CPU running 64-bit Ubuntu Raring, the qemu-kvm guest runs without any problems.

I also tried the older kernel from Quantal (3.5.0-23-generic) on the 32-bit laptop, and the qemu-kvm guest runs without any problems there as well.
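
For readers trying to interpret the `hardware error 0x80000021` line in the log above, here is a minimal sketch of how that value can be decoded. It assumes the number is the raw VMX exit-reason field as laid out in the Intel SDM (bit 31 flags a VM-entry failure, bits 15:0 hold the basic exit reason). The function name and the small lookup table are illustrative and not part of QEMU or KVM.

```python
# Sketch: decode the value printed by "KVM: entry failed, hardware error ...".
# Assumption: the value is the raw VMX exit-reason field (Intel SDM layout).

VM_ENTRY_FAILURE_BIT = 1 << 31

# Only the VM-entry failure reasons are listed here; all others are omitted.
BASIC_EXIT_REASONS = {
    33: "VM-entry failure due to invalid guest state",
    34: "VM-entry failure due to MSR loading",
    41: "VM-entry failure due to machine-check event",
}


def decode_hardware_error(value):
    """Return (entry_failed, basic_reason_number, description)."""
    entry_failed = bool(value & VM_ENTRY_FAILURE_BIT)
    basic = value & 0xFFFF
    return entry_failed, basic, BASIC_EXIT_REASONS.get(basic, "unknown")


print(decode_hardware_error(0x80000021))
```

Under that assumption, 0x80000021 reads as a failed VM entry with basic reason 33 (invalid guest state), which agrees with the hint QEMU prints right after the error line.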
---
ApportVersion: 2.8-0ubuntu4
Architecture: i386
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/hwC0D1', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D6c', '/dev/snd/pcmC0D6p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: LinuxMint 14
HibernationDevice: RESUME=UUID=feac92f1-d692-420d-8e50-8490d521728f
InstallationDate: Installed on 2012-12-02 (75 days ago)
InstallationMedia: Linux Mint 14 "Nadia" - Release i386 (20121120)
Lsusb:
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: TOSHIBA Satellite M100
MarkForUpload: True
Package: linux (not installed)
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-6-generic root=UUID=61f57f78-9e07-4c28-b44d-58f8fe8e73b8 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.8.0-6.13-generic 3.8.0-rc7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.8.0-6-generic N/A
 linux-backports-modules-3.8.0-6-generic N/A
 linux-firmware 1.102
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: yes
Tags: nadia
Uname: Linux 3.8.0-6-generic i686
UnreportableReason: This is not an official LinuxMint package. Please remove any third party package and try again.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip fuse libvirtd lpadmin plugdev sambashare sudo
dmi.bios.date: 09/20/2006
dmi.bios.vendor: TOSHIBA
dmi.bios.version: V2.20
dmi.board.name: HAQAA
dmi.board.vendor: TOSHIBA
dmi.board.version: Null
dmi.chassis.asset.tag: *
dmi.chassis.type: 10
dmi.chassis.vendor: TOSHIBA
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnTOSHIBA:bvrV2.20:bd09/20/2006:svnTOSHIBA:pnSatelliteM100:pvrPSMA0T-04J01Y:rvnTOSHIBA:rnHAQAA:rvrNull:cvnTOSHIBA:ct10:cvrN/A:
dmi.product.name: Satellite M100
dmi.product.version: PSMA0T-04J01Y
dmi.sys.vendor: TOSHIBA

description: updated
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1126539

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: raring
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. It would be very helpful to know the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:

v3.6 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/
v3.7 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-raring/
v3.8-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc1-raring/
v3.8-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc4-raring/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!

tags: added: regression-release
Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: performing-bisect
GONG-YI LIAO (gongyi-liao-gmail) wrote : apport information

Attached files (one per comment): AlsaInfo.txt, BootDmesg.txt, CRDA.txt, CurrentDmesg.txt, IwConfig.txt, Lspci.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

tags: added: apport-collected nadia
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
GONG-YI LIAO (gongyi-liao-gmail) wrote :

I have done a bisection-style test across the mainline kernel builds; the last working kernel is v3.5.7.5-quantal ( http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5.7.5-quantal/ ).

All kernels later than v3.5.7.5-quantal (starting with v3.6-quantal) reproduce this bug.

Joseph Salisbury (jsalisbury) wrote :

Before I start the kernel bisect, can you test the latest mainline kernel[0] to see if this bug is already resolved there?

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-raring/

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
GONG-YI LIAO (gongyi-liao-gmail) wrote :

Yes, I have tried the kernel available from the mainline kernel PPA, as you mentioned.
I have done a bisection search across the mainline kernels; the last working kernel is 3.5.7.5-quantal at http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5.7.5-quantal/

Five minutes ago I tried the kernel you mentioned in your last message, and the bug is still reproducible with that kernel.

Joseph Salisbury (jsalisbury) wrote :

Can you test the v3.6-rc1 kernel?
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-rc1-quantal/

We know that v3.5 final is the last good kernel. We also need to know the first bad kernel that is linear to v3.5. If v3.6-rc1 is good, we would want to test v3.6-rc2 and so on.

GONG-YI LIAO (gongyi-liao-gmail) wrote :

Yes, I have tested almost all of them, and I am sure that none of the 3.6 and later kernels work.

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v3.5 final and v3.6-rc1. The kernel bisect will require testing of about 7-10 test kernels.

I built the first test kernel, up to the following commit:
614a6d4341b3760ca98a1c2c09141b71db5d1e90

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1126539

Can you test that kernel and report back whether it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance
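
As a rough illustration of why a bisect needs only a handful of test kernels, the sketch below runs a binary search over an ordered list of commits. It is a simplification: `git bisect` works on the real commit graph rather than a flat list, and the `first_bad` function, the integer "commits", and the `is_bad` predicate are all hypothetical stand-ins for "build this kernel, boot it, and see whether the guest crashes".

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_tests_performed).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi], tests


# Hypothetical history: 1025 commits, the regression introduced at index 700.
history = list(range(1025))
print(first_bad(history, lambda commit: commit >= 700))
```

Each test halves the remaining range, so the number of test kernels grows with the base-2 logarithm of the number of candidate commits. An estimate of 7 to 10 test kernels corresponds to roughly 128 to 1024 candidates in this simplified linear model.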

GONG-YI LIAO (gongyi-liao-gmail) wrote :

Those kernel debs are for amd64 machines, but the machine I used is an ia32 machine.

Joseph Salisbury (jsalisbury) wrote :

Are you sure it's an ia32 machine? Your /proc/cpuinfo indicates amd64:
https://launchpadlibrarian.net/131346392/ProcCpuinfo.txt

Joseph Salisbury (jsalisbury) wrote :

Hmm, actually it looks like you are running i686, is that correct?

GONG-YI LIAO (gongyi-liao-gmail) wrote :

Yes, I am running an i686 laptop.
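
One way to settle an i686 versus amd64 question from a /proc/cpuinfo dump is to look at the `flags` line: on x86 Linux the `lm` flag (long mode) marks a 64-bit-capable CPU, and `vmx` marks Intel VT-x. The sketch below is only an illustration; the function names are made up here, and the sample text is invented for the example rather than taken from the ProcCpuinfo.txt attached to this bug.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def describe(cpuinfo_text):
    """Summarise 64-bit capability and Intel VT-x support from the flags."""
    flags = cpu_flags(cpuinfo_text)
    return {"64bit_capable": "lm" in flags, "intel_vt": "vmx" in flags}


# Invented sample, NOT the reporter's file.
sample = "model name\t: example cpu\nflags\t\t: fpu vme pae nx vmx\n"
print(describe(sample))
```

On a live machine the same check could be run against the contents of /proc/cpuinfo. Note that a 64-bit-capable CPU can still run a 32-bit (i686) kernel, so the flags describe the hardware and not the installed architecture.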

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

I also uploaded an i686 kernel from the bisect.

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1126539

Can you test that kernel and report back whether it has the bug or not? I will build the next test kernel based on your test results.

Thanks in advance

GONG-YI LIAO (gongyi-liao-gmail) wrote :

This kernel does not work:

---------------------------------------------------------------------------------------
,id=balloon0,bus=pci.0,addr=0x5
char device redirected to /dev/pts/0 (label charserial0)
KVM: entry failed, hardware error 0x80000021

If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.

EAX=00000000 EBX=00195e21 ECX=fffff000 EDX=fffff000
ESI=00000000 EDI=00000000 EBP=f71eaf44 ESP=f6c2ff90
EIP=c1022487 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =00d8 35d5b000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =00e0 f71ef980 00000018 00409100 DPL=0 DS [--A]
LDT=0000 ffff0000 f0000fff 00f0ff00 DPL=3 CS64 [CRA]
TR =0080 f71ed7c0 0000206b 00008b00 DPL=0 TSS32-busy
GDT= f71e8000 000000ff
IDT= c13ec000 000007ff
CR0=8005003b CR2=ffffffff CR3=0149d000 CR4=000006b0
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
DR6=ffff0ff0 DR7=00000400
EFER=0000000000000000
Code=ff ff 89 10 c3 8b 15 c8 42 3f c1 8d 84 10 00 c0 ff ff 8b 00 <c3> 8b 15 5c 3c 3f c1 53 89 c3 b8 30 00 00 00 ff 92 9c 00 00 00 3c 13 77 0c a1 e4 3e 42 c1

------------------------------------------------------------------------------

BTW, this kernel's nfsv4 kernel module is missing.
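
The register dump in this comment is almost identical to the one in the bug description, so a small script can help spot what actually changed between two such dumps. The sketch below is a rough helper with hypothetical names (`parse_dump`, `diff_dumps`). It picks up only the first hex value after each `NAME=` and compares values numerically, so different zero-padding does not count as a difference.

```python
import re

# Matches NAME=hex or NAME =hex; only the first value after '=' is captured.
PAIR = re.compile(r"\b([A-Z][A-Z0-9]*)\s*=\s*([0-9a-f]+)\b")


def parse_dump(text):
    """Map register names to integer values from a QEMU-style register dump."""
    return {name: int(value, 16) for name, value in PAIR.findall(text)}


def diff_dumps(first, second):
    """Return {name: (value_in_first, value_in_second)} for differing registers."""
    a, b = parse_dump(first), parse_dump(second)
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


# Excerpts copied from the two dumps in this report.
first = "CR0=8005003b CR2=ffffffff CR3=0149d000 CR4=000006f0\nEFER=0000000000000800"
second = "CR0=8005003b CR2=ffffffff CR3=0149d000 CR4=000006b0\nEFER=0000000000000000"
for name, (x, y) in diff_dumps(first, second).items():
    print(name, hex(x), hex(y))
```

On these excerpts the differences are CR4 (0x6f0 versus 0x6b0, which is bit 6, CR4.MCE) and EFER (0x800 versus 0, which is bit 11, EFER.NXE). Whether either difference is related to the failure is not established anywhere in this report.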

GONG-YI LIAO (gongyi-liao-gmail) wrote :

BTW, the latest Raring kernel (3.8.0-8-generic) does not work either; it gives exactly the same error log as shown in my last post.

Joseph Salisbury (jsalisbury) wrote :

Based on your comment in #27, I marked the kernel from comment #26 as bad and updated the bisect.

I built the next test kernel, up to the following commit:
320f5ea0cedc08ef65d67e056bcb9d181386ef2c

The test kernel can be downloaded from:
http://people.canonical.com/~jsalisbury/lp1126539

Can you test that kernel and report back whether it has the bug or not? I will build the next test kernel based on your test results.

One thing to note, you will need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance

vladimir (mazovladimir)
Changed in linux (Ubuntu):
status: Confirmed → New
status: New → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired