[regression] iPXE kills kvm with KVM: entry failed, hardware error 0x80000021
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | | Critical | Joseph Salisbury | |
Quantal | | Critical | Joseph Salisbury | |
Bug Description
This is a regression - I was using iPXE last night just fine on the same hardware (an i7-860).
I've got a guest (xml attached) with no discs that I'm trying to netboot, but as soon as iPXE starts I see:
iPXE (PCI 00:03.0) starting execution...
and it stops - the libvirt log shows:
KVM: entry failed, hardware error 0x80000021
If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.
EAX=00000011 EBX=00000000 ECX=00000030 EDX=00007baa
ESI=e007f50a EDI=0003b890 EBP=00000000 ESP=00007baa
EIP=00000382 EFL=00010006 [-----P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0030 0009c6f0 ffffffff 0000f300 DPL=3 DS16 [-WA]
CS =9be4 0009be40 0000ffff 00009b00 DPL=0 CS16 [-RA]
SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
DS =0030 0009c6f0 ffffffff 0000f300 DPL=3 DS16 [-WA]
FS =0030 0009c6f0 ffffffff 0000f300 DPL=3 DS16 [-WA]
GS =0030 0009c6f0 ffffffff 0000f300 DPL=3 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 0009c730 00000037
IDT= 00000000 0000ffff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000000
DR6=00000000fff
EFER=0000000000
Code=66 0f 01 16 40 00 66 0f 01 1e 78 00 0f 20 c0 0c 01 0f 22 c0 <66> ea a4 00 00 00 08 00 0f 20 c0 24 fe 0f 22 c0 ff 2e 7e 00 2e a1 08 08 8e d8 8e c0 8e e0
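The dump above lists each segment register as selector, base, limit and flags. As a reading aid (not part of the original report), here is a minimal Python sketch that parses those lines and flags 16-bit segments whose limit is above 0xffff, which is the pattern usually associated with the "big real mode" the QEMU message mentions. The names `parse_segment` and `looks_like_big_real_mode` are hypothetical, and the field layout is assumed from the dump format shown here.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    selector: int
    base: int
    limit: int
    flags: int
    kind: str  # e.g. "DS16", "CS16", "CS32"; empty when the dump omits it


# Assumed layout, taken from lines such as:
#   ES =0030 0009c6f0 ffffffff 0000f300 DPL=3 DS16 [-WA]
_SEG_RE = re.compile(
    r"^(?P<name>[A-Z]{2,3})\s*=\s*"
    r"(?P<sel>[0-9a-fA-F]{4})\s+"
    r"(?P<base>[0-9a-fA-F]{8})\s+"
    r"(?P<limit>[0-9a-fA-F]{8})\s+"
    r"(?P<flags>[0-9a-fA-F]{8})"
    r"(?:\s+DPL=\d\s+(?P<kind>\S+))?"
)


def parse_segment(line: str) -> Optional[Segment]:
    """Parse one segment line of the dump; return None for other lines."""
    m = _SEG_RE.match(line.strip())
    if m is None:
        return None
    return Segment(
        name=m["name"],
        selector=int(m["sel"], 16),
        base=int(m["base"], 16),
        limit=int(m["limit"], 16),
        flags=int(m["flags"], 16),
        kind=m["kind"] or "",
    )


def looks_like_big_real_mode(seg: Segment) -> bool:
    """Heuristic only: a 16-bit segment with a limit above 64 KiB."""
    return seg.kind.endswith("16") and seg.limit > 0xFFFF
```

Applied to the dump above, the `ES` line is flagged (16-bit, limit `ffffffff`) while the `CS` line is not (limit `0000ffff`).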
ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: qemu-kvm 1.1~rc+
ProcVersionSign
Uname: Linux 3.5.0-13-generic x86_64
ApportVersion: 2.5.1-0ubuntu4
Architecture: amd64
Date: Sun Sep 2 17:09:41 2012
InstallationMedia: Kubuntu 12.10 "Quantal Quetzal" - Alpha amd64 (20120717)
MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
ProcEnviron:
LANGUAGE=en_GB:en
TERM=xterm
PATH=(custom, no user)
LANG=en_GB.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: qemu-kvm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/10/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: P1.50
dmi.board.name: P55M Pro
dmi.board.vendor: ASRock
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: To Be Filled By O.E.M.
dmi.product.
dmi.sys.vendor: To Be Filled By O.E.M.
Dave Gilbert (ubuntu-treblig) wrote : | #1 |
Dave Gilbert (ubuntu-treblig) wrote : | #2 |
Dave Gilbert (ubuntu-treblig) wrote : | #3 |
Dave Gilbert (ubuntu-treblig) wrote : | #4 |
This looks like it's kernel not qemu:
works when the host is running 3.5.0-12-generic #12-Ubuntu SMP Fri Aug 24 18:28:43
fails with 3.5.0-13-generic #14-Ubuntu SMP Wed Aug 29 16:48:44
affects: | qemu-kvm (Ubuntu) → linux (Ubuntu) |
Dave Gilbert (ubuntu-treblig) wrote : | #5 |
Works correctly with the linux-headers-
Changed in linux (Ubuntu): | |
status: | New → Confirmed |
tags: | added: regression-release |
Changed in linux (Ubuntu): | |
importance: | Undecided → Critical |
James Page (james-page) wrote : | #6 |
I picked this up when trying to run iSCSI tests for Quantal beta-1; reverting to the previous kernel version fixed the problem.
For information I can reproduce this on demand using the scripts that I use for iscsi root testing.
tags: | added: release-q-incoming |
tags: | added: kernel-key |
tags: | added: kernel-fixed-upstream |
Changed in linux (Ubuntu): | |
status: | Confirmed → Triaged |
Joseph Salisbury (jsalisbury) wrote : | #7 |
I'd like to perform a "Reverse" bisect to identify the commit that fixed this bug in v3.6-rc4. Would it be possible for folks affected by this bug to assist in the reverse bisect by testing some kernels?
If so, we first need to identify which release candidate in v3.6 introduced the fix. If you can reproduce this bug, it would be great if you can test the following kernels and report back the first release candidate that fixes the bug:
v3.6-rc1: http://
v3.6-rc2: http://
v3.6-rc3: http://
You will need to install both the linux-image and linux-image-extra .deb packages.
There is no need to test all of those kernels, just up until you find the first one that does not have the bug.
Thanks in advance!
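The procedure requested above amounts to a linear scan over the release candidates that stops at the first kernel without the bug. A minimal Python sketch of that idea follows; `has_bug` is a hypothetical callback standing in for the manual step of booting a kernel and trying to reproduce the failure, and nothing here is an existing Ubuntu tool.

```python
from typing import Callable, Optional, Sequence


def first_fixed(
    kernels: Sequence[str],
    has_bug: Callable[[str], bool],
) -> Optional[str]:
    """Return the first kernel, in release order, that no longer shows the bug.

    Testing stops as soon as a fixed kernel is found, so later
    kernels in the list are never tried.
    """
    for kernel in kernels:
        if not has_bug(kernel):
            return kernel
    return None  # every kernel in the list still has the bug
```

A call might look like `first_fixed(["v3.6-rc1", "v3.6-rc2", "v3.6-rc3"], has_bug)`, with `has_bug` answered by hand after each boot.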
Tim Gardner (timg-tpi) wrote : | #8 |
Joe - The big difference from 3.5.0-12 to 3.5.0-13 is the rebase from stable v3.5.2 to v3.5.3. These commits from v3.5.3 jump out at me as possible root cause:
KVM: VMX: Fix KVM_SET_SREGS with big real mode segments
KVM: x86 emulator: fix byte-sized MOVZX/MOVSX
KVM: VMX: Fix ds/es corruption on i386 with preemption
KVM: x86: apply kvmclock offset to guest wall clock time
KVM: PIC: call ack notifiers for irqs that are dropped form irr
Joseph Salisbury (jsalisbury) wrote : | #9 |
Thanks, Tim.
Can folks first test the following two kernels instead of the ones mentioned in comment #7?
v3.5.2: http://
v3.5.3: http://
I can then build some test kernels if we find that the bug is fixed in v3.5.3.
Albert Damen (albrt) wrote : | #10 |
I am seeing this same error message when I start kvm with an empty disk attached, right when PXE kicks in. With kernel v3.6-rc3 it works fine; with v3.5.3 it still fails.
I found a comment from the upstream developer at http://<email address hidden>
"We'll try to get emulate_
3.6-rc3 indeed has emulate_
Unloading kvm_intel and reloading it with emulate_
Dave, could you test if setting emulate_
If that works for Dave as well, I guess the winning upstream commit is "KVM: VMX: Emulate invalid guest state by default" (a27685c33a on main tree from kernel.org).
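Parameters of a loaded kernel module are exposed under `/sys/module/<module>/parameters/`, so the current setting can be checked without reloading anything. A minimal Python sketch follows. The parameter name `emulate_invalid_guest_state` is an assumption inferred from the commit title quoted above (the name is truncated in this report), and `read_module_param` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def read_module_param(
    module: str,
    param: str,
    sysfs_root: str = "/sys/module",
) -> Optional[str]:
    """Return the current value of a module parameter, or None if unavailable.

    None covers both "module not loaded" and "parameter not exposed".
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


# Example (assumed parameter name, see the note above):
#   read_module_param("kvm_intel", "emulate_invalid_guest_state")
```

The `sysfs_root` argument exists only so the helper can be pointed at a test directory; on a real host the default applies.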
Dave Gilbert (ubuntu-treblig) wrote : | #11 |
Albert:
Still on 3.5.0-13, if I use the emulate_
although if I leave it to fall through iPXE to the DVD boot it fails differently below.
Dave
KVM internal error. Suberror: 1
emulation failure
EAX=00000000 EBX=0000a715 ECX=00000000 EDX=00004e8c
ESI=000057fc EDI=000058c8 EBP=00007b9e ESP=00007b9e
EIP=0000a72b EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 0000f300 DPL=3 DS16 [-WA]
CS =0020 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0000 00000000 0000ffff 0000f300 DPL=3 DS16 [-WA]
DS =0000 00000000 0000ffff 0000f300 DPL=3 DS16 [-WA]
FS =0000 00000000 00000000 00000000
GS =0000 00000000 00000000 00000000
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 feffd000 00002088 00008b00 DPL=0 TSS32-busy
GDT= 0000aa80 0000002f
IDT= 000030b8 000007ff
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000000
DR6=00000000fff
EFER=0000000000
Code=29 02 74 01 fb ff 55 2e 66 bb 26 9d eb 20 31 c0 8e e0 8e e8 <0f> 00 d0 b0 28 8e c0 8e d8 8e d0 b0 08 0f 00 d8 8b 25 bc ae 00 00 89 e8 ff e3 fa fc 89 25
qemu: terminating on signal 15 from pid 1699
Dave Gilbert (ubuntu-treblig) wrote : | #12 |
Joseph:
From comment #9
3.5.2 version works
3.5.3 fails
Dave
Albert Damen (albrt) wrote : | #13 |
As both the error message and the commit message refer to "big real mode", I tried a locally built 3.5.3 kernel with commit "KVM: VMX: Fix KVM_SET_SREGS with big real mode segments" reverted.
With an unmodified locally built 3.5.3 I could reproduce the original error.
With the locally built 3.5.3 with the commit reverted, the VM boots fine. It passes PXE and boots successfully from CD-ROM (server beta1 ISO). The installed server also boots fine with the revert.
NB: I left the module parameters at their default values, so especially emulate_
tags: | removed: kernel-key |
tags: | added: kernel-da-key |
summary: |
- iPXE kills kvm with KVM: entry failed, hardware error 0x80000021
+ [regression] iPXE kills kvm with KVM: entry failed, hardware error 0x80000021
Joseph Salisbury (jsalisbury) wrote : | #14 |
Thanks for testing, Dave. I will start a bisect between 3.5.2 and 3.5.3. I'll post a test kernel shortly.
Joseph Salisbury (jsalisbury) wrote : | #15 |
I just re-read comment #13. I'll first build a Quantal test kernel with commit "KVM: VMX: Fix KVM_SET_SREGS with big real mode segments" reverted.
Thanks for finding that, Albert!
I'll post it shortly.
Changed in linux (Ubuntu Quantal): | |
assignee: | nobody → Joseph Salisbury (jsalisbury) |
status: | Triaged → In Progress |
Joseph Salisbury (jsalisbury) wrote : | #16 |
I built a Quantal test kernel with commit b398aa31 reverted. The kernel is available at:
http://
Can folks affected by this bug test that kernel, and post back if it fixes the issue or not?
Thanks in advance!
Dave Gilbert (ubuntu-treblig) wrote : | #17 |
Hi Joseph,
Linux major 3.5.0-15-generic #22~lp1045027v1 SMP Thu Sep 20 19:26:02 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
That seems to work; Thanks!
I guess one thought is that since 3.6-rc3 works fine, it would probably be better to pick whatever fixed it from the newer one.
Dave
Albert Damen (albrt) wrote : | #18 |
linux-image-
Joseph Salisbury (jsalisbury) wrote : | #19 |
Thanks for testing, Dave and Albert.
I'll take a look at the 3.6-rc3 changelog and see if I can spot the fix for this.
Joseph Salisbury (jsalisbury) wrote : | #20 |
Just to confirm, can you test upstream v3.6-rc2[0] and confirm it has the bug, then test v3.6-rc3[1] and confirm it is fixed?
If that is the case, I can perform a reverse bisect to identify the commit that fixes the bug.
[0] http://
[1] http://
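A reverse bisect is an ordinary binary search with the roles of good and bad swapped: it looks for the first commit where the bug disappears rather than the first where it appears. The Python sketch below assumes an ordered list of commits in which every commit before the fix shows the bug and every commit from the fix onward does not. `is_fixed` is a hypothetical stand-in for building and testing a kernel at that commit; this is an illustration of the search, not the tooling actually used here.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def first_fixing_commit(
    commits: Sequence[T],
    is_fixed: Callable[[T], bool],
) -> Optional[T]:
    """Binary-search for the first commit at which the bug is gone.

    Assumes commits are in history order and that the bug, once
    fixed, stays fixed. Returns None if even the last commit is bad.
    """
    if not commits or not is_fixed(commits[-1]):
        return None
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid        # fix is at mid or earlier
        else:
            lo = mid + 1    # fix is after mid
    return commits[lo]
```

Each step halves the remaining range, so a range of several hundred commits needs on the order of ten test kernels.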
Dave Gilbert (ubuntu-treblig) wrote : | #21 |
nack - both 3.6-rc2 and 3.6-rc3 work.
Linux major 3.6.0-030600rc3
Linux major 3.6.0-030600rc2
Dave
Dave Gilbert (ubuntu-treblig) wrote : | #22 |
and 3.6-rc1 is good as well:
Linux major 3.6.0-030600rc1
Dave
Joseph Salisbury (jsalisbury) wrote : | #23 |
@Dave,
Thanks for testing. That is an interesting test result, since the commit that causes this bug in v3.5.3 has also been applied against v3.6-rc1:
commit b246dd5df139501
Author: Orit Wasserman <email address hidden>
Date: Thu May 31 14:49:22 2012 +0300
KVM: VMX: Fix KVM_SET_SREGS with big real mode segments
However, this bug does not seem to happen in v3.6.
Joseph Salisbury (jsalisbury) wrote : | #24 |
For completeness, can you test v3.5.4:
http://
Dave Gilbert (ubuntu-treblig) wrote : | #25 |
Yeh, already tested 3.5.4 - it's broken ( Linux major 3.5.4-030504-
There are a heck of a lot of differences in the kvm code between 3.5.x and 3.6.x (about 400 patches from what I can see), so it's a bit of a big job to find it.
Dave
Joseph Salisbury (jsalisbury) wrote : | #26 |
@Albert,
Can you also confirm that v3.6-rc2[0] and v3.6-rc1[1] do not have the bug?
[0] http://
[1] http://
Kate Stewart (kate.stewart) wrote : | #27 |
Removed the tag, since this was already accepted to the series.
tags: | removed: release-q-incoming |
Albert Damen (albrt) wrote : | #28 |
Joseph,
with default module parameters for 3.6, meaning emulate_
However, with emulate_
So I don't think there is currently a fix available for the quantal kernel where emulate_
Joseph Salisbury (jsalisbury) wrote : | #29 |
Upstream has provided some feedback. 3.6 has had a lot more (unbackportable) work in this area. Upstream will see if this can be fixed; if not, they will revert it.
Joseph Salisbury (jsalisbury) wrote : | #30 |
Since an upstream fix is not scheduled to happen soon, a revert of commit b398aa31 will be requested for Quantal.
Changed in linux (Ubuntu Quantal): | |
status: | In Progress → Fix Committed |
Launchpad Janitor (janitor) wrote : | #31 |
This bug was fixed in the package linux - 3.5.0-16.24
---------------
linux (3.5.0-16.24) quantal-proposed; urgency=low
[ Andy Whitcroft ]
* SAUCE: ata_piix: add a disable_driver option
- LP: #994870
[ Christian König ]
* (pre-stable) drm/radeon: make 64bit fences more robust v3 (3.5 stable)
- LP: #1029582
[ David Henningsson ]
* SAUCE: ALSA: hda - use both input paths on Conexant auto parser
- LP: #1037642
* SAUCE: ALSA: hda - fix control names for multiple speaker out on
IDT/STAC
- LP: #1046734
[ Herton Ronaldo Krzesinski ]
* SAUCE: ALSA: hda/via - don't report presence on HPs with no presence
support
- LP: #1052499
* SAUCE: ext4: fix crash when accessing /proc/mounts concurrently
- LP: #1053019
* SAUCE: ALSA: hda/realtek - Fix detection of ALC271X codec
- LP: #1006690
[ Kyle Fazzari ]
* SAUCE: input: Cypress PS/2 Trackpad fix disabling tap-to-click
- LP: #1048816
[ Leann Ogasawara ]
* [Config] Disable CONFIG_DRM_AST
- LP: #1053290
[ Stefan Bader ]
* [Config] Disable the Cirrus QEMU drm driver
- LP: #1038055
[ Upstream Kernel Changes ]
* Revert "KVM: VMX: Fix KVM_SET_SREGS with big real mode segments"
- LP: #1045027
* x86, efi: Handover Protocol
* drm/i915: HDMI - Clear Audio Enable bit for Hot Plug
- LP: #1056729
* UBUNTU SAUCE: apparmor: fix IRQ stack overflow
- LP: #1056078
* drm/nouveau: fix booting with plymouth + dumb support
- LP: #1043518
* ALSA: hda - Add DeviceID for Haswell HDA
- LP: #1057698
* ALSA: hda - add Haswell HDMI codec id
- LP: #1057698
* ALSA: hda - Fix driver type of Haswell controller to AZX_DRIVER_SCH
- LP: #1057698
* ALSA: hda_intel: Add Device IDs for Intel Lynx Point-LP PCH
- LP: #1011438, #1057698
[ Wang Xingchao ]
* SAUCE: ALSA: hda - Add another pci id for Haswell board
- LP: #1057698
[ Wen-chien Jesse Sung ]
* SAUCE: drm/i915: Explicitly disable RC6 for certain models
- LP: #1002170, #1008867
-- Leann Ogasawara <email address hidden> Thu, 27 Sep 2012 13:55:52 -0700
Changed in linux (Ubuntu Quantal): | |
status: | Fix Committed → Fix Released |
Dave Gilbert (ubuntu-treblig) wrote : | #32 |
Looks good:
Linux major 3.5.0-16-generic #24-Ubuntu SMP Thu Sep 27 23:57:26 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Thanks all!
Dave
GONG-YI LIAO (gongyi-liao-gmail) wrote : | #34 |
This bug can be reproduced in Raring (kernel: 3.8.0-6-generic) on a laptop with a Core Duo T2300 processor (32-bit).
Dave Gilbert (ubuntu-treblig) wrote : | #35 |
Hi GONG-Yi,
I've just tested it on the same machine I originally reported this with, and it seems fine to me, so whatever it is looks like a separate variation. Can you submit a new bug stating the versions of Linux, qemu and iPXE you're using and including the log from /var/log/
Please add a comment here stating your new bug number.
GONG-YI LIAO (gongyi-liao-gmail) wrote : | #36 |
I am not sure whether this is caused by the different CPU. I have almost exactly the same setup on another laptop (ThinkPad T61, T7500 CPU, 64-bit Ubuntu Raring), and the KVM guest runs without any problems. With the same setup on the laptop with the T2300 CPU (32-bit Ubuntu Raring), KVM just pauses.
the following are the logs from the 32-bit T2300 laptop:
-------
W: kvm binary is deprecated, please use qemu-system-x86_64 instead
char device redirected to /dev/pts/2
KVM: entry failed, hardware error 0x80000021
If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.
EAX=00000000 EBX=00195e13 ECX=fffff000 EDX=fffff000
ESI=00000000 EDI=00000000 EBP=f71eaf44 ESP=f6c31f90
EIP=c1022487 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =00d8 35d5b000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =00e0 f71ef980 00000018 00409100 DPL=0 DS [--A]
LDT=0000 ffff0000 f0000fff 00f0ff00 DPL=3 CS64 [CRA]
TR =0080 f71ed7c0 0000206b 00008b00 DPL=0 TSS32-busy
GDT= f71e8000 000000ff
IDT= c13ec000 000007ff
CR0=8005003b CR2=ffffffff CR3=0149d000 CR4=000006f0
DR0=00000000000
DR6=00000000fff
EFER=0000000000
Code=ff ff 89 10 c3 8b 15 c8 42 3f c1 8d 84 10 00 c0 ff ff 8b 00 <c3> 8b 15 5c 3c 3f c1 53 89 c3 b8 30 00 00 00 ff 92 9c 00 00 00 3c 13 77 0c a1 e4 3e 42 c1
qemu: terminating on signal 15 from pid 2825
-------
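The value in `hardware error 0x80000021` can be read as a VMX exit reason. On Intel VT-x, bit 31 of that field marks a failed VM entry and the low 16 bits hold the basic exit reason, where 33 (0x21) means the entry failed because of invalid guest state. A small Python sketch of that decoding follows; the function name and the short lookup table are illustrative only and are not taken from QEMU or KVM source.

```python
from typing import Tuple

VM_ENTRY_FAILURE_BIT = 1 << 31

# Only the VM-entry failure reasons are listed; this is not a full table.
BASIC_REASONS = {
    33: "VM-entry failure due to invalid guest state",
    34: "VM-entry failure due to MSR loading",
    41: "VM-entry failure due to machine-check event",
}


def decode_exit_reason(value: int) -> Tuple[bool, int, str]:
    """Split an exit-reason value into (entry_failed, basic_reason, text)."""
    entry_failed = bool(value & VM_ENTRY_FAILURE_BIT)
    basic = value & 0xFFFF
    return entry_failed, basic, BASIC_REASONS.get(basic, "unknown")
```

For the value in these logs, `decode_exit_reason(0x80000021)` reports a failed VM entry with basic reason 33, which matches the "invalid state for Intel VT" wording of the message.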
Dave Gilbert (ubuntu-treblig) wrote : | #37 |
Hi Gong-Yi,
OK, as I say it's probably best to open as a new bug and then add the comment back here to state the bug number you got.
Also, is it happening when doing the same thing - i.e. when starting iPXE?
Sebastien Douche (sdouche) wrote : | #38 |
Reproduced in Quantal with the last kernel (3.5.0-30-generic):
2013-05-23 15:39:27.176+0000: starting up
LC_ALL=C PATH=/usr/
-smp 2,sockets=
emu/windows1.
=0x1.0x2 -drive file=/srv/
rive file=/srv/
tap,fd=
evice isa-serial,
char device redirected to /dev/pts/1
KVM: entry failed, hardware error 0x80000021
If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.
EAX=00000010 EBX=00000080 ECX=00000000 EDX=00000080
ESI=0025da4a EDI=0007da4a EBP=00001f20 ESP=00000200
EIP=0000009b EFL=00000002 [-------] CPL=3 II=0 A20=1 SMM=0 HLT=0
ES =0020 00000200 0000ffff 00009300
CS =b000 002b0000 0000ffff 0000f300
SS =0020 00000200 0000ffff 0000f300
DS =0020 00000200 0000ffff 00009300
FS =0020 00000200 0000ffff 00009300
GS =0020 00000200 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 002b0000 00000027
IDT= 00000000 000003ff
CR0=00000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=00000000000
DR6=00000000fff
EFER=0000000000
Code=02 00 00 ea 91 00 00 00 18 00 0f 20 c0 66 83 e0 fe 0f 22 c0 <66> 31 c0 8e d8 8e c0 8e d0 66 bc 00 04 00 00 8e e0 8e e8 ea 00 00 00 20 00 00 00 20 4a da
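The `CR0` value in each dump on this page shows which CPU mode the guest was in when the entry failed, which helps when comparing reports. A short Python sketch that names the architecturally defined CR0 bits; `decode_cr0` is a hypothetical helper written for this note.

```python
from typing import List

# Architecturally defined CR0 flag bits on x86.
CR0_BITS = {
    0: "PE",   # protection enable
    1: "MP",   # monitor coprocessor
    2: "EM",   # x87 emulation
    3: "TS",   # task switched
    4: "ET",   # extension type
    5: "NE",   # numeric error
    16: "WP",  # write protect
    18: "AM",  # alignment mask
    29: "NW",  # not write-through
    30: "CD",  # cache disable
    31: "PG",  # paging
}


def decode_cr0(value: int) -> List[str]:
    """Return the names of the CR0 flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value & (1 << bit)]
```

With the values from this report, `0x00000011` decodes to PE and ET (protected mode enabled, paging off), `0x00000010` to ET only (real mode), and `0x8005003b` includes PG (protected mode with paging).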
Adam Conrad (adconrad) wrote : Update Released | #39 |
The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Rob (xrctp1) wrote : | #40 |
This bug still exists in 12.04.4
hmm, works when the host is running 3.5.0-12-generic #12-Ubuntu SMP Fri Aug 24 18:28:43
fails with 3.5.0-13-generic #14-Ubuntu SMP Wed Aug 29 16:48:44