PXE netboot not booting localboot from virtio-disk

Bug #551545 reported by Christian Roessner
This bug affects 4 people
Affects             Status         Importance   Assigned to   Milestone
QEMU                Fix Released   Undecided    Unassigned
qemu-kvm (Fedora)   Invalid        Medium
qemu-kvm (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

Binary package hint: qemu-kvm

lsb_release -rd
Description: Ubuntu lucid (development branch)
Release: 10.04

apt-cache policy qemu-kvm
qemu-kvm:
  Installed: 0.12.3+noroms-0ubuntu3
  Candidate: 0.12.3+noroms-0ubuntu3
  Version table:
 *** 0.12.3+noroms-0ubuntu3 0
        500 http://intranet/ubuntu/ lucid/main Packages
        100 /var/lib/dpkg/status

Description of the problem:

Starting a guest like this:

vdekvm \
  -m 256M \
  -cpu host \
  -smp 1 \
  -name karmic \
  -boot order=nc \
  -drive file=/dev/vg01/test,if=virtio,boot=on,cache=none \
  -net nic,vlan=0,macaddr=00:2f:8d:b6:cf:d0,model=virtio \
  -net vde,vlan=0,sock=/var/run/vde2/vde0.ctl \
  -watchdog i6300esb \
  -vnc :0 \
  -serial telnet:localhost:23,server,nowait \
  -monitor tcp:127.0.0.1:12000,server,nowait \
  -runas kvmguest

On "telnet localhost" you can see that the following boot-menu appears:

- Boot Menu -
=============

local
rescue

It is loaded from this pxelinux.cfg/default file:

SERIAL 0 9600n8

DISPLAY boot.txt

TIMEOUT 120
DEFAULT local
PROMPT 1

LABEL local
 localboot 0

LABEL rescue
 kernel lucid
 append initrd=lucid-initrd.gz rescue/enable=true -- quiet console=ttyS0,9600n8

After the timeout, the guest tries to boot, fails, and reloads the boot menu. This loops endlessly until I interrupt it or choose the rescue menu entry.

I would expect it to boot from the first virtio disk.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: qemu-kvm 0.12.3+noroms-0ubuntu3
ProcVersionSignature: Ubuntu 2.6.32-18.27-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-18-generic x86_64
Architecture: amd64
Date: Tue Mar 30 11:40:59 2010
ExecutablePath: /usr/bin/qemu-system-x86_64
MachineType: MICRO-STAR INTERANTIONAL CO.,LTD MS-7368
ProcCmdLine: root=UUID=0d27271c-feaa-40d9-bbbd-baff4ca1d3cc rw init=/bin/bash
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
SourcePackage: qemu-kvm
dmi.bios.date: 10/31/2007
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V1.5B2
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MS-7368
dmi.board.vendor: MICRO-STAR INTERANTIONAL CO.,LTD
dmi.board.version: 1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV1.5B2:bd10/31/2007:svnMICRO-STARINTERANTIONALCO.,LTD:pnMS-7368:pvr1.0:rvnMICRO-STARINTERANTIONALCO.,LTD:rnMS-7368:rvr1.0:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: MS-7368
dmi.product.version: 1.0
dmi.sys.vendor: MICRO-STAR INTERANTIONAL CO.,LTD

James (james-redhat-bugs) wrote:

Description of problem:

All QA automated systems rely on PXE local booting for proper provisioning and testing. All systems are configured in the BIOS to boot PXE first.

When we want to provision the systems, we modify the PXE target (using RHTS or now cobbler).

When we want to boot locally to run tests, we set the default PXE target to "local".

KVM guests do not honor the PXE "local" target. It seems that once you boot PXE, KVM doesn't attach the already installed disks.

Version-Release number of selected component (if applicable):

kernel-2.6.27.5-113.fc10.x86_64
libvirt-0.4.6-3.fc10.x86_64
kvm-74-5.fc10.x86_64

How reproducible:

Every time.

Steps to Reproduce:
1. Set KVM guest PXE target to "Network Boot" using virt-manager
2. Boot the KVM guest.
3. In the PXE menu, type "local"

Actual results:

 * See attached screenshot, xml, and libvirt logfile.

Expected results:

The system should behave as a "real" system behaves and boot the local disk.

Additional info:

 * This makes adding KVM guests into test automation a bit funky since we'll need to do a workaround which involves:

When you want to reprovision a guest:
 1) virsh destroy $GUEST
 2) virsh undefine $GUEST
 3) Edit xml to boot off network
 4) virsh define $XMLFILE
 5) virsh start $GUEST

We'd then need to repeat to have it boot to local disk.
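
The redefine-and-restart workaround above can be sketched as a small shell function. This is only a sketch: the function name and the DRY_RUN switch are made up for illustration, and step 3 (editing the XML boot device) is assumed to have been done already in the file you pass in. By default it only prints the virsh commands.

```shell
#!/bin/sh
# Sketch of the workaround: destroy, undefine, define, start.
# reprovision_guest and DRY_RUN are hypothetical names, not part of virsh.
reprovision_guest() {
    guest=$1
    xmlfile=$2   # XML already edited to the desired boot device (step 3)

    # Default is a dry run that only echoes the commands.
    if [ "${DRY_RUN:-1}" = "1" ]; then run=echo; else run=; fi

    $run virsh destroy "$guest"     # fails harmlessly if the guest is not running
    $run virsh undefine "$guest"
    $run virsh define "$xmlfile"
    $run virsh start "$guest"
}
```

Calling `DRY_RUN=0 reprovision_guest vguest2 vguest2-network.xml` would run the commands for real; a second call with an XML file that boots from `hd` switches the guest back to local disk.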

James (james-redhat-bugs) wrote:

Created attachment 324048
Screenshot

James (james-redhat-bugs) wrote:

Created attachment 324049
Guest XML configuration

James (james-redhat-bugs) wrote:

Created attachment 324050
/var/log/libvirt/qemu/vguest2.log

Michael (michael-redhat-bugs) wrote:

Being able to boot KVM-via-PXE statefully would be highly useful for my testing in Cobbler land as well, and would help with virtual deployment (and re-deployment) of non-Linux guests.

Daniel (daniel-redhat-bugs) wrote:

The XML only specifies a single device for booting. Can you try setting multiple devices:

    <boot dev='network'/>
    <boot dev='cdrom'/>
    <boot dev='hd'/>

This should tell the BIOS to try booting from the network, then the CD-ROM, then the hard disk, in that order.
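
For reference, the qemu command lines logged later in this thread show these `<boot dev=.../>` entries ending up as `-boot` drive letters (`ndc` for network, cdrom, hd). A minimal sketch of that mapping, assuming the letters from the qemu manpage (a floppy, c hard disk, d CD-ROM, n network); `boot_letters` is a made-up helper name, not a libvirt or qemu tool:

```shell
#!/bin/sh
# Translate libvirt boot device names into a qemu -boot letter string.
# boot_letters is a hypothetical helper for illustration only.
boot_letters() {
    letters=
    for dev in "$@"; do
        case $dev in
            fd)      letters="${letters}a" ;;   # floppy
            hd)      letters="${letters}c" ;;   # first hard disk
            cdrom)   letters="${letters}d" ;;   # first CD-ROM
            network) letters="${letters}n" ;;   # network adapter
            *) echo "unknown boot device: $dev" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$letters"
}
```

With this, `boot_letters network cdrom hd` gives the string to pass after `-boot`.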

James (james-redhat-bugs) wrote:

Using ...

  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='network'/>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
  </os>

Results in ...

# cat /var/log/libvirt/qemu/vguest2.log
/usr/bin/qemu-kvm -S -M pc -m 1024 -smp 2 -name vguest2 -monitor pty -boot ndc -drive file=/dev/VolGroup00/vguest2,if=virtio,index=0,boot=on -net nic,macaddr=54:52:00:29:89:e5,vlan=0,model=virtio -net tap,fd=16,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb -vnc 127.0.0.1:1 -k en-us
char device redirected to /dev/pts/3
char device redirected to /dev/pts/4
Too many option ROMS

Am I doing that right?

Cole (cole-redhat-bugs) wrote:

Wow! I didn't know you could specify multiple boot devs. Using

    <boot dev='network'/>
    <boot dev='hd'/>

And then pressing 'q' to not boot from networking successfully boots from disk. James, try just the above and see if it does the job for you.

Michael (michael-redhat-bugs) wrote:

Cole, what we are looking for is when the bootloader is fed the following PXE configuration it should boot from the local disk:

DEFAULT local
PROMPT 0
TIMEOUT 0
TOTALTIMEOUT 0
ONTIMEOUT local

LABEL local
        LOCALBOOT 0

This will enable us to create a KVM "empty shell" that we can assign what OS it is running just based on changing the PXE configuration.

Pressing "q" would be interactive and less useful -- you'd have to catch it really really quickly or you'd be reinstalling.
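
A sketch of how such a per-guest switch could be scripted on the PXE server, assuming PXELINUX's usual lookup of a per-host file named `01-` plus the MAC address in lowercase with dashes under `pxelinux.cfg/`. Both function names are made up for illustration, and the config text is the one quoted above.

```shell
#!/bin/sh
# pxe_cfg_name and local_boot_cfg are hypothetical helper names.

# Print the per-host PXELINUX config file name for a MAC address.
pxe_cfg_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# Print the "always boot from local disk" configuration.
local_boot_cfg() {
    cat <<'EOF'
DEFAULT local
PROMPT 0
TIMEOUT 0
TOTALTIMEOUT 0
ONTIMEOUT local

LABEL local
        LOCALBOOT 0
EOF
}
```

Usage would look like `local_boot_cfg > "$TFTPROOT/pxelinux.cfg/$(pxe_cfg_name 54:52:00:29:89:e5)"`, where `TFTPROOT` is whatever directory your TFTP server exports. Whether the guest then actually boots from disk is exactly what this bug is about.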

James (james-redhat-bugs) wrote:

(In reply to comment #7)
> Wow! I didn't know you could specify multiple boot devs. Using
>
> <boot dev='network'/>
> <boot dev='hd'/>
>
> James, try just the above and see if it does the job for you.

With those options in my XML ... my guest fails to start.

# virsh dumpxml vguest2 | grep -C2 "<boot"
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='network'/>
    <boot dev='hd'/>
  </os>
  <features>

# virsh start vguest2
libvir: QEMU error : internal error QEMU quit during monitor startup
error: Failed to start domain vguest2

# tail /var/log/libvirt/qemu/vguest2.log
/usr/bin/qemu-kvm -S -M pc -m 1024 -smp 2 -name vguest2 -monitor pty -boot nc -drive file=/dev/VolGroup00/vguest2,if=virtio,index=0,boot=on -net nic,macaddr=54:52:00:29:89:e5,vlan=0,model=virtio -net tap,fd=12,script=,vlan=0,ifname=vnet0 -serial pty -parallel none -usb -vnc 127.0.0.1:1 -k en-us
char device redirected to /dev/pts/3
char device redirected to /dev/pts/4
Too many option ROMS

What am I missing?

Cole (cole-redhat-bugs) wrote:

jlaska: hmm, works on F9. sounds like a bug.

mdehaan: you may just have to test it and see what happens. I let the guest boot to our pxe server which doesn't seem to have an explicit 'local' option. Hitting enter without a selection seems to imply local, but qemu then prompts for the boot from (n)etwork or (q)uit.

Maybe qemu is smart enough to notice a 'boot from local' directive from the PXE server, and won't prompt. You'll just have to test it since I'm not sure how to go about it.

Michael (michael-redhat-bugs) wrote:

Cole, that's what james was trying to do above when he filed the bug, and I watched it happen.

"""
KVM guests do not honor the PXE "local" target. It seems that once you boot
PXE, KVM doesn't attach the already installed disks.
"""

What specifically should I test?

Cole (cole-redhat-bugs) wrote:

I just wasn't sure if:

not entering a selection on my pxe server & pressing enter == deliberately selecting 'boot from local' on another pxe server == having the pxe server tell the machine/VM 'hey, boot from local' (which is what I understand RHTS does).

If those are all equivalent, then it sounds like qemu needs fixing to not prompt based on the pxe request.

James (james-redhat-bugs) wrote:

My take on this bug is that the F10 kvm/libvirt doesn't let me specify multiple <boot> options. If that were fixed, I suspect it would open the door for PXE "local" booting.

Daniel (daniel-redhat-bugs) wrote:

Yes, this is a bug in KVM. The trouble is that the new -drive flag and its boot=on syntax are broken with respect to the normal -boot argument. We need to use boot=on for VirtIO-based disks, but when we do that, it conflicts with the option ROM for PXE boot. This is a big mess and I'm not sure how to fix it, but it certainly needs addressing somehow, because this is a valid use case.

Bug (bug-redhat-bugs) wrote:

This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Glauber (glauber-redhat-bugs) wrote:

James,

Do you still have this problem if you switch from virtio to e1000?

You should use this XML excerpt:
    <boot dev='network'/>
    <boot dev='hd'/>

James (james-redhat-bugs) wrote:

Created attachment 324720
vguest1.xml (w/ multiple <boot> and dev="virtio")

Glauber,

Yeah, I still seem to have this problem using virtio.

# virsh start vguest1
libvir: QEMU error : internal error QEMU quit during monitor startup
error: Failed to start domain vguest1

# cat /var/log/libvirt/qemu/vguest1.log
/usr/bin/qemu-kvm -S -M pc -m 1024 -smp 2 -name vguest1 -monitor pty -boot nc -drive file=/dev/VolGroup00/vguest1,if=ide,index=0,boot=on -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:55:c8:17,vlan=0,model=virtio -net tap,fd=14,script=,vlan=0,ifname=vnet2 -serial pty -parallel none -usb -vnc 127.0.0.1:3 -k en-us
char device redirected to /dev/pts/8
char device redirected to /dev/pts/9
Too many option ROMS

# virsh dumpxml vguest1
 <!-- see attachment -->

James (james-redhat-bugs) wrote:

Created attachment 324721
vguest1.xml (w/ multiple <boot> and dev="e1000")

Now with dev="e1000"

# virsh start vguest1
libvir: QEMU error : internal error QEMU quit during monitor startup
error: Failed to start domain vguest1

# cat /var/log/libvirt/qemu/vguest1.log
/usr/bin/qemu-kvm -S -M pc -m 1024 -smp 2 -name vguest1 -monitor pty -boot nc -drive file=/dev/VolGroup00/vguest1,if=ide,index=0,boot=on -drive file=,if=ide,media=cdrom,index=2 -net nic,macaddr=54:52:00:55:c8:17,vlan=0,model=e1000 -net tap,fd=19,script=,vlan=0,ifname=vnet2 -serial pty -parallel none -usb -vnc 127.0.0.1:3 -k en-us
char device redirected to /dev/pts/8
char device redirected to /dev/pts/9
Too many option ROMS

Glauber (glauber-redhat-bugs) wrote:

I believe the problem itself is very simple (although I don't know a good solution without thinking about it a little).

There's only 64k of memory available for option ROMs, and the virtio ROM that ships with our packages is... 64k in size! So after loading the virtio PXE option ROM, we're unable to load any further option ROMs, in particular the extboot option ROM we need to kick off virtio boots. ;-(

James said he could boot with an older ROM I handed to him, which is 32k in size, and the "Too many option ROMS" problem went away.

However, he was still unable to boot from the local target, despite the fact that he could do a local boot by pressing "q".

So we really have two problems here:

The first one is that we cannot boot from our current virtio ROM, because it is too large. We can try to quick-fix it by building smaller images. This should be a new BZ against the etherboot package.

The other is that the ROMs do not honor the local target. For that, I believe we can keep using this BZ.
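
To illustrate the size arithmetic, here is a rough model that assumes option ROMs only need to fit into the 64k window one after another. It ignores any alignment or padding qemu may apply. `roms_fit` is a made-up helper, and the extboot size in the usage line is a placeholder, since this report does not state it.

```shell
#!/bin/sh
# roms_fit BUDGET SIZE...
# Succeeds if the given ROM sizes (in bytes) sum to at most BUDGET.
# Simplified model for illustration; not how qemu itself is implemented.
roms_fit() {
    budget=$1
    shift
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    [ "$total" -le "$budget" ]
}
```

For example, `roms_fit 65536 65536 2048 || echo "Too many option ROMS"` models the 64k virtio ROM plus a hypothetical 2k extboot ROM and reports the overflow, while the same call with 32768 for the older ROM succeeds.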

James (james-redhat-bugs) wrote:

(In reply to comment #19)
> So we really have two problems in here:
>
> The first one is that we cannot boot from our current virtio ROM, because it is
> too large. We can try to quick fix it by building smaller images. This should
> be a new BZ against the etherboot package.

Filed this as bug#473137

Mark (mark-redhat-bugs) wrote:

Apparently this is still a problem with gPXE:

http://www.redhat.com/archives/fedora-virt/2009-October/msg00052.html

Glauber - please take a look

Fedora (fedora-redhat-bugs) wrote:

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

Bug (bug-redhat-bugs) wrote:

This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Christian Roessner (christian-roessner-net) wrote:

Mathias Gug (mathiaz) wrote:

According to your command line:

-boot order=nc \

I don't think that this includes a local hard disk in the list of devices to be considered for booting.

Changed in qemu-kvm (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium

Christian Roessner (christian-roessner-net) wrote:

Directly from the manpage:

-boot [order=drives][,once=drives][,menu=on|off]
           Specify boot order drives as a string of drive letters. Valid drive letters depend on the target architecture. The x86 PC uses: a, b (floppy 1 and
           2), c (first hard disk), d (first CD-ROM), n-p (Etherboot from network adapter 1-4), hard disk boot is the default. To apply a particular boot
           order only on the first startup, specify it via once.

           Interactive boot menus/prompts can be enabled via menu=on as far as firmware/BIOS supports them. The default is non-interactive boot.

                   # try to boot from network first, then from hard disk
                   qemu -boot order=nc

So this should work? I don't know. It does not work even without specifying if=virtio.

Christian Roessner (christian-roessner-net) wrote:

By the way: PXELinux ignores the timeout if prompt is set. This seems to be a second bug (it worked on karmic).

Dustin Kirkland (kirkland) wrote:

Also, vde networking will not work with Lucid's kvm. To use vde
networking, we'd need to build qemu-kvm with libvde2, which we cannot
do because it's in Universe.

Please consider using one of the other more secure, officially
supported networking models:
 * https://help.ubuntu.com/community/KVM/Networking

Christian Roessner (christian-roessner-net) wrote:

VDE is great. I have used it for many months and have NEVER had any problems. There is no better solution than VDE, and I do not understand why you do not put it into the main repo. Saying "insecure" is not a good answer without saying where.

So does this mean the vdekvm wrapper will be dropped in Lucid? That would break all servers that use it!

Jan (jan-redhat-bugs) wrote:

Still a problem on Fedora 13 final + updates testing. Any chance to fix this?

Jan (jan-redhat-bugs) wrote:

I have had some success booting via PXE manually. Maybe the default timeout for the DHCP request is too short. Try this:

1. start the virtual machine
2. when you are prompted to press CTRL-B, do it
3. try to get a DHCP address by running this command: dhcp net0
4. repeat step 3 until you get an address (reply "ok")
5. boot using the command: autoboot

If you run the "dhcp net0" command immediately, it fails the first time, but the second run gets an IP address. Then I am able to boot from PXE.

Jan (jan-redhat-bugs) wrote:

I think local boot works well on current Fedora 13 stable. Do you still have this problem?

But another problem described here (the timeout when booting from PXE) is still present. Should I open a new bug for this? It looks like it's enough to increase the PXE network timeout by approx. 3 seconds. The simplest workaround is to select "Send Key -> Ctrl-Alt-Del" from the menu immediately (or within 1-3 seconds) after the guest starts.

Reinhard Tartler (siretart) wrote:

This bug has nothing to do with VDE. It seems that Ubuntu's current version of libvirt/kvm-qemu does not implement boot ordering correctly.

This seems to be present in Fedora as well: https://bugzilla.redhat.com/show_bug.cgi?id=472236

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Confirmed

Michael (michael-redhat-bugs) wrote:

I'm still having this dhcp timeout issue on f13.

Opened https://bugzilla.redhat.com/show_bug.cgi?id=638735 to track it.

Serge Hallyn (serge-hallyn) wrote:

Hi,

could you test whether you still have this problem with lucid-proposed?

Changed in qemu-kvm (Ubuntu):
status: Confirmed → Incomplete

Reinhard Tartler (siretart) wrote:

lucid-updates and lucid-proposed ship the same package, and from the changelog I cannot see which change would be related to this bug.

I've just confirmed by testing that the bug still applies to the most up-to-date packages available for lucid.

Changed in qemu-kvm (Ubuntu):
status: Incomplete → Triaged

Bug (bug-redhat-bugs) wrote:

This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 13 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Bug (bug-redhat-bugs) wrote:

Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Shawn (shawn-redhat-bugs) wrote:

Reopen, bump to rawhide, I haven't been able to test this recently.

Andreas Ntaflos (daff) wrote:

Still a problem in Lucid, making automatic installation and deployment of VMs (using Cobbler or Foreman) pretty much impossible without manual intervention. This, of course, defeats the whole point of automatic installation and deployment.

Fedora (fedora-redhat-bugs) wrote:

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

Cole (cole-redhat-bugs) wrote:

With virt-manager on F17 this works for me. You just need to make sure that both the network and hard drive boot options are selected; otherwise the disks aren't marked as bootable and things probably won't work.

Closing as WORKSFORME, please reopen if anyone still has issues on F17+

Renich (renich-redhat-bugs) wrote:

Created attachment 600144
no prompt

Renich (renich-redhat-bugs) wrote:

It seems it's not even prompting for iPXE now. I think something got hardcoded into the ROM by accident.

Can somebody verify?

Cole (cole-redhat-bugs) wrote:

Renich, given how old and long this bug report is, let's keep it closed. If you are still experiencing a similar issue, please open a new bug report with the following info:

Fedora version
qemu version
qemu command line (if using libvirt, /var/log/libvirt/qemu/$vmname.log)

At least on F17, PXE and boot from local is working fine for me.

Ryan Harper (raharper) wrote:

This issue should be fixed in the qemu-kvm version included in precise.

Changed in qemu-kvm (Ubuntu):
status: Triaged → Fix Released

Thomas Huth (th-huth) wrote:

Since it has been fixed in Precise ... I assume this has also been fixed in upstream QEMU? Or is there still anything left to do here?

Changed in qemu:
status: New → Incomplete

Thomas Huth (th-huth) wrote:

There hasn't been a reply to my question in the last comment within months, so I assume this has been fixed in upstream, too. Closing this ticket now...

Changed in qemu:
status: Incomplete → Fix Released
Changed in qemu-kvm (Fedora):
importance: Unknown → Medium
status: Unknown → Invalid