grub hangs at early booting after handoff from PXE

Bug #625383 reported by Mathieu Trudel-Lapierre on 2010-08-27
This bug affects 4 people
Affects                     Status        Importance  Assigned to                 Milestone
Release Notes for Ubuntu    Fix Released  Undecided   Colin Watson
syslinux (Ubuntu)           Won't Fix     High        Colin Watson
syslinux (Ubuntu Maverick)  Won't Fix     High        Canonical Foundations Team

Bug Description

Binary package hint: grub2

On three different Dell Inspiron 1545 (and 1546) systems, grub hangs right after hand-off from PXE.

I've tried 'localboot 0', 'localboot 0x80', and 'localboot -1'. -1 seems to work but only on the Inspiron 1546.

'localboot 5' was also attempted in case the issue was with how the PXE stack was getting unloaded, but it doesn't seem to change behavior at all.

When the systems are booted directly from the hard drive (either by using F12 to select the hard drive, or by escaping out of PXE before it boots off the network), grub loads normally and hands off to the kernel without a hitch.

Installing grub with --debug-image=all yields no debugging information whatsoever if booting using PXE. Behavior is normal if booted straight to the hard drive (debug information is shown).

Ok, apparently -1 works on the Inspiron 1545 as well, but this solution is not usable as it makes other machines (e.g. Toshiba systems) fail to boot in the same way described above.

Ameet Paranjape (ameetp) wrote :

Mathieu,

Are you able to provide a screenshot or a log of the hang from a remote console?

Changed in grub2 (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Hankyone (hankyone) wrote :

This system has grub debug image installed, yet nothing shows up.

Ameet Paranjape (ameetp) wrote :

Foundations team,

Please update the bug with any other suggestions to pull debug logs. Thanks.

Changed in grub2 (Ubuntu):
status: Incomplete → Triaged
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Colin Watson (cjwatson) wrote :

It seems unlikely to me that this is a GRUB bug. The evidence suggests that GRUB is simply not being loaded at all.

Here is the Syslinux documentation on LOCALBOOT, and note that the permissible values for PXELINUX are *not* the same as for ISOLINUX:

    LOCALBOOT type [ISOLINUX, PXELINUX]
        On PXELINUX, specifying "LOCALBOOT 0" instead of a "KERNEL"
        option means invoking this particular label will cause a local
        disk boot instead of booting a kernel.

        The argument 0 means perform a normal boot. The argument 4
        will perform a local boot with the Universal Network Driver
        Interface (UNDI) driver still resident in memory. Finally,
        the argument 5 will perform a local boot with the entire PXE
        stack, including the UNDI driver, still resident in memory.
        All other values are undefined. If you don't know what the
        UNDI or PXE stacks are, don't worry -- you don't want them,
        just specify 0.

        On ISOLINUX, the "type" specifies the local drive number to
        boot from; 0x00 is the primary floppy drive and 0x80 is the
        primary hard drive. The special value -1 causes ISOLINUX to
        report failure to the BIOS, which, on recent BIOSes, should
        mean that the next boot device in the boot sequence should be
        activated.

In addition, the Syslinux 4.00 changelog says that -1 is also supported for PXELINUX. However, 0x80 is out of spec for PXELINUX.

If the BIOS doesn't do what you want for any of these defined values, then it is not at all clear to me that there is anything that we can do about it. I don't know what to suggest.
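For reference, a minimal sketch of a PXELINUX entry that uses the documented argument. The file path, label name, and PROMPT setting are illustrative assumptions, not taken from the reporter's setup:

```text
# pxelinux.cfg/default (hypothetical example)
# LOCALBOOT arguments defined for PXELINUX:
#   0  = normal local boot
#   4  = local boot with the UNDI driver still resident
#   5  = local boot with the entire PXE stack still resident
#   -1 = also supported, according to the Syslinux 4.00 changelog
# 0x80 is an ISOLINUX drive number and is out of spec for PXELINUX.
DEFAULT local
PROMPT 0

LABEL local
  LOCALBOOT 0
```

Whether the BIOS then continues to the hard drive is up to the firmware, which is the part that fails on the affected machines.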

affects: grub2 (Ubuntu Maverick) → syslinux (Ubuntu Maverick)
Ameet Paranjape (ameetp) wrote :

Colin,

Spoke to Anouar and he agrees that it doesn't appear we can do anything about this.

I'm going to mark this bug as a Won't Fix for now.

Changed in syslinux (Ubuntu Maverick):
status: Triaged → Won't Fix

Mathieu Trudel-Lapierre wrote :

Ameet, Colin,

Just to clarify things (though I agree there is sadly little that can be done anyway): I marked this bug as affecting grub because, if I recall correctly, using Shift would still sometimes yield the GRUB menu. We'll re-test this in the lab first though.

Ameet Paranjape (ameetp) wrote :

@Mathieu,

Any update on your tests?

Mathieu Trudel-Lapierre wrote :

Yes, it doesn't even get as far as accepting Shift, so it's as mentioned by Colin: this affects syslinux, not grub.

Dave Walker (davewalker) wrote :

Syslinux seems to have made a few undocumented changes in Maverick. Another example is bug #610017.

Colin Watson (cjwatson) wrote :

Dave: that doesn't seem obviously relevant. We did catch up on quite a few upstream versions.

Ameet Paranjape (ameetp) on 2010-10-04
Changed in ubuntu-release-notes:
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Colin Watson (cjwatson) wrote :

We fixed this in the cert lab by using chain.c32 rather than LOCALBOOT. As far as I know, this problem was due to a PXE BIOS bug rather than a syslinux change, although it's possible that an otherwise innocuous syslinux change managed to tickle some previously undisturbed PXE BIOS bugs.
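The lab's actual configuration is not shown in this report, but a minimal sketch of such an entry might look like the following. It assumes chain.c32 has been copied into the TFTP root next to pxelinux.0 and that the target is the MBR of the first hard disk; the file path and label name are hypothetical:

```text
# pxelinux.cfg/default (hypothetical example)
# chain.c32 loads and runs the boot sector itself rather than
# asking the PXE BIOS to fall through to the next boot device.
DEFAULT local
PROMPT 0

LABEL local
  KERNEL chain.c32
  APPEND hd0 0
```

Here `hd0` selects the first hard disk and the trailing `0` selects its MBR rather than a partition boot sector; machines that boot from a different disk would need a different argument.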

Changed in syslinux (Ubuntu):
assignee: Canonical Foundations Team (canonical-foundations) → Colin Watson (cjwatson)
status: Triaged → Won't Fix
Colin Watson (cjwatson) wrote :

Release-noted, insofar as it's possible:

 * Some users booting machines from the network using PXE, and using the `LOCALBOOT` facility in Syslinux to hand off to a local hard disk, found that there was no argument that would successfully cause a local boot. Syslinux provides a `chain.c32` COM32 image which is less reliant on PXE BIOS implementation details. (Bug:625383)

Changed in ubuntu-release-notes:
assignee: Canonical Foundations Team (canonical-foundations) → Colin Watson (cjwatson)
status: New → Fix Released