os-prober misparses GRUB 2.00 configuration, leading to arbitrarily-long grub.cfg output lines

Bug #1050774 reported by dino99
This bug affects 6 people
Affects Status Importance Assigned to Milestone
grub2 (Ubuntu)
Invalid
High
Unassigned
os-prober (Ubuntu)
Fix Released
High
Colin Watson

Bug Description

Lubuntu Quantal i386

Yesterday I tested grub 2.0 from the cjwatson PPA without issue. On the next boot I ran vbeinfo to fill in custom values: GRUB_GFXMODE=1024x768,800x600

Today I upgraded to the new 2.00-3ubuntu1 and got this problem. Note that Ubuntu Quantal i386 was previously upgraded to this latest grub 2.0 without that issue.

ProblemType: Crash
DistroRelease: Ubuntu 12.10
Package: grub-common 2.00-3ubuntu1
Uname: Linux 3.6.0-030600rc5-generic i686
NonfreeKernelModules: nvidia
ApportVersion: 2.5.1-0ubuntu7
Architecture: i386
AssertionMessage: *** glibc detected *** /usr/bin/grub-script-check: double free or corruption (!prev): 0x08210b38 ***
Date: Fri Sep 14 09:06:51 2012
ExecutablePath: /usr/bin/grub-script-check
ProcCmdline: /usr/bin/grub-script-check /boot/grub/grub.cfg.new
Signal: 6
SourcePackage: grub2
StacktraceTop:
 __libc_message (do_abort=2, fmt=0xb77271dc "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
 malloc_printerr (action=<optimized out>, str=<optimized out>, ptr=0x8210b38) at malloc.c:5007
 ?? ()
 ?? ()
 ?? ()
Title: grub-script-check assert failure: *** glibc detected *** /usr/bin/grub-script-check: double free or corruption (!prev): 0x08210b38 ***
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

Revision history for this message
dino99 (9d9) wrote :
visibility: private → public
Revision history for this message
dino99 (9d9) wrote :

grub.cfg.new

Revision history for this message
Apport retracing service (apport) wrote :

StacktraceTop:
 __libc_message (do_abort=2, fmt=0xb77271dc "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
 malloc_printerr (action=<optimized out>, str=<optimized out>, ptr=0x8210b38) at malloc.c:5007
 ?? ()
 ?? ()
 ?? ()

Revision history for this message
Apport retracing service (apport) wrote : Stacktrace.txt
Revision history for this message
Apport retracing service (apport) wrote : StacktraceSource.txt
Revision history for this message
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in grub2 (Ubuntu):
status: New → Invalid
Revision history for this message
Apport retracing service (apport) wrote : Crash report cannot be processed

Thank you for your report!

However, processing it in order to get sufficient information for the
developers failed (it does not generate a useful symbolic stack trace). This
might be caused by some outdated packages which were installed on your system
at the time of the report:

outdated debug symbol package for initscripts: package version 2.88dsf-13.10ubuntu13 dbgsym version 2.88dsf-13.10ubuntu11.1

Please upgrade your system to the latest package versions. If you still
encounter the crash, please file a new report.

Thank you for your understanding, and sorry for the inconvenience!

tags: removed: need-i386-retrace
Revision history for this message
Colin Watson (cjwatson) wrote : Re: grub-script-check assert failure: *** glibc detected *** /usr/bin/grub-script-check: double free or corruption (!prev): 0x08210b38 ***

I have reproduced this locally, although I'm not sure whether I'll be able to fix it before Monday at this point.

Changed in grub2 (Ubuntu):
importance: Undecided → High
status: Invalid → Triaged
assignee: nobody → Colin Watson (cjwatson)
Revision history for this message
Colin Watson (cjwatson) wrote :

This is actually a bug somewhere in the grub-mkconfig stack. There's a series of increasingly-implausible lines of the form:

        menuentry 'Ubuntu'\\''\'\\'''\\'' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option '\\''\'\\'''\\''gnulinux-simple-aa1bddfe-1ddb-4832-bee9-eb31c56faefa (on /dev/sdc4)'\\'' --class gnu-linux --class gnu --class os $menuentry_id_option '\\''osprober-gnulinux-/boot/vmlinuz-3.6.0-030600rc5-generic--aa1bddfe-1ddb-4832-bee9-eb31c56faefa (on /dev/sdc5)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-3.6.0-030600rc5-generic--00c5de83-479c-4ab0-9b54-9af0a727175e' {

... and eventually these get long enough to overflow grub-script-check's buffer. The buffer size is sufficiently large that I can't really imagine it being hit intentionally, so I'd rather fix whatever's generating this madness.
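
A minimal Python sketch of how lines like this can keep growing. It assumes (this is not established at this point in the report) that each regeneration pass re-applies shell-style single-quote escaping to text that was already escaped. `shell_quote` and `requote_repeatedly` are hypothetical helpers written for illustration; they are not grub-mkconfig or os-prober code.

```python
from typing import List


def shell_quote(text: str) -> str:
    """Wrap text in single quotes; embedded single quotes become '\\''."""
    return "'" + text.replace("'", "'\\''") + "'"


def requote_repeatedly(title: str, passes: int) -> List[int]:
    """Quote an already-quoted title again and again, recording its length."""
    lengths = []
    current = title
    for _ in range(passes):
        # Assumption: each pass treats the previous output as plain text.
        current = shell_quote(current)
        lengths.append(len(current))
    return lengths


if __name__ == "__main__":
    for number, length in enumerate(requote_repeatedly("Ubuntu", 6), start=1):
        print(f"pass {number}: {length} characters")
```

Every pass multiplies the number of quote characters, so the length grows faster than linearly; a handful of passes is enough to produce very long lines.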

Revision history for this message
Colin Watson (cjwatson) wrote :

Could you please attach a tarball of your /etc/grub.d/ directory? Your grub.cfg indicates that you have several custom files in there, and I'd like to make sure I'm investigating the code you're actually running.

Revision history for this message
dino99 (9d9) wrote :

Hi Colin,
I'm happy you are taking this issue into consideration. Things are getting worse on this desktop, which mixes 2 PATA disks and 1 SATA disk holding Quantal i386 Ubuntu and Quantal i386 Lubuntu, plus 2 Precise installs with their different grub versions.

Anyway, now I can't boot at all:
- I wonder why grub install says that I absolutely need to set only the sdc hdd to boot from the BIOS, even though each hdd has grub installed on its MBR.
- No matter which hdd I select to boot from, I only get "alloc magic is broken at ...", so I've run an updated boot-repair script several times to try to fix it, but no luck.
- So I wonder whether the vbeinfo values in GRUB_GFXMODE are filled in correctly (according to the wiki it's OK to set more than one value if they are separated by commas). I've chrooted to test other values: 1024x768 alone, and also 1024x768x32 (valid vbeinfo values).

I will post the grub.d tarball tomorrow, when I can use a Quantal live USB.

Revision history for this message
dino99 (9d9) wrote :

I'm back for more comments:

While trying to get around that issue I have sadly found more new issues than solutions. In particular, a broken video process (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1019415/comments/5) that did not exist previously, so it might be a regression introduced by some upgrades (all genuine).

But now I get an issue I have never seen or heard of before. Following the chroot howto (https://help.ubuntu.com/community/Grub2/Installing#ChRoot) and then trying to reinstall grub as indicated, I got a huge surprise:

The grub installer prints the error:

/usr/sbin/grub-bios-setup: warning: file system 'ext2' doesn't support embedding. .....

I can't believe that an ext4 partition is now recognised as ext2, but that would explain all the errors posted in the duplicate reports.

So now we have to find which upgrade has broken the partition(s) (I still need to check the other partitions, but I suppose I will find the same cause).

Revision history for this message
dino99 (9d9) wrote :

Latest news:

- see the attached grub.d tarball, as requested

- all the hdds tested without errors (fsck)
- still get "alloc magic is broken at ..." before seeing the grub menu

Strangely, in contrast to the grub-install error reported in post #12 above, updating and then upgrading to the latest grub packages went through without trouble. Note also that gparted confirms that none of the partitions are ext2; they are ext3/ext4 as expected (curious).

As one of the PATA hdds is only used for backup and has only some NTFS partitions (plugged in as primary master), I tried booting without that hdd (removed from the PC), and surprise, the PC boots normally: no more "alloc magic is broken". What is strange is that these issues only started on Sept 14, meaning the latest package upgrades have introduced a regression (now we need to find which one(s)).
So now I have the other PATA set as 1st master, the SATA as 3rd master, and the cdrom as 1st slave.

Revision history for this message
dino99 (9d9) wrote :

Well, that issue is back again:

I was able to boot only once (see post above), then updated and upgraded Quantal i386, and finally logged out and shut down.
On the next cold boot I again get "alloc magic is broken at ..." with each boot partition and the live USB stick.

It also affects at least one more user: http://ubuntuforums.org/showthread.php?p=12244024#post12244024

Revision history for this message
arpanaut (johnmac) wrote :

https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1051624
this bug is quite possibly related and may provide additional information to help sort this out.

Re: post #9 here, "There's a series of increasingly-implausible lines of the form:"
Yes, I am seeing this output also.

Revision history for this message
dino99 (9d9) wrote :

Sorry, wrong URL above (but grub related): http://ubuntuforums.org/showthread.php?t=2058884

Revision history for this message
dino99 (9d9) wrote :

My bad, I should have noted that this new breakage followed a:

sudo dpkg-reconfigure grub-pc

(as one hdd had been removed, but sadly that broke things again (I should think twice before doing such stupid things))

Maybe the best thing is to update to a new grub2.0.is.1.99.21 or so until there is a real fix.

Revision history for this message
Stephen Cradock (s-cradock) wrote :

A bit more information that may help - I have similar problems to some extent.

For me they came as increasingly bizarre grub.cfg entries such as Colin quoted, and ended with "alloc magic is broken at..."
At that point I could not get the list of bootable installs, so not able to boot at all.

I booted from a Precise live CD and tried moving grub.cfg files from other places, but no luck. I tried adding a custom_menu cfg file, but the same error showed up and I never got to see the custom menu. Eventually I chrooted into an old Mint 12 install and got it set up as the grub master. I found that running update-grub failed to bring the grub.cfg entries up to date, so had to fix them by hand. All sorts of bad info - kernel versions, UUIDs, had to be edited.

Revision history for this message
dino99 (9d9) wrote :

i'm booted again with Precise:

- finally found that only my SATA hdd can boot, with the appropriate BIOS settings (not the ones previously used): only AHCI works
- but the PATA hdd still refuses to boot and I get "alloc magic is broken at ...". I've tested that disk and there is no error or warning (fsck & SMART)
- that is curious, as some users with PATA hdds have no problem, so it might be all about chipset differences. Could it be related to RAM? The RAM test passed here.

So now I will purge grub 2.0 everywhere and reinstall only the Precise 1.99~21ubuntu3.1 for now. I'm tired of the grub design for multi-boot setups: you simply can't boot if grub is broken on the master partition (here too a grub failsafe, i.e. a second working partition to fall back on, would help a lot).

Revision history for this message
Colin Watson (cjwatson) wrote :

I would really appreciate it if we could keep bugs to a single problem each. The more you all follow up to this with multiple different problems, the harder it becomes for me to keep track, and the lower the probability of me being able to keep track well enough to fix anything.

I'll answer a couple of specific points here, but I won't follow up on them further; if you want to record these as things we need to fix, please file separate bugs.

  /usr/sbin/grub-bios-setup: warning: file system 'ext2' doesn't support embedding.

Unlike Linux, GRUB has a single filesystem driver for all of ext2, ext3, and ext4, which is simply called "ext2". The above message does NOT indicate that it has incorrectly recognised ext4 as ext2. It's just its name for its driver for all such filesystems.

The reports of alloc magic being broken might be a consequence of the overly-long menu entry strings here, but they might also be a consequence of confusion induced by different versions of GRUB in your multiple-disk setup. This is similar to the basic problem outlined here:

  http://www.chiark.greenend.org.uk/ucgi/~cjwatson/blosxom/debian/2010-06-21-grub2-boot-problems.html

In short, if the core image installed on the MBR or a partition boot record does not match the modules in /boot/grub, you're going to have problems. Try to avoid setups that could lead to this. Running 'sudo dpkg-reconfigure grub-pc' and selecting the MBRs of all non-removable disks can help, but in general if you have multi-disk and multi-OS setups then you need to be very clear about which one owns the boot process. Based on my experience with many such upgrades, there is a good chance that this is not a regression from 1.99 at all; it's simply been exposed by any non-trivial change to the version of GRUB.

From here on in, I am not going to address these problems any further in this bug; I'm focusing solely on the incorrect menu entry generation, which is the same as arpanaut's report in bug 1051624.

Revision history for this message
dino99 (9d9) wrote :

Thanks Colin for the explanations; I agree to keep this bug to the single menu entry generation problem.

Revision history for this message
Colin Watson (cjwatson) wrote :

The problems here appear to stem from the terrible parsing in os-prober's grub2 handler. Fixing this is complex, but should be doable. However, I will need to think about some other practicalities as well. Since we may well need to cope with current grub.cfg being parsed by old os-prober on other systems, it may be necessary to contort the syntax a little so that it doesn't confuse older versions of os-prober quite so much.
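
As a rough illustration of this class of parsing problem, the following Python sketch contrasts a greedy quote match with a quote-aware tokeniser on a GRUB 2.00-style `menuentry` line. It is not os-prober's code (os-prober is a shell script, and its old parsing logic is not shown in this report); the greedy regular expression, the sample line, and both function names are assumptions made for the example. `shlex` follows POSIX shell quoting, which only approximates GRUB's script syntax.

```python
import re
import shlex

# Hypothetical sample in the style of a GRUB 2.00 menuentry line.
SAMPLE = ("menuentry 'Ubuntu' --class ubuntu --class gnu-linux "
          "$menuentry_id_option 'gnulinux-simple-example-id' {")


def naive_title(line: str) -> str:
    """Greedy match: everything between the first and the last single quote."""
    match = re.search(r"'(.*)'", line)
    return match.group(1) if match else ""


def quote_aware_title(line: str) -> str:
    """Split into shell-like words and take the one after 'menuentry'."""
    tokens = shlex.split(line)
    if len(tokens) >= 2 and tokens[0] == "menuentry":
        return tokens[1]
    return ""


if __name__ == "__main__":
    print("naive:      ", naive_title(SAMPLE))
    print("quote-aware:", quote_aware_title(SAMPLE))
```

With two quoted strings on the line, the greedy version swallows the options between them into the "title", which is the kind of confusion that feeds back into the generated grub.cfg.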

Changed in os-prober (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Colin Watson (cjwatson)
Revision history for this message
dino99 (9d9) wrote :

Right now I've made my own choice to avoid complications: create a dedicated boot partition. That way both BIOS and UEFI can follow the same installation path, and I will now be using only a single grub version.

http://ubuntuforums.org/showpost.php?p=8285027&postcount=4
https://help.ubuntu.com/community/BootPartition

Revision history for this message
Colin Watson (cjwatson) wrote :

I've spent a bit of time trying to figure out such a contortion. Unfortunately, the only ones I can think of that preserve the modern feature set will also confuse older versions of os-prober in different ways. I think we'll just have to backport a patch for this as far as we can, and you'll just have to upgrade os-prober on all systems where update-grub is run.

Revision history for this message
dino99 (9d9) wrote :

In fact, maybe you only need to deal with events where a possible conflict appears with oldish versions.

Revision history for this message
Colin Watson (cjwatson) wrote : Re: [Bug 1050774] Re: grub-script-check assert failure: *** glibc detected *** /usr/bin/grub-script-check: double free or corruption (!prev): 0x08210b38 ***

Please expand on that? I'm not sure what you mean.

summary: - grub-script-check assert failure: *** glibc detected *** /usr/bin/grub-
- script-check: double free or corruption (!prev): 0x08210b38 ***
+ os-prober misparses GRUB 2.00 configuration, leading to arbitrarily-long
+ grub.cfg output lines
Revision history for this message
dino99 (9d9) wrote :

**** Please expand on that? I'm not sure what you mean. *****

Thanks to your blog for the clear explanation of the version conflicts that can happen now or later, depending on the upgrade scenario.
What I mean is, to find the easiest way to avoid conflicts, detect incompatible versions (by scanning) before accepting the upgrade. To do that you need to list all the possible conflicts between one module and another (which is what I am calling "events" errors). Indeed it's a short-term embedded workaround, independent of the main problem reported here.
But for the future, as a bootloader is a vital piece of a system, it should be as independent as possible of the other parts of the system, because of the multiple version conflicts currently possible. A dedicated boot partition seems ideal, but that's another story.

Additional note:
I loaded Lubuntu Quantal i386, applied the available upgrades, then installed the 3.6-rc6 kernel with dpkg: as a result I got the issue reported here.
Then I booted into Ubuntu Quantal i386, also applied the upgrades, and finally installed the 3.6-rc6 kernel via dpkg: there I did not get the script error. Strange.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package os-prober - 1.56ubuntu1

---------------
os-prober (1.56ubuntu1) quantal; urgency=low

  * Resynchronise with Debian (LP: #1050774). Remaining changes:
    - Mount btrfs subvolume @ when present to access a btrfs formatted
      rootfs.
    - When called with WINOSDATA, return the list of Windows partitions
      containing the system instead of only listing these containing the
      bootrecord.

os-prober (1.56) unstable; urgency=low

  [ Hedayat Vatankhah ]
  * Add support for probing Fedora's location for the GRUB 2 configuration
    file (closes: #674560).

  [ Colin Watson ]
  * Fix the parsing code in the grub2 handler so that it no longer gets
    hopelessly confused by multiple single-quoted strings on the same line,
    as produced by GRUB 2.00.

os-prober (1.55) unstable; urgency=low

  * Improve detection of Haiku: detect the 64-bit version
    Closes: #685228
  * Add myself to Uploaders
 -- Colin Watson <email address hidden> Tue, 18 Sep 2012 00:05:11 +0100

Changed in os-prober (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
Colin Watson (cjwatson) wrote : Re: [Bug 1050774] Re: os-prober misparses GRUB 2.00 configuration, leading to arbitrarily-long grub.cfg output lines

Thanks, I understand what you mean now. However, while it would be
nice, GRUB does not currently have a compact encoding of its module ABI
that would allow us to perform such a scan even slightly accurately.

With multiple quantal installations on the same system, since they refer
to each other in their GRUB configuration, I suspect it may take a
couple of alternate runs of update-grub in each installation after
installing my os-prober fix to clear everything out. You might have a
look at grub.cfg after each run to see if you can recognise the
weirdness from this bug.
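
One way to do that check without reading the whole file by eye is sketched below in Python. It is only a heuristic: it assumes a healthy `menuentry` or `submenu` line carries at most one `$menuentry_id_option` and stays under a length limit, and both thresholds, like the function name, are hypothetical choices for this example rather than anything GRUB defines.

```python
import sys
from typing import List, Tuple


def suspicious_menuentries(config_text: str,
                           max_id_options: int = 1,
                           max_length: int = 500) -> List[Tuple[int, int]]:
    """Return (line number, line length) for entry lines that look corrupted."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(("menuentry ", "submenu ")):
            continue
        # Assumption: repeated id options or extreme length indicate the bug.
        repeated_ids = stripped.count("$menuentry_id_option") > max_id_options
        too_long = len(line) > max_length
        if repeated_ids or too_long:
            hits.append((number, len(line)))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]  # for example /boot/grub/grub.cfg
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, length in suspicious_menuentries(handle.read()):
            print(f"{path}:{number}: suspicious entry line ({length} characters)")
```

Run it with the path to grub.cfg as the only argument after each update-grub; no output means no line matched the heuristic, not that the file is guaranteed correct.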

Revision history for this message
dino99 (9d9) wrote :

Well, I've tested the following:
- latest grub 2.0 on 2 Quantal installs: Ubuntu & Lubuntu i386
- also added it from your grub PPA on 2 Precise i386 installs, but on both I hit an issue with apport: I either have to refuse the grub 2.0 upgrade because it removes apport, or accept the upgrade, after which apport can't be reinstalled.
- as I prefer to avoid the possible os-prober version issue, all 4 installs now have grub 2.0

In this configuration, grub boots smoothly, except for the common black screen before the menu appears (about 3 seconds).
So I hope for either a grub 2.0 backport to Precise or an updated grub 2.0 version in the PPA, synced with the Quantal grub version.

Colin Watson (cjwatson)
Changed in grub2 (Ubuntu):
assignee: Colin Watson (cjwatson) → nobody
Revision history for this message
dino99 (9d9) wrote :

That issue is quite old now, and the os-prober fix seems to have helped a lot. Considering closing it.

Changed in grub2 (Ubuntu):
status: Triaged → Incomplete
dino99 (9d9)
Changed in grub2 (Ubuntu):
status: Incomplete → Opinion
status: Opinion → Invalid