when installer is multipath aware, grub fails to install

Bug #687501 reported by Peter Petrakis on 2010-12-08
This bug affects 2 people
Affects                  Importance  Assigned to
OEM Priority Project     High        Unassigned
grub2 (Ubuntu)           High        Unassigned
grub2 (Ubuntu Lucid)     High        Unassigned
grub2 (Ubuntu Maverick)  High        Unassigned

Bug Description

Binary package hint: debian-installer

This was discovered on the Intel IMS which uses a MP SAS Array
in ALUA mode, Active/Standby.

To reproduce:
1) Ubuntu 10.04.1 server amd64
2) grub prompt# install disk-detect/multipath/enable=true
3) It doesn't matter how you partition the disks or if one or both paths are up.

Failure:

grub bails out, logs attached
Dec 9 03:25:45 debconf: --> SUBST grub-installer/progress/step_install_loader BOOTDEV /dev/sda
Dec 9 03:25:45 debconf: Adding [BOOTDEV] -> [/dev/sda]
Dec 9 03:25:45 debconf: <-- 0
Dec 9 03:25:45 debconf: --> PROGRESS INFO grub-installer/progress/step_install_loader
Dec 9 03:25:45 debconf: <-- 0 OK
Dec 9 03:25:45 grub-installer: info: Installing grub on '/dev/sda'
Dec 9 03:25:45 grub-installer: info: grub-install supports --no-floppy
Dec 9 03:25:45 grub-installer: info: Running chroot /target grub-install --no-floppy --force "/dev/sda"
Dec 9 03:25:45 grub-installer: /usr/sbin/grub-probe: error:
Dec 9 03:25:45 grub-installer:
Dec 9 03:25:45 grub-installer: cannot find a GRUB drive for /dev/mapper/mpath0p1. Check your device.map
Dec 9 03:25:45 grub-installer: .
Dec 9 03:25:45 grub-installer: You have a memory leak (not released memory pool):
Dec 9 03:25:45 grub-installer:
Dec 9 03:25:45 grub-installer: [0x25f8180]
Dec 9 03:25:45 grub-installer:
Dec 9 03:25:45 grub-installer: Auto-detection of a filesystem module failed.
Dec 9 03:25:45 grub-installer: Please specify the module with the option `--modules' explicitly.
Dec 9 03:25:45 grub-installer: error: Running 'grub-install --no-floppy --force "/dev/sda"' failed.
Dec 9 03:25:45 debconf: --> SUBST grub-installer/grub-install-failed BOOTDEV /dev/sda
Dec 9 03:25:45 debconf: Adding [BOOTDEV] -> [/dev/sda]
Dec 9 03:25:45 debconf: <-- 0
Dec 9 03:25:45 debconf: --> INPUT critical grub-installer/grub-install-failed
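
For anyone triaging similar installer logs, the following is a minimal sketch (not part of the installer) that pulls out the device grub-probe could not map. The function name and the pattern are assumptions, based only on the error text shown in the log above.

```python
import re

# Matches the grub-probe failure text seen in the log above and captures
# the device path that follows it.
GRUB_DRIVE_ERROR = re.compile(r"cannot find a GRUB drive for (\S+)")


def find_unmapped_devices(log_text):
    """Return the devices grub-probe could not map, in first-seen order."""
    devices = []
    for match in GRUB_DRIVE_ERROR.finditer(log_text):
        # The log ends the device name with a sentence period; drop it.
        device = match.group(1).rstrip(".")
        if device not in devices:
            devices.append(device)
    return devices
```

Run against the log above, this should report /dev/mapper/mpath0p1, i.e. the multipath partition rather than the /dev/sda path that grub-install was given.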

Peter Petrakis (peter-petrakis) wrote :

I've tried multiple partition layouts, from the guided do it for me to
creating a separate /boot. Still fails.

The "grub-installer" from the shell cli isn't very helpful either, it dies.

Wrong number of args: mapdevfs <path>

Peter Petrakis (peter-petrakis) wrote :
Changed in debian-installer (Ubuntu):
status: New → Confirmed
Steve Magoun (smagoun) on 2010-12-08
Changed in oem-priority:
importance: Undecided → High
Peter Petrakis (peter-petrakis) wrote :

So the only way to get around this is to down the standby path and install.
Then the customer would have to install multipath-tools/boot, reconfigure
their fstab to use scsi ids or /dev/mapper/ paths, pick their device handler,
rebuild the initramfs and update grub as well.

Lots of room for error here, even I mess it up on occasion.
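
As an illustration of just the fstab step in that workaround, here is a rough sketch that swaps single-path device names for /dev/mapper/ names. The function name and the device names in the mapping are hypothetical; it does not install multipath-tools-boot, pick a device handler, rebuild the initramfs, or update grub.

```python
def rewrite_fstab(fstab_text, device_map):
    """Replace the device field of fstab entries found in device_map.

    device_map maps a single-path device (for example "/dev/sda1") to the
    multipath name that should be used instead. Comments, blank lines and
    entries not listed in device_map are left untouched.
    """
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            device = stripped.split()[0]
            if device in device_map:
                # Only the first occurrence (the device field) is replaced,
                # so the original column spacing is preserved.
                line = line.replace(device, device_map[device], 1)
        out.append(line)
    return "\n".join(out) + "\n"
```

A call such as rewrite_fstab(text, {"/dev/sda1": "/dev/mapper/mpath0-part1"}) would only touch the entry whose first field is /dev/sda1; the result should still be reviewed by hand before rebooting.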

Peter Petrakis (peter-petrakis) wrote :

This bug might be related, a grub issue with fakeraid, which utilizes device-mapper.

https://bugs.launchpad.net/ubuntu/+source/grub-installer/+bug/603854

Changed in debian-installer (Ubuntu):
importance: Undecided → High
Colin Watson (cjwatson) on 2010-12-10
affects: debian-installer (Ubuntu) → grub-installer (Ubuntu)
Peter Petrakis (peter-petrakis) wrote :

I had first thought that the unbootable system that sometimes resulted from the
package installation order of multipath-tools/boot and the post-configuration was
user error. This is not the case. It's in fact another bug, in update-grub, which
ruins the grub.cfg when we're actually using MP devices for our root.

https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/688261

So basically, the next time the kernel is updated or the user changes
the grub args, the result will be a brick.

Robbie Williamson (robbiew) wrote :

Trying to understand if I need to assign this bug or not. Going off comment #6, can we just dup this one to 688261? Or would you prefer to wait until that bug is resolved, to see if this one is then fixed?

Peter Petrakis (peter-petrakis) wrote :

@Robbie,

We're probably going to have to rebuild the installer to make the new changes
relevant, so it would be useful to leave this issue open to track that issue
and whatever else might spring up.

Colin Watson (cjwatson) wrote :

I'm looking at this in bug 688261.

Colin Watson (cjwatson) wrote :

Peter, can't you use extra tasks on bug 688261 for your tracking? I find that multiple bugs for a single issue are a drain on mental resources.

affects: grub-installer (Ubuntu) → grub2 (Ubuntu)
Colin Watson (cjwatson) wrote :

This was fixed recently in Natty by merging the upstream 'devmapper' branch.

I suspect this will be pretty intrusive to merge into lucid (and possibly maverick), though, since so much else has changed. I'll try to come up with a reduced fix.

Changed in grub2 (Ubuntu):
status: Confirmed → Fix Released
Changed in grub2 (Ubuntu Lucid):
status: New → Triaged
Changed in grub2 (Ubuntu Maverick):
status: New → Triaged
Changed in grub2 (Ubuntu Lucid):
importance: Undecided → High
Changed in grub2 (Ubuntu Maverick):
importance: Undecided → High
Colin Watson (cjwatson) wrote :

Actually, it may not be too awful a backport. That would be nice.

Colin Watson (cjwatson) wrote :

I've uploaded a proposed fix (1.98-1ubuntu11~ppa1) to a PPA (https://launchpad.net/~cjwatson/+archive/grub). It should be possible to install from this using the following untested preseed file:

  d-i apt-setup/local0/repository string deb http://ppa.launchpad.net/cjwatson/grub/ubuntu lucid main
  d-i apt-setup/local0/key string http://keyserver.ubuntu.com:11371/pks/lookup?op=get&search=0x6A547A8B977102C0

This will probably only work for netboot installations. I'd appreciate feedback. Thanks!

Changed in grub2 (Ubuntu Lucid):
status: Triaged → In Progress
Serge Hallyn (serge-hallyn) wrote :

grub-probe in the proposed package manages to actually use multipath's /dev/mapper/* files. Comparing /boot/grub/grub.cfg from this grub package to the one using blkid in place of grub-probe produces the following diff:

root@kickseed:/root# diff grub.cfg.new grub.cfg.serge
58,59c58
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4
66,67c65
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4
76,77c74
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4
84,85c81
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4
96,97c92
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4
102,103c97
< insmod ext2
< set root='(hd5,1)'
---
> insmod ext4

I'm having some trouble getting reliable good boots on the system, but I don't think that's due to this grub package. So as long as having grub.cfg call an ext4 fs 'ext2' is ok (and adding the 'set root=' line, for some reason) I would call this confirmed.

Colin Watson (cjwatson) wrote :

Thanks for testing!

'insmod ext4' would be wrong - GRUB's ext2 module handles all of ext2,
ext3, and ext4, and it doesn't have a separate ext4 module. I suspect
that this emitted error messages that you didn't notice, but worked
anyway because ext2 was built into the core image (at a guess).

Adding 'set root=' is indeed normal - grub-mkconfig generates both
traditional 'set root=' lines and modern 'search' lines. We keep the
former basically as insurance.
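
The module naming point can be sketched as a small lookup. This is only an illustration of the statement above that GRUB's ext2 module covers ext2, ext3 and ext4; the function name is hypothetical, and passing other filesystem types through unchanged is an assumption, not something grub-mkconfig is known to do.

```python
# GRUB's ext2 module handles ext2, ext3 and ext4; there is no separate
# ext4 module, so all three map to "ext2".
EXT_FAMILY_MODULE = {"ext2": "ext2", "ext3": "ext2", "ext4": "ext2"}


def grub_module_for(fstype):
    """Return the GRUB module name to 'insmod' for a filesystem type.

    Filesystem types outside the ext family are returned unchanged,
    which is an assumption made for this sketch.
    """
    return EXT_FAMILY_MODULE.get(fstype, fstype)
```

With this mapping, an ext4 root yields 'insmod ext2', matching the grub.cfg.new side of the diff.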

Serge Hallyn (serge-hallyn) wrote :

Thanks, Colin. Then this package worked correctly. Thanks very much for fixing grub-probe!

Is there any chance of this getting SRU'd to lucid? :) I assume the code has changed way too much...

Colin Watson (cjwatson) wrote :

Actually I prepared the patch specifically with the aim of getting it
through SRU ... I'm too late for 10.04.2 though, so I put it in a PPA
until 10.04.2 releases and I can upload things for 10.04.3.

Colin Watson (cjwatson) wrote :

I've now uploaded the lucid fix to lucid-proposed, and have also uploaded a version of this fix for maverick to the same PPA (which should build shortly).

Clint Byrum (clint-fewbar) wrote :

APPROVED: After review of the package version 1.98-1ubuntu11 uploaded to lucid-proposed, it should be accepted.

Accepted grub2 into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in grub2 (Ubuntu Lucid):
status: In Progress → Fix Committed
tags: added: verification-needed
Colin Watson (cjwatson) on 2011-04-05
Changed in grub2 (Ubuntu Lucid):
milestone: none → ubuntu-10.04.3
dann frazier (dannf) wrote :

I reproduced the problem following Peter's reproduction steps. I then rebooted w/ apt-setup/proposed=true (in addition to disk-detect/multipath/enable=true), and the install was successful.

Leonardo Borda (lborda) wrote :

Hi Colin,

I have a customer experiencing the same issue although it is not during the installation. It happens when the customer tries to run aptitude upgrade in lucid.
I tested the ppa and according to the customer it works, grub2 installed with no errors.

Here is some info:
-------- lsb_release -d --------
Ubuntu 10.04.2 LTS
-------- uname -a --------
Linux xxxxxx 2.6.32-27-server #49-Ubuntu SMP Thu Dec 2 02:05:21 UTC 2010 x86_64 GNU/Linux

Errors on aptitude upgrade.

======
Log started: 2011-04-07 16:34:18
Setting up linux-image-2.6.32-29-server (2.6.32-29.58) ...

Running depmod.

update-initramfs: Generating /boot/initrd.img-2.6.32-29-server

W: Possible missing firmware /lib/firmware/ql8100_fw.bin for module qla2xxx

Running postinst hook script /usr/sbin/update-grub.

Generating grub.cfg ...

/usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/mpath2-part2. Check your device.map.

You have a memory leak (not released memory pool):

 [0x2254f90]

User postinst hook script [/usr/sbin/update-grub] exited with value 1

dpkg: error processing linux-image-2.6.32-29-server (--configure):

 subprocess installed post-installation script returned error exit status 1

Setting up linux-image-2.6.32-30-server (2.6.32-30.59) ...

Running depmod.

update-initramfs: Generating /boot/initrd.img-2.6.32-30-server

W: Possible missing firmware /lib/firmware/ql8100_fw.bin for module qla2xxx

Running postinst hook script /usr/sbin/update-grub.

Generating grub.cfg ...

/usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/mpath2-part2. Check your device.map.

You have a memory leak (not released memory pool):

 [0x1143f90]

User postinst hook script [/usr/sbin/update-grub] exited with value 1

dpkg: error processing linux-image-2.6.32-30-server (--configure):

 subprocess installed post-installation script returned error exit status 1

Setting up grub-pc (1.98-1ubuntu10) ...

Generating grub.cfg ...

/usr/sbin/grub-probe: error: cannot find a GRUB drive for /dev/mapper/mpath2-part2. Check your device.map.

You have a memory leak (not released memory pool):

 [0x2449f90]

dpkg: error processing grub-pc (--configure):

 subprocess installed post-installation script returned error exit status 1

dpkg: dependency problems prevent configuration of linux-image-server:

 linux-image-server depends on linux-image-2.6.32-30-server; however:

  Package linux-image-2.6.32-30-server is not configured yet.

dpkg: error processing linux-image-server (--configure):

 dependency problems - leaving unconfigured

dpkg: dependency problems prevent configuration of linux-server:

 linux-server depends on linux-image-server (= 2.6.32.30.36); however:

  Package linux-image-server is not configured yet.

dpkg: error processing linux-server (--configure):

 dependency problems - leaving unconfigured

Errors were encountered while processing:

 linux-image-2.6.32-29-server

 linux-image-2.6.32-30-server

 grub-pc

 linux-image-server

 linux-server

Log ended: 2011-04-07 16:49:40

Leonardo Borda
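
A log like the one above can be summarised with a short script. This is a sketch under the assumption that the package names follow the "Errors were encountered while processing:" line as indented lines, as they do in this log; the function name is hypothetical.

```python
def failed_packages(log_text):
    """List the packages named in dpkg's final error summary."""
    packages = []
    in_summary = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "Errors were encountered while processing:":
            in_summary = True
            continue
        if not in_summary or not stripped:
            # Skip everything before the summary and blank lines inside it.
            continue
        if line[0].isspace():
            # Package names are indented in the summary.
            packages.append(stripped)
        else:
            # First unindented line (e.g. "Log ended: ...") ends the summary.
            break
    return packages
```

For the log above this should return the five packages in the summary, with grub-pc and the two kernel images being the ones that failed directly on grub-probe.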

Clint Byrum (clint-fewbar) wrote :

Accepted grub2 into maverick-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in grub2 (Ubuntu Maverick):
status: Triaged → Fix Committed

On Mon, Apr 25, 2011 at 06:01:14PM -0000, Clint Byrum wrote:
> Accepted grub2 into maverick-proposed, the package will build now and be
> available in a few hours. Please test and give feedback here. See
> https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
> enable and use -proposed. Thank you in advance!

I reproduced the original issue with pristine maverick (server-amd64)
and verified that it has gone away when using -proposed. Thanks!

tags: added: verification-done
removed: verification-needed
Colin Watson (cjwatson) wrote :

Just a note to archive admins: please don't promote this to -updates yet, as in the case of Maverick it will cause problems for Wubi users. I've agreed with Martin that we'll add some Wubi fixes to -proposed at the same time; I've done this for Lucid, and am going to be testing this on Maverick today in preparation for an upload.

Martin Pitt (pitti) wrote :

Note: I mark this as verification-failed for the time being to prevent accidental premature copying to -updates. Colin, please feel free to remove the tag again once it's ready to go.

I keep v-done, the SRU report tracker will do the right thing here (-failed wins over -done).
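
The tag precedence described here can be sketched as follows. Only the "-failed wins over -done" rule comes from the comment above; the function name, the returned labels, and the ordering of -done over -needed are assumptions, and this is not the SRU report tracker's actual code.

```python
def sru_verification_state(tags):
    """Resolve a bug's verification state from its tags.

    "verification-failed" takes priority over "verification-done", as
    stated in the comment above. Ranking "verification-done" above
    "verification-needed" is an assumption made for this sketch.
    """
    if "verification-failed" in tags:
        return "failed"
    if "verification-done" in tags:
        return "done"
    if "verification-needed" in tags:
        return "needed"
    return "untagged"
```

So a bug carrying both verification-done and verification-failed resolves to "failed", which is what keeps the package from being copied to -updates early.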

tags: added: verification-failed

Accepted grub2 into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

tags: removed: verification-done verification-failed
tags: added: verification-needed
Martin Pitt (pitti) wrote :

Accepted grub2 into maverick-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Leonardo Borda (lborda) wrote :

Hi
Lucid-proposed packages also work for me.

Thank you.

Martin Pitt (pitti) wrote :

Colin, is that subsequent upload the one you were waiting for before promoting this to -updates? I ask because I'm not sure I reapplied the v-failed tag (sru-accept resets the tags, sorry).

tags: added: verification-done-lucid verification-failed
Colin Watson (cjwatson) wrote :

That subsequent grub2 upload was, but it turns out it needed a lupin upload as well which I forgot. That's currently waiting for approval.

Colin Watson (cjwatson) wrote :

The subsequent grub2 and lupin uploads have been verified now, so setting back to v-done.

tags: added: verification-done
removed: verification-done-lucid verification-failed verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 1.98-1ubuntu12

---------------
grub2 (1.98-1ubuntu12) lucid-proposed; urgency=low

  * Fix use of freed memory when replacing existing loopback device
    (LP: #742967).
  * Make sure to reinstall GRUB on upgrade if Wubi is in use (LP: #742967).
  * Enable grub-fstest, so that we can use it to help find wubildr
    (LP: #610898).
  * Make NTFS UUIDs uppercase (LP: #695290).

grub2 (1.98-1ubuntu11) lucid-proposed; urgency=low

  * Backport multipath probing fixes (LP: #687501).
 -- Colin Watson <email address hidden> Wed, 27 Apr 2011 10:39:15 +0100

Changed in grub2 (Ubuntu Lucid):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 1.98+20100804-5ubuntu3.3

---------------
grub2 (1.98+20100804-5ubuntu3.3) maverick-proposed; urgency=low

  * Fix use of freed memory when replacing existing loopback device
    (LP: #742967).
  * Make sure to reinstall GRUB on upgrade if Wubi is in use (LP: #742967).
  * Enable grub-fstest, so that we can use it to help find wubildr
    (LP: #610898).
  * Make NTFS UUIDs uppercase (LP: #695290).

grub2 (1.98+20100804-5ubuntu3.2) maverick-proposed; urgency=low

  * Backport multipath probing fixes (LP: #687501).
 -- Colin Watson <email address hidden> Wed, 27 Apr 2011 10:32:26 +0100

Changed in grub2 (Ubuntu Maverick):
status: Fix Committed → Fix Released
Chris Van Hoof (vanhoof) on 2011-06-20
Changed in oem-priority:
status: New → Fix Released