grub-pc fails to boot (system resets after GRUB prompt) on degraded RAID

Bug #1073108 reported by Cédric Dufour on 2012-10-30
This bug affects 6 people
Affects                 Status        Importance  Assigned to
grub2 (Ubuntu)          Invalid       Medium      Louis Bouchard
grub2 (Ubuntu Precise)  Fix Released  Medium      Louis Bouchard

Bug Description

[SRU justification]
The system does not boot in degraded mode from the second disk. This happens systematically when the system is localized in French (fr_FR.UTF-8).

[Impact]
Unable to boot system in degraded mode

[Fix]
Backport the upstream fix:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626853

[Test Case]
- Install the system with the system disk as RAID1 and French localization.
- Remove first disk (/dev/sda) from the system
- Boot the system.

Without the patch, grub will loop forever and the system will not boot.
With the patch, the system boots normally.

[Regression]
None expected. This fix is already present in Debian Wheezy, Ubuntu Trusty and Utopic.

[Description of the problem]

When a RAID1 system is degraded (e.g. unplug /dev/sdb, leaving only /dev/sda), grub-pc properly enters the prompt, but the system resets as soon as one attempts to boot OR enters the command line and issues the 'ls' command.

This bug is known to Debian (and presumably fixed there): http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626853
In short: adding 'GRUB_TERMINAL=console' to '/etc/default/grub' (and running 'update-grub' afterwards) works around the issue (verified on Ubuntu Quantal 12.10).
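
For anyone scripting the workaround across several machines, here is a minimal Python sketch of the edit to '/etc/default/grub'. The function name is hypothetical, and it assumes the file uses plain KEY=value lines (stock Ubuntu files ship the setting commented out as '#GRUB_TERMINAL=console'). It only transforms text; writing the file back as root and running 'update-grub' is still required.

```python
import re


def set_grub_terminal_console(defaults_text: str) -> str:
    """Return GRUB defaults text with GRUB_TERMINAL=console enabled."""
    setting = "GRUB_TERMINAL=console"
    # Match an existing GRUB_TERMINAL assignment, commented out or not.
    # [ \t]* (not \s*) keeps the match on a single line.
    pattern = re.compile(r"^[ \t]*#?[ \t]*GRUB_TERMINAL=.*$", re.MULTILINE)
    if pattern.search(defaults_text):
        # Replace only the first match so other lines stay untouched.
        return pattern.sub(setting, defaults_text, count=1)
    # No such line yet: append one, keeping a trailing newline.
    if defaults_text and not defaults_text.endswith("\n"):
        defaults_text += "\n"
    return defaults_text + setting + "\n"
```

The pattern requires '=' directly after GRUB_TERMINAL, so related keys such as GRUB_TERMINAL_OUTPUT are left alone.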

I believe this bug is critical because it shows up only when a disk fails but remains latent on a healthy system.

Cheers

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Erik Stian Tefre (erik-d) wrote :

Reproduced both on physical hardware and in VirtualBox on 12.04.2 LTS. The physical hardware reboots/resets after the GRUB menu. The virtual machine triggers a Guru Meditation in VirtualBox instead of the reboot/reset.

The workaround in the bug description works as advertised.

My (limited) testing of the issue indicates that drive 0 has to be yanked to reproduce the bug. Yanking drive 1 has no effect on bootability.
[_U] : Doesn't boot.
[U_] : Boots.
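
The [_U] / [U_] notation above matches the status field of /proc/mdstat, where each character stands for one member slot and '_' marks a missing member. A small sketch (the function name is hypothetical) that reports which slots are missing:

```python
import re


def missing_members(mdstat_line: str) -> list:
    """Return the slot indices shown as missing ('_') in an mdstat status field."""
    # The status field contains only 'U' and '_', so "[2/1]" is not matched.
    match = re.search(r"\[([U_]+)\]", mdstat_line)
    if match is None:
        raise ValueError("no [U_] status field found")
    return [i for i, flag in enumerate(match.group(1)) if flag == "_"]
```

With this reading, [_U] means slot 0 is missing (the case that fails to boot) and [U_] means slot 1 is missing (the case that boots).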

The system boots just fine if the degraded drive is connected to the system at boot time. It has to be yanked (or broken for real) to trigger the bug.

Good idea, thanks Peter.

Frédéric Masi - Technical Account Manager


Louis Bouchard (louis) on 2014-09-17
Changed in grub2 (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Louis Bouchard (louis-bouchard)
Changed in grub2 (Ubuntu Precise):
importance: Undecided → Medium
assignee: nobody → Louis Bouchard (louis-bouchard)
status: New → In Progress
Louis Bouchard (louis) wrote :

Marking the development release as Invalid since the fix is already upstream.

Changed in grub2 (Ubuntu):
status: Confirmed → Invalid
Louis Bouchard (louis) on 2014-09-17
description: updated
Louis Bouchard (louis) wrote :
Colin Watson (cjwatson) wrote :

Makes sense, thanks!

Hello Cédric, or anyone else affected,

Accepted grub2 into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/grub2/1.99-21ubuntu3.17 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in grub2 (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Louis Bouchard (louis) wrote :

Test case described above passes correctly.

tags: added: verification-done
removed: verification-needed

The verification of the Stable Release Update for grub2 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 1.99-21ubuntu3.17

---------------
grub2 (1.99-21ubuntu3.17) precise; urgency=medium

  * Fix infinite recursion in gettext when translation fails
    (LP: #1073108)
 -- Louis Bouchard <email address hidden> Wed, 17 Sep 2014 13:42:52 +0100
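
The changelog entry names the failure mode but not its mechanism. As a purely illustrative sketch (this is not the GRUB patch, which is written in C; the class and method names are hypothetical), assume the lookup's error path itself asks for translated text. A re-entrancy flag is one common way to keep such a function from recursing forever:

```python
class Translator:
    """Toy translator whose catalog lookup can fail (e.g. unreadable disk)."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: msgid -> translated str, may raise OSError
        self._busy = False     # True while a lookup is already in progress
        self.errors = []

    def report_error(self, message):
        # Error messages go through translate() too; without the guard
        # below, a failing lookup would call itself without end.
        self.errors.append(self.translate(message))

    def translate(self, msgid):
        if self._busy:
            # Re-entered from the error path: fall back to the original text.
            return msgid
        self._busy = True
        try:
            return self._lookup(msgid)
        except OSError:
            self.report_error("read error")
            return msgid
        finally:
            self._busy = False
```

With a lookup that always fails, translate() returns the untranslated string and records one error instead of recursing.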

Changed in grub2 (Ubuntu Precise):
status: Fix Committed → Fix Released
T. K. (stuttgart-live) wrote :

Last week I did a dist-upgrade on my Ubuntu 12.04 system (HP Microserver N54L, configured as RAID 1). It seems that this fix results in another bug or problem, at least for me. At the end of the upgrade process a new grub.cfg is built with the following errors displayed:

...
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-69-generic
Found initrd image: /boot/initrd.img-3.2.0-69-generic
Found linux image: /boot/vmlinuz-3.2.0-67-generic
Found initrd image: /boot/initrd.img-3.2.0-67-generic
Found memtest86+ image: /memtest86+.bin
ERROR: ddf1: seeking device "/dev/dm-2" to 18446744073709421056
ERROR: hpt37x: seeking device "/dev/dm-2" to 4608
ERROR: hpt45x: seeking device "/dev/dm-2" to 18446744073709547008
ERROR: isw: seeking device "/dev/dm-2" to 18446744073708469760
ERROR: sil: seeking device "/dev/dm-2" to 18446744073709289984
ERROR: ddf1: seeking device "/dev/dm-2" to 18446744073709421056
ERROR: hpt37x: seeking device "/dev/dm-2" to 4608
ERROR: hpt45x: seeking device "/dev/dm-2" to 18446744073709547008
ERROR: isw: seeking device "/dev/dm-2" to 18446744073708469760
ERROR: sil: seeking device "/dev/dm-2" to 18446744073709289984
done
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place

Now I fear that rebooting my system can fail.

Attached is a file with more information about the update, dist-upgrade, dpkg.log, syslog and LVM configuration.
Please ask for more information if needed. Thank you.
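
A note on the very large seek offsets in the log above: read as two's-complement signed 64-bit integers, they are small negative numbers. The sketch below only does that arithmetic; it does not establish why the tool computed such offsets, and the helper name is hypothetical.

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


# Offsets copied from the log, keyed by the format handler that reported them.
offsets = {
    "ddf1": 18446744073709421056,
    "hpt37x": 4608,
    "hpt45x": 18446744073709547008,
    "isw": 18446744073708469760,
    "sil": 18446744073709289984,
}

signed = {name: as_signed_64(value) for name, value in offsets.items()}
```

This gives -130560, 4608, -4608, -1081856 and -261632 bytes respectively, each a whole multiple of 512.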

Louis Bouchard (louis) wrote :

Unfortunately, this seems to come from the post-processing step of the GRUB installation process. The fix only addresses one of GRUB's internal functions.

I suggest opening a new bug. This is a side effect of the upgrade.

T. K. (stuttgart-live) wrote :

Louis, I've opened this new bug: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1377826. Can you confirm this issue, please? Thank you.

Another question: do you think it is safe to reboot my server given these errors when generating grub.cfg?
