GRUB installation fails if installing to certain non-ext3 filesystems

Bug #185878 reported by Mahmoud H. Al-Qudsi on 2008-01-25
This bug affects 68 people
Affects | Status | Importance | Assigned to | Milestone
grub (Debian) | New | Unknown | |
grub (Ubuntu) | | High | Colin Watson |
Hardy | | High | Steve Langasek |

Bug Description

Just like Bug #14010 but not just for JFS.

GRUB will fail to install during setup when it is being installed to a partition (as opposed to a boot sector) formatted as anything but ext2/3fs, including JFS and ReiserFS.

TEST CASE:
1. use the ubuntu 8.04.3 alternate installer in expert mode to configure your root filesystem as JFS
2. tell grub to install to the root partition instead of to the disk.
3. confirm that grub installation fails.
4. download the ubuntu 8.04.4 alternate installer candidate from http://cdimage.ubuntu.com/hardy/daily/current/
5. use the 8.04.4 candidate in expert mode to configure your root filesystem as JFS
6. tell grub to install to the root partition instead of to the disk.
7. verify that grub installation succeeds.

Mahmoud H. Al-Qudsi (mqudsi) wrote :

Confirmed to happen whenever GRUB is being installed to other than the bootsector and the / filesystem is not ext3fs.

Mike Perry (mike.perry) wrote :

Have you tried breaking out your partitions so the /boot is ext2/3 and separate from /.

Example:
/dev/sda1 /boot <ext3>
/dev/sda2 swap
/dev/sda3 / <jfs>

Mahmoud H. Al-Qudsi (mqudsi) wrote :

Mike, under the scenario you've described, there is no problem installing GRUB.

It only fails if the call to grub-install() specifies a non-ext3fs partition; and if /boot/ is ext3fs, that's a moot point.

Still a bug though..

Mike Perry (mike.perry) wrote :

I completely agree it is a bug. I just wanted to pose a work around for you to get your machine going :)

Mahmoud H. Al-Qudsi (mqudsi) wrote :

Thanks :-)

zebul666 (zebul666) wrote :

is bug #123102 a duplicate of this one ?

Mahmoud H. Al-Qudsi (mqudsi) wrote :

Yes. Marked as such.

This is the more-general (read: actual problem) case for bug 185878

zebul666 (zebul666) wrote :

grub-install failed here again with 8.10. When will this bug ever be closed and FIXED??!! Why can other distros install grub in any partition and not Ubuntu??? If that isn't supported, say so in the installer!! Damn it!

I tried to install grub in /dev/sda8 with a ReiserFS filesystem. I guess that's still the same bug.

awe_cz (martin-nedbal) wrote :

Hi,
same here. I was installing -server edition on new Lenovo TD100 ThinkServer machine. Originally I thought it was caused by RAID controller but obviously the problem is somewhere else. From my experience it's not limited to case with non-ext3 root partition. I had the same issue when I left the installer to partition my drives automatically (thus with ext3 root partition). In all the cases grub code was not installed into MBR *at all*. Being desperate I finally found out by dumping MBR...

It's broken completely, even if you go back at the end of installation wizard and explicitly run "install grub" option it's not installed. The only working solution is to run terminal, chroot to /target and run grub-install manually.

Colin Watson (cjwatson) on 2009-04-20
summary: - GRUB Installation Fails if non-ext3 Root Partition
+ GRUB installation fails if installing to non-ext3 partition
description: updated

Is this 4+ years old Ubuntu-specific nuisance responsible as well, for what is happening here?

from my failed Jaunty Desktop installation trials Syslog (--> Bug #369048):

Apr 29 04:55:16 ubuntu grub-installer: grub> setup --stage2=/boot/grub/stage2 --prefix=/boot/grub (hd0,8)
Apr 29 04:55:16 ubuntu grub-installer: Checking if "/boot/grub/stage1" exists... yes
Apr 29 04:55:16 ubuntu grub-installer: Checking if "/boot/grub/stage2" exists... yes
Apr 29 04:55:16 ubuntu grub-installer: Checking if "/boot/grub/jfs_stage1_5" exists... yes
Apr 29 04:55:16 ubuntu grub-installer: Running "embed /boot/grub/jfs_stage1_5 (hd0,8)"... 18 sectors are embedded.
Apr 29 04:55:16 ubuntu grub-installer: succeeded
Apr 29 04:55:16 ubuntu grub-installer: Running "install --stage2=/boot/grub/stage2 /boot/grub/stage1 (hd0,8) (hd0,8)1+18 p (hd0,8)/boot/grub/stage2 /boot/grub/menu.lst"... failed
Apr 29 04:55:16 ubuntu grub-installer:
Apr 29 04:55:16 ubuntu grub-installer: Error 6: Mismatched or corrupt version of stage1/stage2
Apr 29 04:55:16 ubuntu grub-installer: grub> quit
Apr 29 04:55:16 ubuntu grub-installer: error: Running 'grub-install --no-floppy "/dev/sda9"' failed.

The designated partitions (all extended) have been created from OS/2 in order to avoid OS/2 LVM and boot manager issues.
/home =1GB, / =4GB, /swap =3.5GB.
I then pointed the installer to use sda8 as /home, sda9 as / and sda10 as /swap and also format the first 2 as JFS.
I further chose in the Extended, Enhanced, Advanced or whatever the exact name of this option is, to put the initial GRUB boot-loader code into the PBR of sda9, in another installation trial into sda8, but neither worked.

The installation, no matter if started from Gnome via the icon or already at the beginning from the Desktop-CD's boot-options, always failed in the same way and I ended up with the Live-CD Desktop.

If I'm not mistaken, there then should be /target/boot/grub, however /grub doesn't exist there.

Colin Watson (cjwatson) on 2009-05-14
Changed in grub-installer (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
OS/2-User (fzf7a2c02) wrote :

Colin, I really do appreciate you picking this up now. If there is anything else you need to know or have me test, please don't hesitate to ask.

This said, is what Henning says in https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/123102/comments/5:

"But, very interesting: After closing the (unfinished) installer, the target system is still mounted on /target. If i execute the command
      sudo chroot /target grub-install --no-floppy --debug "(hd0,4)"
manually (only --debug added), it succeeds with this messages:"

a functional workaround, once the installer aborted and I find myself back on the Desktop or are there then other operations/configurations from the aborted installation still missing?

OS/2-User (fzf7a2c02) wrote :

I have now tried what Henning did (and just like him added "--debug"), running from a GTerminal the last failed command listed in /var/log/user.log:
   sudo chroot /target grub-install --no-floppy --debug "/dev/sda9":
which now seems to have finished just fine and I didn't notice any error messages.

But something is still missing or not correctly configured.
When I boot up and select in my BM the partition with the initial GRUB bootloader code in its PBR, I'm ending up in a GRUB shell, rather than GRUB loading Jaunty's Linux kernel.

So what else do I need to do/check for/provide info for, in order to finally have Jaunty to start up?

specialk (geigerkr) wrote :

Hi, folks. I just spent the past 3 days pulling my hair out with installations and am happy to find this page. I recently tried a JFS partition and got wise enough to tail the syslog during installation. I didn't confirm during my XFS installations but I bet if I had the energy to confirm the output it would be similar. I confirm that I can also run the command manually to completion. I hope not to get the dreaded CRC error I was getting after my XFS attempts when I reboot this time.

Linux ubuntu 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 GNU/Linux

Is there a fix pending? Sounds like from what I read this is quite an old bug? I probably worked around this issue somehow the last time I installed Ubuntu about a year ago or so, but I repressed those memories.

Thanks.

specialk (geigerkr) wrote :

Why is this problem not fixed? Do I have to switch distros to get non ext support? This is crazy. I managed to install into XFS in prior versions of Ubuntu. What happened?

Brian Rogers (brian-rogers) wrote :

The next release will include GRUB2, which should support just about every filesystem.

Colin Watson (cjwatson) wrote :

Regardless of GRUB 2 (which cannot be assumed to fix this bug without checking; installation into a partition doesn't seem to be a particularly high-priority feature for GRUB upstream, from what I can tell), we should analyse and fix this one.

I've been looking into this today. It seems to be a race; running grub-install twice in succession, as previously observed, works fine, and one can reproduce the bug again just by zeroing out the region on disk occupied by stage 1.5. I think that pretty clearly makes this a grub bug rather than a grub-installer bug. I have some ideas on how to fix that and will try them out shortly.

specialk: Race conditions can be a bit like this; sometimes you just get lucky. I don't think anything specific was changed to break this, but some of the timings may have changed a bit. Of course, installing to the MBR or to some different partition will work fine even if /boot is on XFS.

affects: grub-installer (Ubuntu) → grub (Ubuntu)
Colin Watson (cjwatson) wrote :

I think I've nailed this. From what I can tell, it does *not* actually affect XFS (there may of course be a problem there, but if so it's a different bug), but only filesystem types where grub has explicit knowledge of how to embed its stage 1.5 into them. The complete list of these filesystems is FFS, JFS, ReiserFS, and UFS2.

The problem here is that writing to a partition device (e.g. /dev/sda1) when you still have the disk device (e.g. /dev/sda) open does not then guarantee that you'll get fresh data back when you read from /dev/sda. In order to guarantee this, grub needs to close and reopen the disk device, or (more easily) use the BLKFLSBUF ioctl.
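
For readers unfamiliar with the ioctl mentioned above, here is a minimal sketch of the idea in Python. It is not the actual cache_coherency.diff patch (that is C code inside grub); the helper name flush_disk_buffers is made up for this illustration, and the request number assumes the usual Linux encoding of BLKFLSBUF as _IO(0x12, 97).

```python
import fcntl
import os

# Linux ioctl request number for BLKFLSBUF, i.e. _IO(0x12, 97) in <linux/fs.h>.
# Assumes the common encoding where _IO() carries no direction or size bits.
BLKFLSBUF = (0x12 << 8) | 97  # 0x1261


def flush_disk_buffers(disk_path):
    """Ask the kernel to flush the buffer cache of a block device.

    Hypothetical helper for illustration only; it needs root and a real
    block device such as /dev/sda. The intent is that, after writing through
    a partition device (e.g. /dev/sda1), calling this on the containing disk
    should keep later reads of the disk device from returning stale data.
    """
    fd = os.open(disk_path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKFLSBUF, 0)
    finally:
        os.close(fd)
```

Nothing runs at import time; a caller would invoke flush_disk_buffers("/dev/sda") right after the write to the partition device and before re-reading the whole-disk device.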

OS/2-User (fzf7a2c02) wrote :

Well, all I can say is, that during all those years I experienced Ubuntu to fail to finish its installation, whenever I tried to get it on my HDD the way I needed it to be, SuSE didn't have that problem, using the very same conditions.
Out of desperation I now switched to Linux Mint, which is based on Ubuntu, but installs just fine, quite different to the original, it's based on. And as far as I know, SuSE as well as Mint use GRUB as well.
So for all I can tell, to me it therefore doesn't seem to be a problem of grub-install, but the installer itself, but I'm everything but an expert, so I may be completely wrong.
But I know for sure, that Ubuntu is severely broken since a very long time, when one tries to use any of those legitimate options the installer offers, other than MBR and ext[n].
If Ubuntu considers the only way it is supposed to be installed, to be Grub in the MBR and the only valid FS to be ext[n], then be honest and say so clearly from the get go and REMOVE all other options from the installer.
Else it is like Henry Ford back then, offering his car in every colour one desires, as long as it is black.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub - 0.97-29ubuntu56

---------------
grub (0.97-29ubuntu56) karmic; urgency=low

  * cache_coherency.diff: Flush buffer cache on disk devices after writing
    to any partition devices they contain, otherwise we may read old data
    back (LP: #185878).

 -- Colin Watson <email address hidden> Tue, 30 Jun 2009 14:46:51 +0100

Changed in grub (Ubuntu):
status: Triaged → Fix Released
zebul666 (zebul666) wrote :

"then be honest and say so clearly from the get go and REMOVE all other options from the installer"

I totally agree. and this should have been done two years ago...

Colin Watson (cjwatson) on 2009-06-30
summary: - GRUB installation fails if installing to non-ext3 partition
+ GRUB installation fails if installing to certain non-ext3 filesystems
OS/2-User (fzf7a2c02) wrote :

Colin, thank you very much for trouble shooting and finally coming up with a fix for this long existing Ubuntu specific nuisance :-)
Looks like composing my last reply overlapped with your investigative findings.
You have no idea how much I'm looking forward to the next release, with which I should then be able to install Ubuntu successfully for the very first time here in my existing setup. :-) So again, a very big Thanks.

Colin Watson (cjwatson) wrote :

OS/2-User: Dude, I just fixed your bug. :-) It was not an installer issue, but a problem with grub itself. Without putting a lot of effort into archaeology, I don't know exactly why other systems didn't suffer from it, but perhaps they were using a slightly different patch set (GRUB Legacy is like this - every distribution ships quite significantly different versions of it, which is one reason why now we're trying to converge again on GRUB 2) or perhaps they just got lucky. There is no need to remove options from the installer since the grub bug is now *fixed*.

At any rate, could you stop having a go at me after I just spent several hours fixing this bug? I don't expect gratitude for doing my job, of course, and I realise it took much longer than it should have done for us to get round to fixing this, but you could at least be civil about it. Perhaps in future I will say nothing about my thought processes since it seems that the response to me being open about my progress over the course of today was just to be shouted at by you and zebul666; it makes me wonder why I bothered, frankly, if you're just going to harangue me for describing my progress on the fix.

I can't easily do very much about older releases, since the installer is already out the door and on physical media, but I've uploaded backports of this fix to https://launchpad.net/~cjwatson/+archive/ppa; if those test out OK, we may at least be able to get this into Ubuntu 8.04.4 (8.04.3 is a bit too close to be able to get anything else substantial into it).

Colin Watson (cjwatson) wrote :

... and apparently my response overlapped with your apology, so let's consider this done ;-)

Changed in grub (Ubuntu Hardy):
assignee: nobody → Colin Watson (cjwatson)
importance: Undecided → High
milestone: none → ubuntu-8.04.4
status: New → Triaged
Changed in grub (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
Colin Watson (cjwatson) wrote :

I just linked to a more or less identical Debian bug, thus proving that this wasn't Ubuntu-specific. :-) I'll send my patch there.

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=362028

Changed in grub (Debian):
status: Unknown → New
Steve Langasek (vorlon) wrote :

I've gone ahead and applied this patch to the hardy grub and uploaded to hardy-proposed. Waiting for SRU processing now.

Changed in grub (Ubuntu Hardy):
assignee: Colin Watson (cjwatson) → Steve Langasek (vorlon)
status: Triaged → In Progress

Accepted grub into hardy-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in grub (Ubuntu Hardy):
status: In Progress → Fix Committed
tags: added: verification-needed
Bela Lubkin (filbo) wrote :

Martin & Colin,

I've analyzed a performance problem in `grub-probe` on my Karmic system, in grub2, and found that it was caused by repeated BLKFLSBUF ioctls. Each call takes about .5 second and the overall probe does the ioctl dozens of times. Of course both of those factors will be different on each system, but in general I would expect them to be worse -- this system has 2.5GHz QC CPU, 4GiB memory, only one disk and only 2 OS partitions -- it could be _much_ worse.

In comment #17, Colin Watson wrote:

> grub needs to close and reopen the disk device, or (more easily) use the BLKFLSBUF ioctl.

In my grub2 `strace grub-probe` output, every single open() of a disk device was followed by a close() of that device before any other disk devices were accessed. In fact, it so happened that fd #4 was used for every single read from a whole-disk or partition device, so clearly there was no overlap of open times.

Due to this problem, each OS update which changes anything to do with grub (e.g. grub2 updates, but also kernel updates [which trigger a grub rescan & initrd rebuild]) takes many minutes to complete. Karmic has what appear to be dozens if not hundreds of duplicate bug reports about grub install failing -- many with error codes indicating a user interrupt. I take these to mean that the grub update is taking so long that the user manually aborts it, then raises a bug about the abort!

I have not tested your grub1 PPA (or grub1 at all). I just wanted to alert you to this: _if_, during the process of scanning, it ends up doing a lot of BLKFLSBUF ioctls, it will have this problem. I suspect that the alternative solution of always making sure one has closed whole-disk devices before opening partition devices & vice-versa, won't have this performance problem.

I'm going to email this same comment to the Debian 362028 tracker (assuming it will receive email from a random email address...)

Bela Lubkin (filbo) wrote :

Oh, I should also mention that when I [binary] patched the `grub-probe` binary to skip the BLKFLSBUF ioctls, everything else was _MUCH_ faster. Transcript:

# export TIME="%e real %U user %S sys"
# for what in fs fs_uuid drive device partmap; do
> /usr/bin/time /usr/sbin/grub-probe --device /dev/sda3 --target=$what
> done
ext2
9.19 real 0.00 user 0.15 sys
4f0c55b5-6bc6-40a4-b239-f0918b84d57c
10.89 real 0.00 user 0.21 sys
(hd0,3)
2.05 real 0.00 user 0.02 sys
/dev/sda3
2.18 real 0.00 user 0.01 sys
part_msdos
2.66 real 0.00 user 0.05 sys
# for what in fs fs_uuid drive device partmap; do
> /usr/bin/time /tmp/grub-probe --device /dev/sda3 --target=$what
> done
ext2
0.03 real 0.00 user 0.00 sys
4f0c55b5-6bc6-40a4-b239-f0918b84d57c
0.00 real 0.00 user 0.00 sys
(hd0,3)
0.00 real 0.00 user 0.00 sys
/dev/sda3
0.00 real 0.00 user 0.00 sys
part_msdos
0.00 real 0.00 user 0.00 sys
# mount | grep sda3
/dev/sda3 on / type ext4 (rw,errors=remount-ro)
# diff -up <(objdump -d /usr/sbin/grub-probe) <(objdump -d /tmp/grub-probe)
--- /dev/fd/63 2009-12-11 14:21:05.687644648 -0800
+++ /dev/fd/62 2009-12-11 14:21:05.696688380 -0800
@@ -1,5 +1,5 @@

-/usr/sbin/grub-probe: file format elf64-x86-64
+/tmp/grub-probe: file format elf64-x86-64

 Disassembly of section .init:
@@ -1480,7 +1480,7 @@ Disassembly of section .text:
   402807: 0f 88 5e 02 00 00 js 402a6b <fflush@plt+0x13c3>
   40280d: 31 d2 xor %edx,%edx
   40280f: 89 c7 mov %eax,%edi
- 402811: be 61 12 00 00 mov $0x1261,%esi
+ 402811: be 5e 12 00 00 mov $0x125e,%esi
   402816: 31 c0 xor %eax,%eax
   402818: e8 8b eb ff ff callq 4013a8 <ioctl@plt>
   40281d: 85 db test %ebx,%ebx

# grep x12.9[47] /usr/src/linux-headers-`uname -r`/include/linux/fs.h
#define BLKROGET _IO(0x12,94) /* get read-only status (0 = read_write) */
#define BLKFLSBUF _IO(0x12,97) /* flush buffer cache */
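
As a quick cross-check of the two operands in the objdump diff against these header lines, the request numbers can be recomputed by hand. The sketch below assumes the x86 encoding of _IO(type, nr), with no direction or size bits (some other architectures encode it differently); the helper name io_request is invented for the example.

```python
def io_request(ioc_type, nr):
    """Mimic the Linux _IO(type, nr) macro under the x86 encoding:
    type goes in bits 8-15, nr in bits 0-7, direction and size are zero."""
    return (ioc_type << 8) | nr


BLKROGET = io_request(0x12, 94)   # get read-only status
BLKFLSBUF = io_request(0x12, 97)  # flush buffer cache

print(hex(BLKFLSBUF))  # 0x1261, the operand in the original binary
print(hex(BLKROGET))   # 0x125e, the operand in the patched binary
```

In other words, the binary patch appears to swap the flush request for the read-only status query rather than removing the ioctl call itself.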

Steve Langasek (vorlon) wrote :

I've posted a test case now that documents how to verify this fix for 8.04.4. Can someone please confirm that this fixes the problem?

Bela, I would suggest filing a separate bug report for the performance regression you describe - which I don't think should block us from including this update in 8.04.4, since grub-probe's performance has fairly little impact on the behavior of the running system.

description: updated
description: updated
Dave Morley (davmor2) wrote :

Colin fix failed.

On 20100121.2 the cd still fails to install grub. Syslog attached

Steve Langasek (vorlon) wrote :

Dave,

The above syslog doesn't match the test case. It shows a grub install to the disk instead of the partition ("(hd0)"), and says that /boot is XFS - please test with JFS.

Colin Watson (cjwatson) wrote :

XFS is a special case, known to be broken in hardy for other and more complicated reasons, but fixed in jaunty. See bug 8058.

Colin Watson (cjwatson) wrote :

Dave's previous attempt (comments 34/35) broke because he had / on /dev/sda6 but was attempting to install to the nonexistent (hd0,0) (a.k.a. /dev/sda1). I then had an attempt that failed due to the problem fixed in http://bazaar.launchpad.net/~ubuntu-core-dev/grub-installer/ubuntu/revision/833 (not a regression; but note that you need to explicitly mark a partition as bootable in the partitioner when performing this test). Dave then had an attempt that failed because he had a former installation on the same disk which of course (intentionally) didn't have its boot loader removed from the MBR.

Once we got past all those roadblocks, though, both of us appear to have working installations now.

Dave Morley (davmor2) wrote :

Just a confirmation on the above I have a booting system and all hardware has been recognised and is functioning as expected :)

Colin Watson (cjwatson) on 2010-01-22
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub - 0.97-29ubuntu21.2

---------------
grub (0.97-29ubuntu21.2) hardy-proposed; urgency=low

  [ Colin Watson ]
  * cache_coherency.diff: Flush buffer cache on disk devices after writing
    to any partition devices they contain, otherwise we may read old data
    back (LP: #185878).
 -- Steve Langasek <email address hidden> Wed, 09 Dec 2009 19:13:19 -0800

Changed in grub (Ubuntu Hardy):
status: Fix Committed → Fix Released
Neil Plucknett (ranger-neil) wrote :

Trying to install on my PC - that is already running Win-7. I added a 160 GB drive to take Ubuntu ("Bionic Beaver") and elected to install the software here - /dev/sdd - and the boot partition is the existing 750 GB boot drive for Windows 7 - /dev/sda - as I want a dual boot system for the time being at least.
Installer failed on the last step saying grub could not be installed to the / filesystem.

peter b (b1pete) wrote :

- I have on sda winXP AND on sdb WIN10 and 16.04
- 18.04 sits/installed on sdf

- did you try/change in BIOS your sdd to be the FIRST BOOT DEVICE/HDD?

- my system/described above boots/works flawlessly - grub sees and boots all 4 OS's.

- pls take a look at

https://askubuntu.com/questions/226061/how-to-install-the-boot-repair-tool-in-an-ubuntu-live-disc
https://www.howtogeek.com/114884/how-to-repair-grub2-when-ubuntu-wont-boot/
