unable to mount an ext2 partition by label or uuid, unbootable system

Bug #428318 reported by Lawrence Rust on 2009-09-12
This bug affects 14 people
Affects: util-linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: mount

Yesterday (11-sep-09) I ran "update-manager -d" on a Kubuntu 9.04 system to upgrade to Karmic Koala alpha 5. After rebooting I had 2 problems:

1. Grub was unable to load any kernel. Fixed by removing the /boot prefix from the kernel and initrd lines.
2. During startup I got the warning "fscheck: unable to resolve 'UUID=41...'", fsck died with exit status 8, and I was dumped to the command line.

The uuid reported is that of my bootable ext2 partition on which grub is installed. I then have several systems installed on separate partitions that I select from the grub menu. The boot partition is then mounted by uuid at /boot during startup from an entry in fstab.

tune2fs correctly reports the uuid and label of the boot partition and cat /proc/partitions is ok. However, both "mount -U 41... -t ext2 /boot" and "mount -L boot -t ext2 /boot" fail, saying that they can't find the specified partition. mount does mount the partition when given the device name.

mount --version gives
mount from util-linux-ng 2.16 (with libblkid and selinux support)

This does not appear to be a kernel problem since I have successfully run a custom 2.6.31 linux kernel with kubuntu 9.04. The identical kernel with karmic koala alpha 5 userland exhibits these same problems.

Could you supply the output of "sudo blkid"?

Thanks

Changed in util-linux (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete

This is the output of blkid from karmic alpha 5:

/dev/sda1: SEC_TYPE="msdos" LABEL="WIN98ME" UUID="2B5A-1E00" TYPE="vfat"
/dev/sda2: UUID="D4ACFF30ACFF0BAC" LABEL="WIN2K" TYPE="ntfs"
/dev/sdb3: LABEL="testsys" UUID="2b61ff01-e0df-45fc-86bc-f3db250e1534" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdb5: LABEL="system" UUID="d66dd288-4133-41f9-8ee6-9a0c5ff7805e" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdb6: UUID="ffcc76c6-82ae-4e15-b1eb-8847458b48cd" TYPE="swap"
/dev/sdb7: LABEL="home" UUID="5d557015-be1c-4f8d-8343-875ec53973f7" SEC_TYPE="ext2" TYPE="ext3"

This is the output of blkid on Hardy Heron

/dev/sda1: SEC_TYPE="msdos" LABEL="WIN98ME" UUID="2B5A-1E00" TYPE="vfat"
/dev/sda2: UUID="D4ACFF30ACFF0BAC" LABEL="WIN2K" TYPE="ntfs"
/dev/sdb1: LABEL="boot" UUID="41584c58-0cef-4d78-acd1-9a01ebd87833" TYPE="ext2"
/dev/sdb3: LABEL="testsys" UUID="2b61ff01-e0df-45fc-86bc-f3db250e1534" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdb5: LABEL="system" UUID="d66dd288-4133-41f9-8ee6-9a0c5ff7805e" TYPE="ext3"
/dev/sdb6: TYPE="swap" UUID="ffcc76c6-82ae-4e15-b1eb-8847458b48cd"
/dev/sdb7: LABEL="home" UUID="5d557015-be1c-4f8d-8343-875ec53973f7" SEC_TYPE="ext2" TYPE="ext3"

Looks like libblkid isn't seeing /dev/sdb1, the ext2 grub boot partition. For completeness I've added /proc/partitions and tune2fs output.

PS /dev/sdb was originally partitioned with Win2K. Kubuntu 6.06 was installed into unpartitioned space in Jun 06. The final partition layout was made with gparted-livecd-0.3.4-11 in April 08 when I added the boot and testsys partitions and resized system.

major minor #blocks name

   8 0 30018240 sda
   8 1 1542208 sda1
   8 2 28467180 sda2
   8 16 245117376 sdb
   8 17 1469916 sdb1
   8 18 1 sdb2
   8 19 23005080 sdb3
   8 21 12434278 sdb5
   8 22 1124518 sdb6
   8 23 207077818 sdb7

tune2fs 1.41.9 (22-Aug-2009)
Filesystem volume name: boot
Last mounted on: <not available>
Filesystem UUID: 41584c58-0cef-4d78-acd1-9a01ebd87833
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: filetype sparse_super
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 735840
Block count: 1469916
Reserved block count: 73495
Free blocks: 1357119
Free inodes: 735791
First block: 1
Block size: 1024
Fragment size: 1024
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 4088
Inode blocks per group: 511
Last mount time: Wed Sep 16 17:55:10 2009
Last write time: Wed Sep 16 17:57:40 2009
Mount count: 1
Maximum mount count: 30
Last checked: Wed Sep 16 17:55:03 2009
Check interval: 0 (<none>)
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128

On Wed, 2009-09-16 at 16:46 +0000, Lawrence Rust wrote:

> Looks like libblkid isn't seeing /dev/sdb1, the ext2 grub boot
> partition. For completeness I've added /proc/partitions and tune2fs
> output.
>
Indeed.

Could you run "blkid -p /dev/sdb1" and provide the output?

Scott
--
Scott James Remnant

blkid -p /dev/sdb1
/dev/sdb1: ambivalent result (probably more filesystems on the device)

Looks like a bug in libblkid. I mounted an 8.04 system partition and ran /mnt/sbin/blkid and that also failed to identify sdb1 (it works correctly when running 8.04). Unfortunately/predictably LD_LIBRARY_PATH=/mnt/lib /mnt/sbin/blkid caused a seg fault.

Unfortunately it isn't a blkid bug: your filesystem type genuinely can't be determined because the device contains metadata for more than one filesystem, i.e. it could be validly mounted as either of two filesystem types, one of which is going to be wrong and result in corruption.
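To illustrate the "ambivalent result" behaviour, here is a minimal sketch in Python. It is not libblkid's code: the two detector functions and the `probe` helper are hypothetical, and the FAT rule is a loose stand-in for a signature-less heuristic. The ext2 magic number 0xEF53 matches the tune2fs output above; its position (56 bytes into the superblock, which starts at byte 1024) is the standard ext2 layout.

```python
import struct

EXT_MAGIC_OFFSET = 1024 + 56   # superblock starts at byte 1024, magic is 56 bytes in
EXT_MAGIC = 0xEF53


def looks_like_ext(dev: bytes) -> bool:
    """True if the ext2/ext3 superblock magic number is present."""
    if len(dev) < EXT_MAGIC_OFFSET + 2:
        return False
    (magic,) = struct.unpack_from("<H", dev, EXT_MAGIC_OFFSET)
    return magic == EXT_MAGIC


def looks_like_old_fat(dev: bytes) -> bool:
    """Hypothetical stand-in for a magic-less FAT heuristic:
    an x86 jump opcode in byte 0 and a bytes-per-sector field of 512."""
    if len(dev) < 13:
        return False
    (bytes_per_sector,) = struct.unpack_from("<H", dev, 11)
    return dev[0] in (0xEB, 0xE9) and bytes_per_sector == 512


def probe(dev: bytes) -> str:
    """Return one type, 'unknown', or 'ambivalent' when several detectors match."""
    detectors = (("ext2", looks_like_ext), ("vfat", looks_like_old_fat))
    matches = [name for name, test in detectors if test(dev)]
    if len(matches) > 1:
        return "ambivalent"
    return matches[0] if matches else "unknown"


def demo() -> list:
    clean = bytearray(2048)
    struct.pack_into("<H", clean, EXT_MAGIC_OFFSET, EXT_MAGIC)
    stale = bytearray(clean)
    stale[0:3] = b"\xeb\x3c\x90"            # leftover DOS jump instruction
    struct.pack_into("<H", stale, 11, 512)  # leftover BPB field
    return [probe(bytes(clean)), probe(bytes(stale))]


print(demo())
```

The point of the sketch is the last branch of `probe`: once two detectors both accept the same device, refusing to answer is safer than picking one, at the cost of mount-by-UUID and mount-by-label no longer working.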

We generally recommend in this situation that you reformat the device involved with a tool that takes care to wipe other metadata first (e.g. parted). Since /boot is fairly small, you should be able to copy the data off to /tmp/boot, unmount, reformat, and copy back (adjusting fstab for the new UUID).

If that's not possible, you can work around the issue by changing the UUID in /etc/fstab to a hardcoded device name; however this will not work if your devices frequently change their assigned names (as can happen if you change SCSI or IDE layout, and on some machines even if you boot with USB devices plugged in).

Changed in util-linux (Ubuntu):
status: Incomplete → Won't Fix

I don't agree, this is a bug either in the base version of e2fsprogs used or in the Ubuntu patches. To prove this I built the latest e2fsprogs-1.41.9.tar.gz from SourceForge and ran

sudo /home/lvr/e2fs/sbin/blkid -c /dev/null /dev/sdb1
/dev/sdb1: LABEL="boot" UUID="41584c58-0cef-4d78-acd1-9a01ebd87833" TYPE="ext2"

So the latest official tools have no difficulty in identifying the partition and type. Not only that but the line in /etc/fstab has the FS type fully specified so no excuses for not identifying it.

Re. your suggestion to wipe and reformat - I stated earlier that I used gparted to create this partition so that won't help. _All_ other major Linux distros (suse, fedora, gentoo, mandriva, slackware) recognize this partition - that must say something.

This is a real killer bug for me. If Karmic ships with this bug then I will have to jump ships.

On Thu, 2009-09-17 at 15:36 +0000, Lawrence Rust wrote:

> I don't agree, this is a bug either in the base version of e2fsprogs
> used or in the Ubuntu patches. To prove this I built the latest
> e2fsprogs-1.41.9.tar.gz from SourceForge and ran
>
While you don't agree, I'm afraid this is the way things are.

There are two detectable filesystem metadata on your block device. If
we prioritised the one that's right for you, we would have bugs from
everybody who wanted the other one.

> sudo /home/lvr/e2fs/sbin/blkid -c /dev/null /dev/sdb1
> /dev/sdb1: LABEL="boot" UUID="41584c58-0cef-4d78-acd1-9a01ebd87833" TYPE="ext2"
>
> So the latest official tools have no difficulty in identifying the
> partition and type. Not only that but the line in /etc/fstab has the FS
> type fully specified so no excuses for not identifying it.
>
The official version of blkid comes from util-linux-ng, not from
e2fsprogs.

> Re. your suggestion to wipe and reformat - I stated earlier that I used
> gparted to create this partition so that won't help. _All_ other major
> Linux distros (suse, fedora, gentoo, mandriva, slackware) recognize this
> partition - that must say something.
>
No they don't, all the current releases might - but then the current
release of Ubuntu does too.

All of the development releases have standardised on the new blkid from
util-linux-ng (which behaves as vol_id used to). So you'll find that
ALL other major distros, in their next release, do not report the
contents of your block device.

> This is a real killer bug for me. If Karmic ships with this bug then I
> will have to jump ships.
>
Bon voyage.

You'll find the same problem there too.

Scott
--
Scott James Remnant

Apologies for the delay in replying but I've been away for the last week.

I fully take on board your comments and have reviewed the code in util-linux-ng-2.15, noting that libblkid now uses a completely rewritten method of probing for file systems.

It's clear that this problem stems from the new detection routines in libs/blkid/src/probers/vfat.c. At the end of this file the struct blkid_idinfo is declared with some magic search strings. In particular, 2 patterns are defined to match the single byte jmp/bra opcodes at the start of a DOS boot sector. My /dev/sdb1 partition was once formatted for DOS and so matches these patterns and consequently the function probe_vfat is called. For these simple patterns the function probe_fat_nomagic is called to filter out false positives. In normal circumstances only MSDOS 2 and earlier floppies should be detected by this function. A very important feature of these disks is the boot signature - bytes 0x55, 0xaa at the end of the 1st sector which indicate to the BIOS that the disk is bootable. All MSDOS floppies have these bytes but the function probe_fat_nomagic doesn't test for them.

When I used gparted to reformat this DOS partition to ext2 it overwrote the MSDOS system name (offset 3), the FAT16 magic signature (offset 0x36) and the boot signature at offset 0x1fe but left the BPB intact, which causes the confusion. To fix this I wrote a small patch which I have attached.
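The missing check can be sketched in a few lines of Python. This is an illustration only, not the attached patch and not the libblkid source: the function name `fat_nomagic_candidate`, the helper `make_sector`, and the reduced set of tests are assumptions. The offsets come from the description above (a single-byte jump opcode at byte 0, boot signature 0x55 0xaa at 0x1fe).

```python
JUMP_OPCODES = (0xEB, 0xE9)   # x86 short jmp and near jmp, as found at the start of a DOS boot sector
BOOT_SIG_OFFSET = 0x1FE       # last two bytes of the first 512-byte sector
BOOT_SIG = b"\x55\xaa"


def fat_nomagic_candidate(sector: bytes, require_boot_sig: bool) -> bool:
    """Decide whether a sector without FAT magic strings may still be FAT.

    With require_boot_sig=False this mimics the looser behaviour described
    above; with True it also insists on the 0x55 0xaa boot signature.
    """
    if len(sector) < 512 or sector[0] not in JUMP_OPCODES:
        return False
    if require_boot_sig:
        return sector[BOOT_SIG_OFFSET:BOOT_SIG_OFFSET + 2] == BOOT_SIG
    return True


def make_sector(with_sig: bool) -> bytes:
    """Build a synthetic first sector that starts with a DOS jump instruction."""
    sector = bytearray(512)
    sector[0:3] = b"\xeb\x3c\x90"
    if with_sig:
        sector[BOOT_SIG_OFFSET:BOOT_SIG_OFFSET + 2] = BOOT_SIG
    return bytes(sector)


reformatted = make_sector(with_sig=False)  # signature wiped, jump opcode left behind
dos_floppy = make_sector(with_sig=True)    # genuine bootable DOS sector

print(fat_nomagic_candidate(reformatted, require_boot_sig=False))
print(fat_nomagic_candidate(reformatted, require_boot_sig=True))
print(fat_nomagic_candidate(dos_floppy, require_boot_sig=True))
```

With the signature test enabled, the reformatted sector is rejected while a sector that still carries 0x55 0xaa is accepted, which is the behaviour the patch is aiming for.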

I sincerely hope that you can add this to the 9.10 release in case it affects many other users with older systems.

On Mon, 2009-09-28 at 15:15 +0000, Lawrence Rust wrote:

> I sincerely hope that you can add this to the 9.10 release in case it
> affects many other users with older systems.
>
It's unfortunately too late to get that into 9.10, but if you could
submit the patch upstream and have it reviewed there - we'll be certain
to pull it in for the next release.

Scott
--
Scott James Remnant

The patch has been accepted upstream. Is comment 8 still an acceptable backport?

I am also affected and spent some time with upstart trying to get my system to boot again, which is why I'd like to see this fixed in karmic.

Gabriel de Perthuis (g2p) wrote :

Forgot to mention: the patch in comment 8 does work for me. There is a slight path change (from libs to shlibs) to find vfat.c, then it applies cleanly and runs correctly.

The patch in comment 8 is incomplete. There are 2 further errors in vfat.c that need to be fixed relating to the size of the ms_dummy2 and vs_dummy2 fields. The attached patch is correct for the latest version 2.16.1.

Gabriel de Perthuis (g2p) wrote :

Ah yes, the old code put the magic at 0xfe when the end of the first sector is actually at 0x1fe.
Also, I fixed a small nitpick to compile on karmic: you need spaces around the minus sign between literals.

summary: - unable to mount an ext2 partition by label or uuid
+ unable to mount an ext2 partition by label or uuid, unbootable system
Changed in util-linux (Ubuntu):
status: Won't Fix → Confirmed
Changed in util-linux (Ubuntu):
status: Confirmed → Triaged
Steve Newcomb (srn-coolheads) wrote :

I thought it might be sufficient to wipe the beginning of the disk in order to fix this. It wasn't sufficient. I finally wiped the whole disk. I used

badblocks -w -t 0 -v <device>

to write zeroes all over it. It took a long time.

Thanks, Scott and Lawrence, for the excellent information and explanation. This was a scary mystery for me, esp. since Debian Lenny had no problems with the same disk. Like Lawrence, I was beginning to feel betrayed by Ubuntu. In my case, the conflicting types that caused the ambiguity were "reiserfs" (the current one that would mount neither by-label nor by-uuid) vs. "mdraid" (the previous occupant of the disk).

I am bitten by this bug as well. Rather than change the way blkid detects filesystems, shouldn't we just make available tools to repair the 'multiple filesystems' situation? I can say for sure that the partition in question contains an ext3 and NOT a vfat, especially since it still mounts as ext3 when I mount it manually.

Paweł Hikiert (nsilent22) wrote :

Well, I had a similar situation to the one described in Andreas' report here: https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/464411
but in my case doing dd bs=1 count=512 if=/dev/zero of=/dev/sda1 was sufficient (there were remains of something that looked like an old FAT boot sector)

@coding: shouldn't this read 'bs=512 count=1' in your dd example?

I solved this for me by running

sudo dd bs=512 count=1 if=/dev/zero of=/dev/sda1

which removes the vfat identifier from the partition but leaves ext3 intact. Remember to replace sda1 with your problematic partition.

Perhaps this issue should be mentioned in the Karmic release notes?

Paweł Hikiert (nsilent22) wrote :

@Andreas: It really doesn't matter if it is 512 times 1 byte or 1 time 512 bytes, but yes - doing one write is better ;)
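Why zeroing the first sector is harmless to ext2/ext3 can be shown with a minimal Python sketch. It works on a scratch image file created in a temporary location, never on a real device, and the helper names (`zero_first_sector`, `ext_magic_present`, `demo`) are made up for this example. It relies on the ext superblock starting at byte 1024, beyond the 512 bytes that either dd invocation above overwrites; both invocations write the same 512 zero bytes and differ only in the number of writes.

```python
import os
import struct
import tempfile

SECTOR = 512
EXT_MAGIC_OFFSET = 1024 + 56   # ext2/ext3 superblock begins at byte 1024


def zero_first_sector(path: str) -> bytes:
    """Overwrite bytes 0..511 with zeros and return the old contents as a backup."""
    with open(path, "r+b") as f:
        backup = f.read(SECTOR)
        f.seek(0)
        f.write(b"\x00" * SECTOR)
    return backup


def ext_magic_present(path: str) -> bool:
    """Check that the ext magic number 0xEF53 is still in place."""
    with open(path, "rb") as f:
        f.seek(EXT_MAGIC_OFFSET)
        return f.read(2) == struct.pack("<H", 0xEF53)


def demo() -> tuple:
    # Scratch image: stale DOS jump opcode in sector 0 plus an ext magic number.
    image = bytearray(4096)
    image[0:3] = b"\xeb\x3c\x90"
    struct.pack_into("<H", image, EXT_MAGIC_OFFSET, 0xEF53)
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(image)
        backup = zero_first_sector(path)
        with open(path, "rb") as f:
            first = f.read(SECTOR)
        return backup[:3], first == b"\x00" * SECTOR, ext_magic_present(path)
    finally:
        os.remove(path)


print(demo())
```

Keeping the returned backup (or a dd copy of the sector, as in the dump further down) makes it possible to restore the old bytes if the wrong partition was chosen.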

Worryingly, I've just encountered this whilst doing a dist-upgrade from 9.04 to 9.10. The 9.04 installation was a complete re-partition and install on an Acer Aspire one, so it's possible that the remains of the toy Linux distro it came with caused the confusion; also possible is that the problem is that the root filesystem is ext2.

Whatever, an Acer Aspire one installed with 9.04 over the toy Linux and then upgraded to 9.10 is probably a pretty common trajectory.

I've recovered the system by specifying the root filesystem as /dev/sda1 and I'm not proposing to do anything further. If I can provide any more diagnostics, please shout.

Simon.

jablko (ms419) wrote :

same problem here,

not sure if i should,

1) run, sudo dd bs=512 count=1 if=/dev/zero of=/dev/sda1
2) temporarily edit /etc/fstab and change UUID=10fe7e91-37ac-4cdd-ac90-96b68a880d33 -> /dev/sda1, run update-grub, and wait for updated util-linux package

@jablko: I would backup your partition (sda1?) ASAP and then try 1, as this will hopefully fix the problem at the root. 2 is only a workaround (if it works at all).

This issue is starting to appear in the German press:

http://www.golem.de/0911/70976.html

I think there should be better communication about it, at least by mentioning it in the release notes.

jablko (ms419) wrote :

is this bug a duplicate of bug #426027?

jablko (ms419) wrote :

sorry, instead of /etc/fstab, i edited /boot/grub/menu.lst,

@@ -63,7 +63,7 @@
 ## e.g. kopt=root=/dev/hda1 ro
 ## kopt_2_6_8=root=/dev/hdc1 ro
 ## kopt_2_6_8_2_686=root=/dev/hdc2 ro
-# kopt=root=UUID=0f9ee8ea-922f-47a4-80da-a50c95a98e5c ro
+# kopt=root=/dev/sda1 ro

 ## default grub root device
 ## e.g. groot=(hd0,0)

- and ran update-grub

this workaround got this system booting again

Chris Knight (cknight1234) wrote :

I'll add a "me too" to this list. Recently upgraded from 9.04 and I've got an unidentifiable /dev/sda1 boot partition of type ext2. I can manually mount this in a recovery shell, but what an awful experience on the upgrade as it took several days of searching before I figured out how to correct the corrupted grub menu.lst and find a workaround to the unmounted partition (I'm no expert). I'll try the 'dd' command later.

Another "me too" here.. After upgrading Jaunty -> Karmic I got into this trouble with root partition /dev/sdc1. Using dd to zero out the boot sector of /dev/sdc1 fixed it for me.

Below is a dump of my old boot sector:

# dd if=/dev/sdc1 of=sdc1_bootsector bs=512 count=1
# hexdump -C sdc1_bootsector

00000000 eb 3c 90 00 53 57 49 4e 34 2e 31 00 02 08 01 00 |.<..SWIN4.1.....|
00000010 02 00 02 7d 3e f8 06 00 3f 00 ff 00 3f 00 00 00 |...}>...?...?...|
00000020 00 00 00 00 80 00 29 fb 14 70 40 4e 4f 20 4e 41 |......)..p@NO NA|
00000030 4d 45 20 20 20 20 00 41 54 31 32 20 20 20 33 c9 |ME    .AT12   3.|
00000040 8e d1 bc fc 7b 16 07 bd 78 00 c5 76 00 1e 56 16 |....{...x..v..V.|
00000050 55 bf 22 05 89 7e 00 89 4e 02 b1 0b fc f3 a4 06 |U."..~..N.......|
00000060 1f bd 00 7c c6 45 fe 0f 38 4e 24 7d 20 8b c1 99 |...|.E..8N$} ...|
00000070 e8 7e 01 83 eb 3a 66 a1 1c 7c 66 3b 07 8a 57 fc |.~...:f..|f;..W.|
00000080 75 06 80 ca 02 88 56 02 80 c3 10 73 ed 33 c9 fe |u.....V....s.3..|
00000090 06 d8 7d 8a 46 10 98 f7 66 16 03 46 1c 13 56 1e |..}.F...f..F..V.|
000000a0 03 46 0e 13 d1 8b 76 11 60 89 46 fc 89 56 fe b8 |.F....v.`.F..V..|
000000b0 20 00 f7 e6 8b 5e 0b 03 c3 48 f7 f3 01 46 fc 11 | ....^...H...F..|
000000c0 4e fe 61 bf 00 07 e8 28 01 72 3e 38 2d 74 17 60 |N.a....(.r>8-t.`|
000000d0 b1 0b be d8 7d f3 a6 61 74 3d 4e 74 09 83 c7 20 |....}..at=Nt... |
000000e0 3b fb 72 e7 eb dd fe 0e d8 7d 7b a7 be 7f 7d ac |;.r......}{...}.|
000000f0 98 03 f0 ac 98 40 74 0c 48 74 13 b4 0e bb 07 00 |.....@t.Ht......|
00000100 cd 10 eb ef be 82 7d eb e6 be 80 7d eb e1 cd 16 |......}....}....|
00000110 5e 1f 66 8f 04 cd 19 be 81 7d 8b 7d 1a 8d 45 fe |^.f......}.}..E.|
00000120 8a 4e 0d f7 e1 03 46 fc 13 56 fe b1 04 e8 c2 00 |.N....F..V......|
00000130 72 d7 ea 00 02 70 00 52 50 06 53 6a 01 6a 10 91 |r....p.RP.Sj.j..|
00000140 8b 46 18 a2 26 05 96 92 33 d2 f7 f6 91 f7 f6 42 |.F..&...3......B|
00000150 87 ca f7 76 1a 8a f2 8a e8 c0 cc 02 0a cc b8 01 |...v............|
00000160 02 80 7e 02 0e 75 04 b4 42 8b f4 8a 56 24 cd 13 |..~..u..B...V$..|
00000170 61 61 72 0a 40 75 01 42 03 5e 0b 49 75 77 c3 03 |aar.@u.B.^.Iuw..|
00000180 18 01 27 0d 0a 49 6e 76 61 6c 69 64 20 73 79 73 |..'..Invalid sys|
00000190 74 65 6d 20 64 69 73 6b ff 0d 0a 44 69 73 6b 20 |tem disk...Disk |
000001a0 49 2f 4f 20 65 72 72 6f 72 ff 0d 0a 52 65 70 6c |I/O error...Repl|
000001b0 61 63 65 20 74 68 65 20 64 69 73 6b 2c 20 61 6e |ace the disk, an|
000001c0 64 20 74 68 65 6e 20 70 72 65 73 73 20 61 6e 79 |d then press any|
000001d0 20 6b 65 79 0d 0a 00 00 49 4f 20 20 20 20 20 20 | key....IO      |
000001e0 53 59 53 4d 53 44 4f 53 20 20 20 53 59 53 7f 01 |SYSMSDOS   SYS..|
000001f0 00 41 bb 00 07 60 66 6a 00 e9 3b ff 00 00 00 00 |.A...`fj..;.....|

Looks like the remains of an old Windows version (MSWIN4.1), but no signatures at 0x03, 0x36 or 0x1fe.
Seems to be the same situation as Lawrence in #8 ..
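The dump can be checked mechanically. The Python sketch below decodes the first 32 bytes and the last 16 bytes copied from the hexdump above, using the standard DOS BIOS Parameter Block layout starting at offset 11. The field names in `FIELDS` and the helper `parse_bpb` are chosen for this example only.

```python
import struct

# Offsets 0x00-0x1f of the dump above.
HEAD = bytes.fromhex(
    "eb 3c 90 00 53 57 49 4e 34 2e 31 00 02 08 01 00 "
    "02 00 02 7d 3e f8 06 00 3f 00 ff 00 3f 00 00 00"
)
# Offsets 0x1f0-0x1ff of the dump above.
TAIL = bytes.fromhex("00 41 bb 00 07 60 66 6a 00 e9 3b ff 00 00 00 00")

FIELDS = (
    "bytes_per_sector", "sectors_per_cluster", "reserved_sectors",
    "num_fats", "root_entries", "total_sectors_16", "media",
    "fat_size_16", "sectors_per_track", "heads", "hidden_sectors",
)


def parse_bpb(head: bytes) -> dict:
    """Decode the little-endian BPB fields that follow the 8-byte OEM name."""
    values = struct.unpack_from("<HBHBHHBHHHI", head, 11)
    return dict(zip(FIELDS, values))


bpb = parse_bpb(HEAD)
oem_name = HEAD[3:11]                      # first byte was overwritten with 0x00
boot_sig_present = TAIL[14:16] == b"\x55\xaa"

print("jump opcode:", hex(HEAD[0]))
print("OEM name bytes:", oem_name)
print("BPB:", bpb)
print("boot signature present:", boot_sig_present)
```

The BPB still decodes to plausible values (512-byte sectors, 2 FATs, media byte 0xf8) even though the OEM name and the boot signature have been wiped, which is exactly the combination that lets a signature-less FAT heuristic match.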

Francesco Potortì (pot) wrote :

This should definitely be high priority. After upgrading from 9.04 to 9.10 I only managed to get my system working after several hours, and only because I am knowledgeable about Linux. A normal user would have had their system unbootable without any hint of what is going wrong.

Also, the fix is trivial and should definitely be backported: just use the newest util-linux package.

Tom Robinson (zxrobinson) wrote :

Please consider fixing this in 9.10. I upgraded from 9.04 to 9.10 in preparation for the release of 10.04. I had no problem booting for several months. Then, all of a sudden, after updating and rebooting, I couldn't boot. Based on the error message, I created the symbolic link for my Ubuntu 9.10 partition in /dev/disk/by-uuid, rebooted and was able to use the system for another period of time. This time, I couldn't create the symbolic link and then boot (not sure why it worked the first time and after). I believe this partition was created (with gparted) as FAT32 and then changed to ext3 via gparted.

The tools in the util-linux-ng package seem inconsistent. When I boot with an Ubuntu LiveCD (9.10 32-bit Desktop), I get the following:

ubuntu@ubuntu:~$ sudo fsck -t ext3 /dev/sda8
fsck from util-linux-ng 2.16
e2fsck 1.41.9 (22-Aug-2009)
/dev/sda8: clean, 314232/3141600 files, 6959167/12611017 blocks (check after next mount)

ubuntu@ubuntu:~$ sudo blkid -p /dev/sda8
/dev/sda8: ambivalent result (probably more filesystems on the device)

ubuntu@ubuntu:~$ sudo sfdisk -l
....
/dev/sda8 50173+ 56452 6280- 50444068+ 83 Linux
....

If people think it's sufficient to let users just fix this themselves, then they are relegating Ubuntu Linux to the role of an O/S for developers and maybe server farm operators, but not as a desktop for home users or small businesses. To be usable by non-technical people, the partitioning tools need to convert partitions/create partitions that are acceptable to the other tools in the distribution. Replacement tools need to work with the limitations of the rest of the distribution (or there needs to be a reasonable non-technical way to fix the problem).

Dave Gilbert (ubuntu-treblig) wrote :

Looking at Gabriel's 2.16.1 patch, it looks like that is in the 2.17.2 code in Lucid and Maverick;
so the question is: has this problem gone away in Lucid (or Maverick)?

Or is there something else needed as well?

codewarrior (lvr) wrote :

My patch was accepted upstream into util-linux-ng-2.17. Lucid ships with 2.17.2 and I can confirm that this problem doesn't occur with Lucid.

Dave Gilbert (ubuntu-treblig) wrote :

Thanks for fixing it - marked fixed-released.

Changed in util-linux (Ubuntu):
status: Triaged → Fix Released