blkid identifies ext4 rootfs as silicon_medley_raid_member, breaks boot

Bug #1011007 reported by Csaba
This bug affects 4 people
Affects              Status      Importance  Assigned to  Milestone
Fedora               Invalid     Critical
util-linux (Ubuntu)  Incomplete  Undecided   Unassigned

Bug Description

Ubuntu does not boot: the /dev/disk/by-uuid entry for the root filesystem is not found, and blkid reports the partition as silicon_medley_raid_member instead of ext4.
I tried the steps from this Red Hat bug:
https://bugzilla.redhat.com/show_bug.cgi?id=730502

Tip: find out what is actually on the disk at the offset blkid probes. In the debug trace below, the 'buffer read' offset under silicon_medley_raid_member (497968741888) is where the signature was matched:

ubuntu@ubuntu:/media$ sudo env BLKID_DEBUG=0xffff blkid -p /dev/sda1
libblkid: debug mask set to 0xffff.
ready for low-probing, offset=0, size=497968742400
chain fullprobe superblocks: DISABLED
chain fullprobe topology: DISABLED
chain fullprobe partitions: ENABLED
--> starting probing loop [PARTS idx=-1]
 buffer read: off=0 len=1024
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
gpt: ---> call probefunc()
 reuse buffer: off=0 len=1024
gpt: <--- (rc = 1)
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
 buffer read: off=28672 len=1024
 reuse buffer: off=0 len=1024
 reuse buffer: off=0 len=1024
<-- leaving probing loop (failed) [PARTS idx=9]
chain safeprobe superblocks ENABLED
--> starting probing loop [SUBLKS idx=-1]
[0] linux_raid_member:
 call probefunc()
 buffer read: off=497968676864 len=64
 buffer read: off=497968734208 len=64
 reuse buffer: off=0 len=1024
 buffer read: off=4096 len=64
[1] ddf_raid_member:
 call probefunc()
 buffer read: off=497968741888 len=40
 buffer read: off=497968610816 len=40
[2] isw_raid_member:
 call probefunc()
 buffer read: off=497968741376 len=48
[3] lsi_mega_raid_member:
 call probefunc()
 reuse buffer: off=497968741888 len=40
[4] via_raid_member:
 call probefunc()
 buffer read: off=497968741888 len=51
[5] silicon_medley_raid_member:
 call probefunc()
 buffer read: off=497968741888 len=292
assigning VERSION [superblocks]
assigning TYPE [superblocks]
assigning USAGE [superblocks]
<-- leaving probing loop (type=silicon_medley_raid_member) [SUBLKS idx=5]
chain safeprobe topology DISABLED
chain safeprobe partitions DISABLED
returning VERSION value
/dev/sda1: VERSION="25697.25960" returning TYPE value
TYPE="silicon_medley_raid_member" returning USAGE value
USAGE="raid"
reseting probing buffers
buffers summary: 42949675671 bytes by 617802423873589236 read() call(s)

ubuntu@ubuntu:/media$ sudo dd if=/dev/sda1 bs=1 skip=497968741887 count=200 | od -tx1z
0000000 13 03 bf 1b 67 42 9a 00 00 00 00 2f 75 73 72 2f >....gB...../usr/<
0000020 73 72 63 2f 6c 69 6e 75 78 2d 68 65 61 64 65 72 >src/linux-header<
0000040 73 2d 32 2e 36 2e 33 38 2d 38 2d 67 65 6e 65 72 >s-2.6.38-8-gener<
0000060 69 63 2f 69 6e 63 6c 75 64 65 2f 63 6f 6e 66 69 >ic/include/confi<
0000100 67 2f 63 79 63 6c 61 64 65 73 00 00 73 79 6e 63 >g/cyclades..sync<
0000120 2e 68 00 02 00 00 00 00 4e 13 03 bf 1b 67 42 9a >.h......N....gB.<
0000140 00 00 00 00 2f 75 73 72 2f 73 72 63 2f 6c 69 6e >..../usr/src/lin<
0000160 75 78 2d 68 65 61 64 65 72 73 2d 32 2e 36 2e 33 >ux-headers-2.6.3<
0000200 38 2d 38 2d 67 65 6e 65 72 69 63 2f 69 6e 63 6c >8-8-generic/incl<
0000220 75 64 65 2f 63 6f 6e 66 69 67 2f 63 79 63 6c 6f >ude/config/cyclo<
0000240 6d 78 00 00 78 32 35 2e 68 00 02 00 00 00 00 4e >mx..x25.h......N<
0000260 13 03 bf 1b 67 42 9a 00 00 00 00 2f 75 73 72 2f >....gB...../usr/<
0000300 73 72 63 2f 6c 69 6e 75 >src/linu<
0000310
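The strings in the dump are pathnames from a kernel headers file list, so the bytes at that offset apparently belong to an ordinary file. One way to confirm which file owns a given byte offset on an ext4 filesystem is debugfs from e2fsprogs; a sketch, run against the unmounted filesystem, assuming a 4096-byte block size (check with dumpe2fs) and using the probe offset from the trace above:

# Block size of the filesystem (commonly 4096 bytes)
sudo dumpe2fs -h /dev/sda1 | grep 'Block size'
# Filesystem block containing the offending offset: 497968741888 / 4096 = 121574399
# Map that block to an inode, then the inode to a path (read-only debugfs requests)
sudo debugfs -R 'icheck 121574399' /dev/sda1
sudo debugfs -R 'ncheck <inode-from-icheck>' /dev/sda1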

In my case the data at that offset belonged to the file linux-headers-2.6.38-8-generic.list, so I simply overwrote that region with zeros:
ubuntu@ubuntu:/media$ sudo dd if=/dev/zero bs=1 count=292 seek=497968741888 of=/dev/sda1
292+0 records in
292+0 records out
292 bytes (292 B) copied, 0.000560058 s, 521 kB/s

ubuntu@ubuntu:/media$ sudo dd if=/dev/sda1 bs=1 skip=497968741887 count=200 | od -tx1z
0000000 13 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
0000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >................<
*
0000300 00 00 00 00 00 00 00 00 >........<
0000310
200+0 records in
200+0 records out
200 bytes (200 B) copied, 0.00189193 s, 106 kB/s
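For reference, a less raw alternative (and the route taken in the upstream Red Hat bug quoted below) is to let wipefs locate and erase only the offending signature instead of hand-computing dd offsets; a sketch, where the offset to pass is whatever the listing itself reports for the silicon_medley_raid_member entry:

# Read-only: list every signature wipefs recognizes on the partition
sudo wipefs /dev/sda1
# Erase just the silicon_medley_raid_member entry at the offset shown in that listing
sudo wipefs -o <offset-from-listing> /dev/sda1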

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: udisks 1.0.1-1ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-33.70-generic 2.6.32.41+drm33.18
Uname: Linux 2.6.32-33-generic i686
Architecture: i386
Date: Sat Jun 9 21:35:05 2012
LiveMediaBuild: Ubuntu 10.04.3 LTS "Lucid Lynx" - Release i386 (20110720.1)
MachineType: Dell Inc. Inspiron 1525
ProcCmdLine: BOOT_IMAGE=/casper/vmlinuz file=/cdrom/preseed/hostname.seed boot=casper initrd=/casper/initrd.lz quiet splash -- maybe-ubiquity
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: udisks
Symptom: storage
Title: Internal hard disk partition cannot be mounted manually
dmi.bios.date: 06/27/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A13
dmi.board.name: 0U990C
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA13:bd06/27/2008:svnDellInc.:pnInspiron1525:pvr:rvnDellInc.:rn0U990C:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Inspiron 1525
dmi.sys.vendor: Dell Inc.

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

I've selected lvm2 as the component because it seemed the most closely related one, but I don't really know if it is the right choice.

Description of problem:
This morning my system worked fine.
After a reboot, without installing anything new, the system now tells me that no root can be found, and where there should be an ext4 file system it now reports a silicon_medley_raid_member.

Executed a S.M.A.R.T. short test from the BIOS; everything is OK.
Using the F15 DVD I opened a rescue shell and ran vgck and pvck; everything seemed OK.
Activated the LVM volume and ran fsck.ext4 -v on vg_root: no errors found.
Ran fsck.ext4 -f -v on vg_root: no errors found.

Mounting vg_root with mount -t ext4 works fine, but if the type is not specified, the detected filesystem type is silicon_medley_raid_member.

I'm backing up all relevant data on my system, but I would like to understand what happened and, if possible, restore the system without reinstalling everything.

How reproducible:
Always reproducible

Any idea what happened?

Revision history for this message
In the upstream Red Hat bug, Milan (milan-redhat-bugs) wrote :

It seems the system now sees a stale fake-RAID signature on the disk; I guess an mdadm update started to recognize some old SiL RAID signature.

If you do not use RAID, try adding rd_NO_MD to the kernel boot parameters (or the equivalent; see man dracut / dracut.kernel), as sketched below.
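For completeness, a sketch of where such a parameter goes; this assumes a GRUB2-style /etc/default/grub (on GRUB legacy, append it to the kernel line in grub.conf instead), so adjust for your release:

# /etc/default/grub: append the dracut parameters to the existing kernel command line
GRUB_CMDLINE_LINUX="<existing parameters> rd_NO_MD rd_NO_DM"
# Regenerate the bootloader config afterwards:
#   Fedora:  sudo grub2-mkconfig -o /boot/grub2/grub.cfg
#   Ubuntu:  sudo update-grub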

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

rd_NO_MD and rd_NO_DM are already specified in the grub kernel boot parameters.
The notebook has no BIOS RAID and only one hard disk; I have never used RAID on this system.

It seems something like this: http://ubuntuforums.org/showthread.php?t=1711929

The only difference is that I'm using ext4 instead of ext3, so ext4 works fine if specified explicitly. Maybe ext4 and silicon_medley_raid_member have similar signatures and some bits are somehow corrupted?

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

BLKID_DEBUG=0xffff blkid on the LVM volume says:

libblkid: debug mask set to 0xffff.
creating blkid cache (using default cache)
need to revalidate lv_root (cache time 2147483648.0, stat time 1313584184.138354,
 time since last check 3461068058)
ready for low-probing, offset=0, size=20971520000
found entire diskname for devno 0xfd01 as dm-1
whole-disk: YES, regfile: NO
zeroize wiper
chain safeprobe superblocks ENABLED
--> starting probing loop [SUBLKS idx=-1]
[0] linux_raid_member:
 call probefunc()
 buffer read: off=20971454464 len=64
 buffer read: off=20971511808 len=256
 buffer read: off=0 len=256
 buffer read: off=4096 len=256
[1] ddf_raid_member:
 call probefunc()
 buffer read: off=20971519488 len=512
 buffer read: off=20971388416 len=512
[2] isw_raid_member:
 call probefunc()
 buffer read: off=20971518976 len=48
[3] lsi_mega_raid_member:
 call probefunc()
 reuse buffer: off=20971519488 len=512
[4] via_raid_member:
 call probefunc()
 reuse buffer: off=20971519488 len=512
[5] silicon_medley_raid_member:
 call probefunc()
 reuse buffer: off=20971519488 len=512
assigning TYPE [superblocks]
<-- leaving probing loop (type=silicon_medley_raid_member) [SUBLKS idx=5]
chain safeprobe topology DISABLED
chain safeprobe partitions DISABLED
zeroize wiper
returning TYPE value
    creating new cache tag head TYPE
lv_root: devno 0xfd01, type silicon_medley_raid_member
reseting probing buffers
buffers summary: 1904 bytes by 7 read() call(s)
lv_root: TYPE="silicon_medley_raid_member"
writing cache file /etc/blkid/blkid.tab (really /etc/blkid/blkid.tab)
freeing cache struct
  freeing dev lv_root (silicon_medley_raid_member)
  dev: name = lv_root
  dev: DEVNO="0xfd01"
  dev: TIME="1313584410.543761"
  dev: PRI="0"
  dev: flags = 0x00000001
    tag: TYPE="silicon_medley_raid_member"

    freeing tag TYPE=silicon_medley_raid_member
    tag: TYPE="silicon_medley_raid_member"
    freeing tag TYPE=(NULL)
    tag: TYPE="(null)"

Revision history for this message
In the upstream Red Hat bug, Milan (milan-redhat-bugs) wrote :

I do not think blkid detects it wrongly; perhaps there are simply both signatures on the device.

Perhaps "dmraid -E" or wipefs can wipe the fake raid signature?

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

wipefs on lv_root shows:

offset type
----------------------------------------------------------------
0x438 ext4 [filesystem]
                     UUID: 0fb3d4f8-fea5-4d22-ae67-91e897d67c14

0x4e1fffe60 silicon_medley_raid_member [raid]

Revision history for this message
In the upstream Red Hat bug, Heinz (heinz-redhat-bugs) wrote :

(In reply to comment #5)
> wipefs on lv_root shows:
>
>
> offset type
> ----------------------------------------------------------------
> 0x438 ext4 [filesystem]
> UUID: 0fb3d4f8-fea5-4d22-ae67-91e897d67c14
>
> 0x4e1fffe60 silicon_medley_raid_member [raid]

Did you try "dmraid -E" yet, like Milan proposed?

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

dmraid -E -r says there is no RAID disk with the name lv_root.
The offset for the silicon_medley_raid_member signature seems to be quite high, near the end of the volume.
Is it safe to call wipefs -o 0x4e1fffe60 on lv_root?
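One low-risk way to check first is wipefs's dry-run mode; a sketch, assuming the LV's device node is /dev/vg_root/lv_root (the exact path is not given in the thread) and that the installed wipefs supports --no-act:

# Dry run: resolve the offset and report what would be erased, without writing anything
sudo wipefs --no-act -o 0x4e1fffe60 /dev/vg_root/lv_root
# If only the silicon_medley_raid_member entry is reported, erase it for real
sudo wipefs -o 0x4e1fffe60 /dev/vg_root/lv_root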

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

Called wipefs -o 0x4e1fffe60 on lv_root; the mount command now correctly detects the ext4 filesystem and the system boots again.
Any idea what caused this issue, or any hint on where to look for the cause?

Revision history for this message
In the upstream Red Hat bug, Milan (milan-redhat-bugs) wrote :

That disk was probably part of a fake-RAID array before, and the signature was still there.
After an mdadm or blkid update it started to prefer the RAID signature over the ext4 one.

I guess the issue can be closed now, right?

Revision history for this message
In the upstream Red Hat bug, Sandro (sandro-redhat-bugs) wrote :

(In reply to comment #9)
> That disk was probably part of fake raid array before and signature was still
> there.

I never used RAID on that system. If the signature was there, it was just garbage left over from a previous partition that was not zeroed during a partition resize or format.

> After mdadm or blkid update it starts to prefer raid signature before ext4 one.
>
> I guess the issue can be closed now, right?

Well, the system now seems to be OK. I'm just curious about the dynamics of the incident, but I can set the status to CLOSED NOTABUG. I've chosen NOTABUG because I can't find any evidence that a specific package caused the issue.

Revision history for this message
Csaba (csab16) wrote :
affects: ubuntu → udisks (Ubuntu)
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in udisks (Ubuntu):
status: New → Confirmed
Malte S. Stretz (mss)
affects: udisks (Ubuntu) → util-linux (Ubuntu)
summary: - Ubuntu does not boot, /disk/by-uuid not mounted, blkid
- silicon_medley_raid_member
+ blkid identifies ext4 rootfs as silicon_medley_raid_member, breaks boot
Revision history for this message
Malte S. Stretz (mss) wrote :

Same problem here: after a reboot, Ubuntu 12.04 didn't mount the rootfs properly anymore, dropping me into busybox with a misleading error message (cf. attached screenshot). It looks like libblkid checks for the magic number 0x2F000000 at byte 96 of the last (partial) 512-byte block of the device; see the offset sketch after the dump below. Unfortunately, whatever file was written there was full of exactly this magic number:

root@Otherland:~# blkid -p /dev/mapper/hd-sys
/dev/mapper/hd-sys: VERSION="12032.0" TYPE="silicon_medley_raid_member" USAGE="raid"
root@Otherland:~# env BLKID_DEBUG=0xffff blkid -p /dev/mapper/hd-sys
libblkid: debug mask set to 0xffff.
allocate a new probe 0x1f22030
ready for low-probing, offset=0, size=14147387392
found entire diskname for devno 0xfc0a
whole-disk: YES, regfile: NO
zeroize wiper
chain safeprobe superblocks ENABLED
--> starting probing loop [SUBLKS idx=-1]
[0] linux_raid_member:
        call probefunc()
        buffer read: off=14147321856 len=64 pr=0x1f22030
        buffer read: off=14147379200 len=256 pr=0x1f22030
        buffer read: off=0 len=256 pr=0x1f22030
        buffer read: off=4096 len=256 pr=0x1f22030
[1] ddf_raid_member:
        call probefunc()
        buffer read: off=14147386880 len=512 pr=0x1f22030
        buffer read: off=14147255808 len=512 pr=0x1f22030
[2] isw_raid_member:
        call probefunc()
        buffer read: off=14147386368 len=48 pr=0x1f22030
[3] lsi_mega_raid_member:
        call probefunc()
        reuse buffer: off=14147386880 len=512 pr=0x1f22030
[4] via_raid_member:
        call probefunc()
        reuse buffer: off=14147386880 len=512 pr=0x1f22030
[5] silicon_medley_raid_member:
        call probefunc()
        reuse buffer: off=14147386880 len=512 pr=0x1f22030
assigning VERSION [superblocks]
assigning TYPE [superblocks]
assigning USAGE [superblocks]
<-- leaving probing loop (type=silicon_medley_raid_member) [SUBLKS idx=5]
chain safeprobe topology DISABLED
chain safeprobe partitions ENABLED
zeroize wiper
returning TYPE value
returning VERSION value
/dev/mapper/hd-sys: VERSION="12032.0" returning TYPE value
TYPE="silicon_medley_raid_member" returning USAGE value
USAGE="raid"
reseting probing buffers pr=0x1f22030
buffers summary: 1904 bytes by 7 read() call(s)
free probe 0x1f22030
root@Otherland:~# wipefs /dev/mapper/hd-sys
offset type
----------------------------------------------------------------
0x438 ext4 [filesystem]
                     LABEL: root
                     UUID: 55c1f04f-4e15-48e4-9f21-8f71657d0682

0x34b3ffe60 silicon_medley_raid_member [raid]

root@Otherland:~# blkid -h | grep util-linux
blkid from util-linux 2.20.1 (libblkid 2.20.0, 19-Oct-2011)
root@Otherland:~# dd if=/dev/mapper/hd-sys bs=1 skip=14147386880 count=512 | od -v -tx1z
512+0 records in
512+0 records out
0000000 00 00 00 2f 00 00 00 2f 00 00 00 2f 00 00 00 2f >.../.../.../.../<
0000020 00 00 00 2f 00 00 00 2f 00 00 00 2f 00 00 00 2f >.../.../.../.../<
0000040 00 00 00 2f 00 00 00 2f 00 00 00 2f 00 00 00 2f >.../.../.../.../<
0000060 00 00 00 2f 00 00 00 2f 00 00 00 2f 00 00 00 2f >.../.../.../.../<
0000100 00 00 00 2f 00 00 00 2f 00 00 00 2f 00 00 00 2f >.../.../.../.../<
0000120 00 00 00 2f...
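A rough way to reproduce this arithmetic for any device, assuming the behaviour seen in the traces in this report (probe at the last 512-byte-aligned block, magic at byte 96 of it, device size a multiple of 512); blockdev is part of util-linux:

# Device size in bytes
SIZE=$(sudo blockdev --getsize64 /dev/mapper/hd-sys)
# Start of the last 512-byte block and the byte where the magic sits
BLOCK=$(( SIZE / 512 * 512 - 512 ))
MAGIC=$(( BLOCK + 96 ))
echo "probe block at $BLOCK, magic checked around byte $MAGIC"
# Dump that block to see which file's data collided with the RAID signature
sudo dd if=/dev/mapper/hd-sys bs=1 skip="$BLOCK" count=512 2>/dev/null | od -v -tx1z | head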


Revision history for this message
Malte S. Stretz (mss) wrote :

Since this was an LVM device that was misidentified, I fixed it without fiddling with the file system by extending the device slightly (the probe looks at the very end of the device, so growing the LV moves that end past the stale bytes):

root@Otherland:~# lvextend -L +512B /dev/mapper/hd-sys
  Rounding up size to full physical extent 4.00 MiB
  Extending logical volume sys to 13.18 GiB
  Logical volume sys successfully resized
root@Otherland:~# blkid -p /dev/mapper/hd-sys
/dev/mapper/hd-sys: LABEL="root" UUID="55c1f04f-4e15-48e4-9f21-8f71657d0682" VERSION="1.0" TYPE="ext4" USAGE="filesystem"

Revision history for this message
JT (spikyjt) wrote :

The upstream Red Hat bug is closed as NOTABUG, but the situation was not discussed there properly: it was simply assumed that the partition had been part of a software RAID array at some point. That was not the case for me (brand-new disk when installed, only ever in one layout). It was an LVM partition, though, so resizing it slightly fixed the problem.

This bug report also says "Affects Fedora", but it definitely affects Ubuntu too.

Changed in fedora:
importance: Unknown → Critical
status: Unknown → Invalid
Revision history for this message
Phillip Susi (psusi) wrote :

Is anyone still having this issue and able to help troubleshoot it?

Changed in util-linux (Ubuntu):
status: Confirmed → Incomplete