HPA ( Host Protected Area ) interferes with dmraid

Bug #219393 reported by Thomas on 2008-04-18
This bug affects 5 people
Affects: dmraid (Ubuntu)
Importance: Low
Assigned to: Phillip Susi

Bug Description

Binary package hint: dmraid

I tried to install the Hardy release candidate on a fake RAID, following the instructions in FakeRaidHowto. I have a Gigabyte EP35-DSR3 with an ICH9R controller.
The RAID array is made of two Seagate 500 GB Barracuda 7200.11 drives.
When I do dmraid -ay, I obtain:
ubuntu@ubuntu:~$ sudo dmraid -ay -vvv -d
WARN: locking /var/lock/dmraid/.lock
NOTICE: /dev/sdb: asr discovering
NOTICE: /dev/sdb: ddf1 discovering
NOTICE: /dev/sdb: hpt37x discovering
NOTICE: /dev/sdb: hpt45x discovering
NOTICE: /dev/sdb: isw discovering
NOTICE: /dev/sdb: isw metadata discovered
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi discovering
NOTICE: /dev/sdb: nvidia discovering
NOTICE: /dev/sdb: pdc discovering
NOTICE: /dev/sdb: sil discovering
NOTICE: /dev/sdb: via discovering
NOTICE: /dev/sda: asr discovering
NOTICE: /dev/sda: ddf1 discovering
NOTICE: /dev/sda: hpt37x discovering
NOTICE: /dev/sda: hpt45x discovering
NOTICE: /dev/sda: isw discovering
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi discovering
NOTICE: /dev/sda: nvidia discovering
NOTICE: /dev/sda: pdc discovering
NOTICE: /dev/sda: sil discovering
NOTICE: /dev/sda: via discovering
DEBUG: _find_set: searching isw_bdbafedcfd
DEBUG: _find_set: not found isw_bdbafedcfd
DEBUG: _find_set: searching isw_bdbafedcfd_raid0
DEBUG: _find_set: searching isw_bdbafedcfd_raid0
DEBUG: _find_set: not found isw_bdbafedcfd_raid0
DEBUG: _find_set: not found isw_bdbafedcfd_raid0
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
NOTICE: added /dev/sdb to RAID set "isw_bdbafedcfd"
DEBUG: checking isw device "/dev/sdb"
ERROR: isw device for volume "raid0" broken on /dev/sdb in RAID set "isw_bdbafedcfd_raid0"
ERROR: isw: wrong # of devices in RAID set "isw_bdbafedcfd_raid0" [1/2] on /dev/sdb
DEBUG: set status of set "isw_bdbafedcfd_raid0" to 2
DEBUG: checking isw device "/dev/sdb"
ERROR: isw device for volume "raid1" broken on /dev/sdb in RAID set "isw_bdbafedcfd_raid1"
ERROR: isw: wrong # of devices in RAID set "isw_bdbafedcfd_raid1" [1/2] on /dev/sdb
DEBUG: set status of set "isw_bdbafedcfd_raid1" to 2
DEBUG: set status of set "isw_bdbafedcfd" to 4
ERROR: no mapping possible for RAID set isw_bdbafedcfd_raid1
INFO: Activating GROUP RAID set "isw_bdbafedcfd"
WARN: unlocking /var/lock/dmraid/.lock
DEBUG: freeing devices of RAID set "isw_bdbafedcfd_raid0"
DEBUG: freeing device "isw_bdbafedcfd_raid0", path "/dev/sdb"
DEBUG: freeing devices of RAID set "isw_bdbafedcfd_raid1"
DEBUG: freeing device "isw_bdbafedcfd_raid1", path "/dev/sdb"
DEBUG: freeing devices of RAID set "isw_bdbafedcfd"
DEBUG: freeing device "isw_bdbafedcfd", path "/dev/sdb"

sudo dmraid -r -d
/dev/sdb: isw, "isw_bdbafedcfd", GROUP, ok, 976773165 sectors, data@ 0

ubuntu@ubuntu:~$ sudo dmraid -s -d
DEBUG: _find_set: searching isw_bdbafedcfd
DEBUG: _find_set: not found isw_bdbafedcfd
DEBUG: _find_set: searching isw_bdbafedcfd_raid0
DEBUG: _find_set: searching isw_bdbafedcfd_raid0
DEBUG: _find_set: not found isw_bdbafedcfd_raid0
DEBUG: _find_set: not found isw_bdbafedcfd_raid0
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: searching isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
DEBUG: _find_set: not found isw_bdbafedcfd_raid1
DEBUG: checking isw device "/dev/sdb"
ERROR: isw device for volume "raid0" broken on /dev/sdb in RAID set "isw_bdbafedcfd_raid0"
ERROR: isw: wrong # of devices in RAID set "isw_bdbafedcfd_raid0" [1/2] on /dev/sdb
DEBUG: set status of set "isw_bdbafedcfd_raid0" to 2
DEBUG: checking isw device "/dev/sdb"
ERROR: isw device for volume "raid1" broken on /dev/sdb in RAID set "isw_bdbafedcfd_raid1"
ERROR: isw: wrong # of devices in RAID set "isw_bdbafedcfd_raid1" [1/2] on /dev/sdb
DEBUG: set status of set "isw_bdbafedcfd_raid1" to 2
DEBUG: set status of set "isw_bdbafedcfd" to 4
*** Group superset isw_bdbafedcfd
--> Subset
name : isw_bdbafedcfd_raid0
size : 314573056
stride : 256
type : stripe
status : broken
subsets: 0
devs : 1
spares : 0
--> Subset
name : isw_bdbafedcfd_raid1
size : 662188288
stride : 128
type : mirror
status : broken
subsets: 0
devs : 1
spares : 0
DEBUG: freeing devices of RAID set "isw_bdbafedcfd_raid0"
DEBUG: freeing device "isw_bdbafedcfd_raid0", path "/dev/sdb"
DEBUG: freeing devices of RAID set "isw_bdbafedcfd_raid1"
DEBUG: freeing device "isw_bdbafedcfd_raid1", path "/dev/sdb"
DEBUG: freeing devices of RAID set "isw_bdbafedcfd"
DEBUG: freeing device "isw_bdbafedcfd", path "/dev/sdb"

ubuntu@ubuntu:~$ sudo dmraid -b
/dev/sdb: 976773168 total, "9QM1LC81"
/dev/sda: 976773168 total, "9QM0W1ZN"
ubuntu@ubuntu:~$ sudo dmraid -n
/dev/sdb (isw):
0x000 sig: " Intel Raid ISM Cfg Sig. 1.2.00"
0x020 check_sum: 147162979
0x024 mpb_size: 704
0x028 family_num: 1310543253
0x02c generation_num: 11983
0x030 reserved[0]: 4080
0x034 reserved[1]: 2147483648
0x038 num_disks: 2
0x039 num_raid_devs: 2
0x03a fill[0]: 2
0x03b fill[1]: 0
0x040 filler[1]: 1310543253
0x0d8 disk[0].serial: " 9QM0W1ZN"
0x0e8 disk[0].totalBlocks: 976771055
0x0ec disk[0].scsiId: 0x0
0x0f0 disk[0].status: 0x53a
0x108 disk[1].serial: " 9QM1LC81"
0x118 disk[1].totalBlocks: 976773168
0x11c disk[1].scsiId: 0x10000
0x120 disk[1].status: 0x53a
0x138 isw_dev[0].volume: " raid0"
0x14c isw_dev[0].SizeHigh: 0
0x148 isw_dev[0].SizeLow: 629145600
0x150 isw_dev[0].status: 0xc
0x154 isw_dev[0].reserved_blocks: 0
0x158 isw_dev[0].filler[0]: 65536
0x190 isw_dev[0].vol.migr_state: 0
0x191 isw_dev[0].vol.migr_type: 0
0x192 isw_dev[0].vol.dirty: 0
0x193 isw_dev[0].vol.fill[0]: 255
0x1a8 isw_dev[0].vol.map.pba_of_lba0: 0
0x1ac isw_dev[0].vol.map.blocks_per_member: 314573064
0x1b0 isw_dev[0].vol.map.num_data_stripes: 1228800
0x1b4 isw_dev[0].vol.map.blocks_per_strip: 256
0x1b6 isw_dev[0].vol.map.map_state: 0
0x1b7 isw_dev[0].vol.map.raid_level: 0
0x1b8 isw_dev[0].vol.map.num_members: 2
0x1b9 isw_dev[0].vol.map.reserved[0]: 1
0x1ba isw_dev[0].vol.map.reserved[1]: 255
0x1bb isw_dev[0].vol.map.reserved[2]: 1
0x1d8 isw_dev[0].vol.map.disk_ord_tbl[0]: 0x0
0x1dc isw_dev[0].vol.map.disk_ord_tbl[1]: 0x1
0x1e0 isw_dev[1].volume: " raid1"
0x1f4 isw_dev[1].SizeHigh: 0
0x1f0 isw_dev[1].SizeLow: 662188032
0x1f8 isw_dev[1].status: 0x8c
0x1fc isw_dev[1].reserved_blocks: 0
0x200 isw_dev[1].filler[0]: 131072
0x238 isw_dev[1].vol.migr_state: 1
0x239 isw_dev[1].vol.migr_type: 1
0x23a isw_dev[1].vol.dirty: 0
0x23b isw_dev[1].vol.fill[0]: 255
0x250 isw_dev[1].vol.map.pba_of_lba0: 314577160
0x254 isw_dev[1].vol.map.blocks_per_member: 662188296
0x258 isw_dev[1].vol.map.num_data_stripes: 2586672
0x25c isw_dev[1].vol.map.blocks_per_strip: 128
0x25e isw_dev[1].vol.map.map_state: 0
0x25f isw_dev[1].vol.map.raid_level: 1
0x260 isw_dev[1].vol.map.num_members: 2
0x261 isw_dev[1].vol.map.reserved[0]: 2
0x263 isw_dev[1].vol.map.reserved[2]: 1
0x280 isw_dev[1].vol.map.disk_ord_tbl[0]: 0x0
0x284 isw_dev[1].vol.map.disk_ord_tbl[1]: 0x1

ubuntu@ubuntu:~$ sudo dmraid -b
/dev/sdb: 976773168 total, "9QM1LC81"
/dev/sda: 976773168 total, "9QM0W1ZN"

It seems it does not recognize /dev/sda. Indeed, if I reboot, the Intel option ROM says the first disk is not part of the raid array. Then I shut the power down and reboot, and the bios sees it again.
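
The listings above already show the symptom: `dmraid -b` reports two block devices, but `dmraid -r` reports RAID metadata on only one of them. A minimal Python sketch of that comparison, using the output pasted above (the helper name `devices` is made up for illustration and is not part of dmraid):

```python
import re

def devices(listing: str) -> set:
    """Collect the /dev/... names that start each line of a dmraid listing."""
    return set(re.findall(r"^(/dev/\w+):", listing, flags=re.MULTILINE))

# Output of `dmraid -b` and `dmraid -r`, copied from the report above.
block_devices = devices('''/dev/sdb: 976773168 total, "9QM1LC81"
/dev/sda: 976773168 total, "9QM0W1ZN"''')
raid_devices = devices('/dev/sdb: isw, "isw_bdbafedcfd", GROUP, ok, 976773165 sectors, data@ 0')

# Block devices on which dmraid found no RAID metadata.
missing = sorted(block_devices - raid_devices)
print(missing)
```

On a machine with a non-member boot disk, that disk would show up in the result as well, so the list is only a starting point.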

Thomas (thomas-liennard) wrote :

In fact, it seems to be coming from something else. If I simply boot from the live CD and reboot, my raid 0 disk is marked as failed and the raid 1 as degraded. After a cold reboot, the raid 0 is OK and the raid 1 is marked as degraded.

jbfoley (jbfoley) wrote :

I have the same problem: dmraid won't recognize the RAID 1 array on my ICH9R (Intel X48, Gigabyte X48T-DQ6). A warm reboot gives "member offline" for both member disks, but a cold reboot restores them to normal operational status. Using the 8.04 Live CD. This RAID works fine under Windows.

A similar problem is discussed in this thread: http://osdir.com/ml/linux.ataraid/2007-09/msg00019.html

Here's my info, let me know if anything else would be useful:

ubuntu@ubuntu:~$ sudo dmraid -ay
ERROR: isw device for volume "RAID" broken on /dev/sdc in RAID set "isw_bfhebdcdbd_RAID"
ERROR: isw: wrong # of devices in RAID set "isw_bfhebdcdbd_RAID" [1/2] on /dev/sdc
ERROR: isw device for volume "RAID" broken on /dev/sdb in RAID set "isw_cajggdbbad_RAID"
ERROR: isw: wrong # of devices in RAID set "isw_cajggdbbad_RAID" [1/2] on /dev/sdb
ERROR: no mapping possible for RAID set isw_bfhebdcdbd_RAID
ERROR: no mapping possible for RAID set isw_cajggdbbad_RAID

ubuntu@ubuntu:~$ sudo dmraid -b
/dev/sdc: 976773168 total, "9QM0KAPS"
/dev/sdb: 976773168 total, "9QM0K9QQ"
/dev/sda: 488397168 total, "9SF03FRH"

# sda is my boot disk, and is not a member of the RAID

ubuntu@ubuntu:~$ sudo dmraid -n
/dev/sdc (isw):
0x000 sig: " Intel Raid ISM Cfg Sig. 1.1.00"
0x020 check_sum: 1693590363
0x024 mpb_size: 480
0x028 family_num: 1574132313
0x02c generation_num: 4466
0x030 reserved[0]: 4080
0x034 reserved[1]: 2147483648
0x038 num_disks: 2
0x039 num_raid_devs: 1
0x03a fill[0]: 2
0x03b fill[1]: 0
0x040 filler[1]: 1574132313
0x0d8 disk[0].serial: " 9QM0K9QQ"
0x0e8 disk[0].totalBlocks: 976771055
0x0ec disk[0].scsiId: 0x20000
0x0f0 disk[0].status: 0x53a
0x108 disk[1].serial: " 9QM0KAPS"
0x118 disk[1].totalBlocks: 976773168
0x11c disk[1].scsiId: 0x30000
0x120 disk[1].status: 0x53a
0x138 isw_dev[0].volume: " RAID"
0x14c isw_dev[0].SizeHigh: 0
0x148 isw_dev[0].SizeLow: 976764928
0x150 isw_dev[0].status: 0xc
0x154 isw_dev[0].reserved_blocks: 0
0x158 isw_dev[0].filler[0]: 1900544
0x190 isw_dev[0].vol.migr_state: 0
0x191 isw_dev[0].vol.migr_type: 0
0x192 isw_dev[0].vol.dirty: 0
0x193 isw_dev[0].vol.fill[0]: 255
0x1a8 isw_dev[0].vol.map.pba_of_lba0: 0
0x1ac isw_dev[0].vol.map.blocks_per_member: 976765192
0x1b0 isw_dev[0].vol.map.num_data_stripes: 3815488
0x1b4 isw_dev[0].vol.map.blocks_per_strip: 128
0x1b6 isw_dev[0].vol.map.map_state: 0
0x1b7 isw_dev[0].vol.map.raid_level: 1
0x1b8 isw_dev[0].vol.map.num_members: 2
0x1b9 isw_dev[0].vol.map.reserved[0]: 2
0x1ba isw_dev[0].vol.map.reserved[1]: 255
0x1bb isw_dev[0].vol.map.reserved[2]: 1
0x1d8 isw_dev[0].vol.map.disk_ord_tbl[0]: 0x0
0x1dc isw_dev[0].vol.map.disk_ord_tbl[1]: 0x1

/dev/sdb (isw):
0x000 sig: " Intel Raid ISM Cfg Sig. 1.1.00"
0x020 check_sum: 2738587652
0x024 mpb_size: 480
0x028 family_num: 2096631103
0x02c generation_num: 6
0x030 reserved[0]: 4080
0x034 reserved[1]: 2147483648
0x038 num_disks: 2
0x039 num_raid_devs: 1
0x03a fill[0]: 2
0x03b fill[1]: 0
0x040 filler[1]: 2096631103
0x0d8 disk[0].serial: " 9QM0K9QQ"
0x0e8 disk[0].totalBlocks: 976773168
0x0ec disk[0].scsiId: 0x20000
0x0f0 disk[0].status: 0x13a
0x108 disk[1].serial: " 9QM0KAPS"
0x118 disk[1].totalBlocks: 976773168
0x11c disk[1...


Phillip Susi (psusi) wrote :

The two disks appear to not be part of the same raid set. Try deleting and recreating the raid set in the bios utility.

Phillip Susi (psusi) on 2008-05-01
Changed in dmraid:
assignee: nobody → psusi
status: New → In Progress
jbfoley (jbfoley) wrote :

I tried that the first time I saw this happen, because the "offline member" message when I rebooted made me think the raid had been broken by dmraid. On a later attempt, (when I actually had some data on that array to test) I tried a cold boot, and both Windows and the bios recognized the array as being healthy, with all data intact. (This array is on a new build, and it had always been built from scratch using this same board/bios) I'm willing to rebuild again to help figure this out, but only if there's a different method to try, as right now it would involve moving around a lot of data.

For some reason, what dmraid sees as broken, Intel's windows driver thinks is just fine, with both disks participating in the mirror and healthy. This is what makes me think that either Intel has changed the way it marks these RAIDs, or perhaps the chipset driver is slightly off, so that it reads these incorrectly. I *have* rebuilt this array more than once, so it *could* be reading one of those sets of info from an earlier attempt, but it threw up the same error after the very first build, and these were brand new drives at that point.

There was another bug with the ethernet controller on this board that caused Windows to turn the controller off on shutdown, and the ubuntu driver did not know how to turn it back on, so you needed to enable wake on lan in Windows or do a cold boot to turn ethernet back on. The first part of this bug seems almost like that in reverse, with the disks being offline after a warm boot from ubuntu.

Phillip Susi (psusi) wrote :

It looks like Intel has some bugs. Looking closely at the data, both disks disagree on the size of the primary disk. It looks like the bios may have a bug when creating the array which causes it to record the size incorrectly in one of the disks, which results in the family signature being different on both disks, and the windows driver apparently doesn't bother to verify that they match as they should.

Try updating your bios and recreating the array. If that doesn't help, you may need to file a bug report with Intel.
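
The disagreement described above can be checked mechanically. The following is a rough Python sketch, not a dmraid feature: it parses a few `dmraid -n` lines from each disk and lists the fields that differ. The values are copied from jbfoley's dump earlier in the thread; `parse_fields` is a hypothetical helper.

```python
import re

def parse_fields(dump: str) -> dict:
    """Map field names from `dmraid -n` lines such as
    '0x0e8 disk[0].totalBlocks: 976771055' to their integer values."""
    fields = {}
    for m in re.finditer(r"^0x[0-9a-f]+ (\S+): (\d+)$", dump, flags=re.MULTILINE):
        fields[m.group(1)] = int(m.group(2))
    return fields

# Metadata found on /dev/sdc and /dev/sdb respectively (excerpt).
sdc = parse_fields("""0x028 family_num: 1574132313
0x0e8 disk[0].totalBlocks: 976771055""")
sdb = parse_fields("""0x028 family_num: 2096631103
0x0e8 disk[0].totalBlocks: 976773168""")

# Fields on which the two copies of the metadata disagree.
mismatched = sorted(k for k in sdc if sdc[k] != sdb.get(k))
size_gap = sdb["disk[0].totalBlocks"] - sdc["disk[0].totalBlocks"]
print(mismatched, size_gap)
```

Both the family number and the recorded size of disk 0 differ, and the size gap is 2113 sectors.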

Watchwolf (watchwolf) wrote :

You're not alone, I have the same problem.
The raid was correctly detected with dmraid rc13 but was not usable (there was no device for the partition). rc14 fixed this problem, but now only 1 disk of the raid is detected and dmraid seems to have broken the raid.

masa (masa-betcher-online) wrote :

same here!

MOBO: Gigabyte GA-EP35-DS4 (Bios F2) ( http://www.gigabyte.de/Support/Motherboard/Driver_Model.aspx?ProductID=2678 )
CHIPSET: ICH9R Chipset
RAID: RAID0 (SATA) + RAID1(SATA)
-------------------------

I have 2 raid arrays, RAID1 and RAID0. It seems that RAID1 works and doesn't go offline after a reboot!
But the RAID0 does :-(

Is there already a workaround to fix this problem?

masa (masa-betcher-online) wrote :

ubuntu@ubuntu:~$ sudo dmraid -ay -vvv -d
WARN: locking /var/lock/dmraid/.lock
NOTICE: skipping removable device /dev/sdf
NOTICE: skipping removable device /dev/sdg
NOTICE: skipping removable device /dev/sdh
NOTICE: skipping removable device /dev/sdi
NOTICE: skipping removable device /dev/sdj
NOTICE: /dev/sde: asr discovering
NOTICE: /dev/sde: ddf1 discovering
NOTICE: /dev/sde: hpt37x discovering
NOTICE: /dev/sde: hpt45x discovering
NOTICE: /dev/sde: isw discovering
NOTICE: /dev/sde: jmicron discovering
NOTICE: /dev/sde: lsi discovering
NOTICE: /dev/sde: nvidia discovering
NOTICE: /dev/sde: pdc discovering
NOTICE: /dev/sde: sil discovering
NOTICE: /dev/sde: via discovering
NOTICE: /dev/sdd: asr discovering
NOTICE: /dev/sdd: ddf1 discovering
NOTICE: /dev/sdd: hpt37x discovering
NOTICE: /dev/sdd: hpt45x discovering
NOTICE: /dev/sdd: isw discovering
NOTICE: /dev/sdd: isw metadata discovered
NOTICE: /dev/sdd: jmicron discovering
NOTICE: /dev/sdd: lsi discovering
NOTICE: /dev/sdd: nvidia discovering
NOTICE: /dev/sdd: pdc discovering
NOTICE: /dev/sdd: sil discovering
NOTICE: sil: areas 1,2,3,4[4] are valid
NOTICE: /dev/sdd: sil metadata discovered
/dev/sdd: "sil" and "isw" formats discovered (using isw)!
NOTICE: /dev/sdd: via discovering
NOTICE: /dev/sdc: asr discovering
NOTICE: /dev/sdc: ddf1 discovering
NOTICE: /dev/sdc: hpt37x discovering
NOTICE: /dev/sdc: hpt45x discovering
NOTICE: /dev/sdc: isw discovering
NOTICE: /dev/sdc: isw metadata discovered
NOTICE: /dev/sdc: jmicron discovering
NOTICE: /dev/sdc: lsi discovering
NOTICE: /dev/sdc: nvidia discovering
NOTICE: /dev/sdc: pdc discovering
NOTICE: /dev/sdc: sil discovering
NOTICE: sil: areas 1,2,3,4[4] are valid
NOTICE: /dev/sdc: sil metadata discovered
/dev/sdc: "sil" and "isw" formats discovered (using isw)!
NOTICE: /dev/sdc: via discovering
NOTICE: /dev/sdb: asr discovering
NOTICE: /dev/sdb: ddf1 discovering
NOTICE: /dev/sdb: hpt37x discovering
NOTICE: /dev/sdb: hpt45x discovering
NOTICE: /dev/sdb: isw discovering
NOTICE: /dev/sdb: isw metadata discovered
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi discovering
NOTICE: /dev/sdb: nvidia discovering
NOTICE: /dev/sdb: pdc discovering
NOTICE: /dev/sdb: sil discovering
NOTICE: /dev/sdb: via discovering
NOTICE: /dev/sda: asr discovering
NOTICE: /dev/sda: ddf1 discovering
NOTICE: /dev/sda: hpt37x discovering
NOTICE: /dev/sda: hpt45x discovering
NOTICE: /dev/sda: isw discovering
NOTICE: /dev/sda: isw metadata discovered
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi discovering
NOTICE: /dev/sda: nvidia discovering
NOTICE: /dev/sda: pdc discovering
NOTICE: /dev/sda: sil discovering
NOTICE: /dev/sda: via discovering
DEBUG: _find_set: searching isw_jejghfiaa
DEBUG: _find_set: not found isw_jejghfiaa
DEBUG: _find_set: searching isw_jejghfiaa_RAID1
DEBUG: _find_set: searching isw_jejghfiaa_RAID1
DEBUG: _find_set: not found isw_jejghfiaa_RAID1
DEBUG: _find_set: not found isw_jejghfiaa_RAID1
NOTICE: added /dev/sdd to RAID set "isw_jejghfiaa"
DEBUG: _find_set: searching isw_jejghfiaa
...

Hello,
sorry for my English :-(
I found why the raid was displayed as broken.
It seems that the raid information is stored on the HDD, and if you rebuild or build a new RAID array, the "old" info is still stored on one of the HDDs.
I just formatted the members of the array (mkfs), rebooted, then created a new array.
After booting into the live CD it works fine, and it seems the message "Member Offline" after a warm boot is gone :-)

BUT over 1 TB is gone :-(

root@ubuntu:~# dmraid -s
/dev/sdd: "sil" and "isw" formats discovered (using isw)!
/dev/sdc: "sil" and "isw" formats discovered (using isw)!
*** Group superset isw_jejghfiaa
--> Active Subset
name : isw_jejghfiaa_RAID1
size : 160831744
stride : 128
type : mirror
status : ok
subsets: 0
devs : 2
spares : 0
*** Group superset isw_ggadjggdi
--> Active Subset
name : isw_ggadjggdi_RAID0
size : 1953536512
stride : 32
type : stripe
status : ok
subsets: 0
devs : 2
spares : 0

German forum : http://forum.ubuntuusers.de/topic/173813/

regards masa

Same problem here with a P35C-DS3R (rev2.0).

dmraid works fine with the Gentoo live CD, with exactly the same version of dmraid and mapper.
So to my mind, the raid array is badly initialized by something else...
Maybe it comes from another module (not loaded in the Gentoo case)?

jbfoley (jbfoley) wrote :

Masa, what you are seeing seems to agree with my most recent test.

I once again deleted the array in BIOS and re-created it, only this time, I named it "NewRAID1"

When I ran dmraid -n it showed the first disk as being a part of "NewRAID1" and the second as being a part of "RAID1," which was the name of the old array.

When I get home, I will try breaking the array, completely reformatting both, and then rebuilding it to see what happens.

Sorry for the slow response, I have been on the road for the last couple of weeks.

Intel's bug report process is a joke, for those of you who are interested. As far as I can tell, there is no way to report a driver or BIOS bug aside from going through the motherboard manufacturer. Glad I didn't get an Intel-made board.

My guess is that this is really a bug in the driver/BIOS, as others have suggested. I will take it up with Gigabyte when time permits and post the results here (if any.)

Phillip Susi (psusi) wrote :

When posting to a bug someone else filed to say "me too" you really need to try to provide the same detailed information that others have in the bug report to be useful. In this case, that would mean the output of dmraid -n and dmraid -ay -vvvv -dddd.

masa (masa-betcher-online) wrote :

jbfoley, it's a bug in the Ubuntu dmraid modules or something else, but it's in Ubuntu!

As Sebounet says, if you COLD boot into a Gentoo live CD it works fine; after partitioning the disk, fdisk shows the created devices under /dev/mapper ...
But does Ubuntu's dmraid count the devices up and read the partition tables from another device?

Scenario: if you boot into Ubuntu, the raid shows as broken. OK, reboot (warm), recreate the raid (BIOS, Intel RAID manager), boot into Ubuntu and the raid seems to be working. OK, second reboot (warm) into the Gentoo live CD: Gentoo's dmraid shows the same raid. OK, so it works??? NO

Now turn the PC OFF, not just reboot. The Gentoo live CD shows the old raid ...

Somewhere there are counter variables that Ubuntu's dmraid, or something else, counts wrongly. Are the "count variables" for the raids in the Intel driver?

I am back home this evening, and I can post the loaded modules and the dmraid version (Gentoo) where it works fine.

However, Gentoo rocks :-)

Sorry for my ugly English :-(

jbfoley (jbfoley) wrote :

I tried breaking the array, reformatting both drives with mkfs (ext2) and then rebuilding the array. This time dmraid would only see one drive of the mirror, and not the other one. I'm including my info dump in a text file.

Perhaps this version of the Intel driver only needs to mark one drive? Seems like it wouldn't work if one drive failed in that case. Other possibility is that it's putting the mark in a different spot on the second drive, but that also seems unlikely. It really looks to me like there is a bug preventing dmraid from finding the correct marker on the second drive.

I should mention that I booted ubuntu immediately after the array build in this case, but after I did all of the stuff you see in the text dump, I cold booted to windows and the array came up as healthy and happy. I then initialized it for windows and formatted it ntfs, then rebooted and tried ubuntu again, but there was no change, except that the ext2 filesystem that had been visible on each individual drive is now inaccessible, and only one 500 gig device shows, instead of both, but it is also unmountable.

Also, watchwolf, I am using the rc14 version of dmraid, so it looks like that is not the fix for my problem:
joe@Cosmos:~$ dmraid -V
dmraid version: 1.0.0.rc14 (2006.11.08)
dmraid library version: 1.0.0.rc14 (2006.11.08)
device-mapper version: unknown

jbfoley (jbfoley) wrote :

I have a gentoo live CD lying around somewhere, so I'll give that a try tonight.

Phillip Susi (psusi) wrote :

Hrm.... it sounds like the bios may be writing the signature to a different location on the disk each time, and dmraid becomes messed up when it finds the wrong one. jbfoley, would you wipe the disk clean, recreate the raid array, and then dump the last 2 megs of each disk and post them here?

Zero the last 2 megs of each disk, repeat for each disk in the array:

dd if=/dev/zero of=/dev/sda seek=976769072

Boot into the bios and verify that it no longer thinks the drives are part of an array ( since we just wiped out the signature ), and if not, create a new array, then dump the last 2 megs and attach here:

dd if=/dev/sda skip=976769072 | bzip2 -c > sda.bz2

Replace sda and repeat for each drive in the array.
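
For reference, the `seek=`/`skip=` value in the commands above follows from the disk's sector count. A small Python sketch of the arithmetic, assuming dd's default 512-byte block size and 512-byte sectors (`tail_start` is a name invented here):

```python
SECTOR_BYTES = 512             # assumption: dd's default block size equals the sector size
TAIL_BYTES = 2 * 1024 * 1024   # the "last 2 megs" requested above

def tail_start(total_sectors: int) -> int:
    """First sector of the final 2 MiB of a disk with `total_sectors` sectors."""
    return total_sectors - TAIL_BYTES // SECTOR_BYTES

# 976773168 is the size `dmraid -b` reports for the RAID member disks in this thread.
offset = tail_start(976773168)
print(offset)
```

The result matches the 976769072 used in the dd commands. A disk with a different sector count needs a different offset, so the value should be recomputed per drive rather than reused.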

You did say that if you cold boot into ubuntu, dmraid works correctly, right, but not if you warm boot? After zeroing the disk, be sure to cold boot into the bios, and cold boot again back to Ubuntu after creating the array. Once you have dumped the last two megs of the disks, see if dmraid -ay correctly activates them, then warm boot and try again. Note the output of fdisk -l after a cold boot, and again after a warm boot, and see if they differ.

jbfoley (jbfoley) wrote :

I will try this procedure this evening and post the results.

Do you want me to run this procedure on sda as well? In my current configuration, sda is not a RAID member, and it serves as my boot disk. sdb and sdc are the members of the RAID.

One correction, dmraid does not work correctly after a cold boot. Unfortunately, I've never been able to get dmraid to work on this machine. The cold boot has to do with getting the bios and the windows drivers to work correctly. If I boot Ubuntu, whether I try to use dmraid or not, then restart from Ubuntu (warm boot,) the bios shows both disks of the array as "offline members" and windows does not see them properly either. It is only when I cold boot and go directly to windows that the RAID works.

jbfoley (jbfoley) wrote :

Phillip,

I have attached the results of your suggested procedure. I took the chance to look at them, and found that while the sig data does exist on both drives, it is also located in very different spots on each. This seems very strange to me, but also suggests an easy workaround and fix.

For now, I'm going to try moving both signatures to the same place. Hopefully, there is a way to make dmraid search for this data on both drives so that future users don't have to deal with this. I hope that the position is not supposed to be determined by the disk serial or something equally cryptic. I will post the results.

jbfoley (jbfoley) wrote :

I moved the signature data on sdb to the same offset as that on sdc (and zeroed out its old position.) (I moved the data from sdb to sdb, even though it looked like the sigs on the two drives were identical.) This resulted in the BIOS seeing the RAID as degraded, with one member disk, and one non-member, but healthy disk.

I booted without letting the BIOS make any changes, and dmraid also did not like the new set. I have attached that info in a text file. I did not boot windows at any time during all of this, and all boots were cold ones.

From the results of dmraid -n it looks like there is some data for the RAID stored in other parts of the disk. I don't know enough to mess with it further, so I'll put it back the way it was and wait to hear back.

p.s. thanks for all your help, Phillip.

Phillip Susi (psusi) wrote :

Have you tried updating your bios yet? I also notice that the version recorded in the signature is rather old.

According to Intel, the metadata is supposed to reside in the last two sectors of the disk. Based on the incorrect position and the incorrect disk size I noted that was recorded before, it appears that your bios has a bug where it sometimes can not correctly determine the size of that disk.

I have a suspicion that the bios may be issuing commands to hide the tail end of the one disk, which the Linux kernel resets. That would cause the metadata to appear in the wrong position, and the bios and windows would not recognize the disk after a warm boot because the hide command is not reapplied.

Another strange thing is that originally the signature was in the correct place and dmraid found it, but the disks did not agree on the size. What changed between the time you originally created the array, and when you recreated it with the signature now in the wrong place? Odd.
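
To make the suspicion concrete, here is a hedged Python sketch of where "the last two sectors" fall under each view of the disk. It assumes the BIOS wrote the metadata while the disk was clipped to 976771055 sectors (the disk[0].totalBlocks value in the `dmraid -n` dump earlier) and that the kernel exposes the native 976773168 sectors; `metadata_lbas` is a hypothetical helper.

```python
def metadata_lbas(visible_sectors: int) -> tuple:
    """LBAs of the last two sectors of a disk exposing `visible_sectors` sectors."""
    return (visible_sectors - 2, visible_sectors - 1)

bios_view = metadata_lbas(976771055)    # size with the tail of the disk hidden (assumed)
kernel_view = metadata_lbas(976773168)  # native size once the limit is reset

# How far the expected metadata position moves between the two views.
shift = kernel_view[0] - bios_view[0]
print(bios_view, kernel_view, shift)
```

Under these assumptions the metadata would sit 2113 sectors before the place where a tool looking at the end of the full-size disk expects it.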

Phillip Susi (psusi) wrote :

I have done some more reading of that thread on the ataraid list you linked before, and the bug reports they reference, and it seems my initial suspicions were correct. Your bios is using the Host Protected Area feature to reduce the size of the disk, and the kernel is resetting that. Could you try adding "ata_ignore_hpa=0" to your kernel command line at boot?

If you check your dmesg, you should see something like this now:

[ 60.453908] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 60.456006] ata1.00: Host Protected Area detected:
[ 60.456007] current size: 293044655 sectors
[ 60.456008] native size: 293046768 sectors
[ 60.457150] ata1.00: native size increased to 293046768 sectors

I think that adding that parameter should get rid of the last line and the metadata will appear at the correct location.
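
To see how much of the disk is being hidden, the two sizes in such a dmesg excerpt can be subtracted. A minimal Python sketch, using the sample lines quoted above (the function name `hpa_sizes` is an assumption of this example):

```python
import re

DMESG = """[ 60.456006] ata1.00: Host Protected Area detected:
[ 60.456007] current size: 293044655 sectors
[ 60.456008] native size: 293046768 sectors"""

def hpa_sizes(log: str) -> dict:
    """Return the 'current' and 'native' sector counts reported in the log."""
    return {kind: int(count)
            for kind, count in re.findall(r"(current|native) size: (\d+) sectors", log)}

sizes = hpa_sizes(DMESG)
hidden = sizes["native"] - sizes["current"]   # sectors hidden by the protected area
print(hidden)
```

For the sample above the hidden region is 2113 sectors, the same gap as between the two totalBlocks values (976771055 and 976773168) in the `dmraid -n` dumps earlier in this thread.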

jbfoley (jbfoley) wrote :

I'll have to inform Gigabyte of the problem, then. The BIOS I have is F4, which is the release version, but also the most recent version available for this board. With such a new product, I guess that's the danger. In the meantime, I'd like to see this work at least once, just so we can confirm that this is the only issue. I seem to recall that there was a procedure for specifying the position of the signature to dmraid. Is that true, and if so, how would one do it?

jbfoley (jbfoley) wrote :

Thanks, I didn't see your last message before I posted, there. I'll give that kernel parameter a try this evening.

I still think Gigabyte needs to act on this, especially if they've got an old version of the RAID BIOS. I think I will send them to this thread directly.

Sebounet (sebastien-grand3) wrote :

Currently running on kubuntu live CD 8.04.

Mobo : P35C-DS3R (rev2.0) with latest BIOS (F11e)
(The first versions of the BIOS are AWFUL, nothing worked correctly...)

I added "ata_ignore_hpa=0" at boot ... But it did not help :(

still have :

root@ubuntu:~# dmraid -ay
ERROR: isw device for volume "RAID0" broken on /dev/sdb in RAID set "isw_cdgibffddf_RAID0"
ERROR: isw: wrong # of devices in RAID set "isw_cdgibffddf_RAID0" [1/2] on /dev/sdb
ERROR: isw device for volume "Volume0" broken on /dev/sda in RAID set "isw_echgefjfjf_Volume0"
ERROR: isw: wrong # of devices in RAID set "isw_echgefjfjf_Volume0" [1/2] on /dev/sda

dmesg gives :

[ 365.835593] device-mapper: table: 254:0: linear: dm-linear: Device lookup failed
[ 365.835599] device-mapper: ioctl: error adding target to table
[ 365.836529] device-mapper: table: 254:0: linear: dm-linear: Device lookup failed
[ 365.836533] device-mapper: ioctl: error adding target to table
[ 381.580072] device-mapper: table: 254:0: linear: dm-linear: Device lookup failed
[ 381.580076] device-mapper: ioctl: error adding target to table
[ 381.580964] device-mapper: table: 254:0: linear: dm-linear: Device lookup failed
[ 381.580967] device-mapper: ioctl: error adding target to table

grepping in dmesg with "ata" , I noticed :

[ 0.000000] Kernel command line: BOOT_IMAGE=/casper/vmlinuz file=/cdrom/preseed/kubuntu.seed boot=casper initrd=/casper/initrd.gz quiet splash -- ata_ignore_hpa=0 locale=fr_FR console-setup/layoutcode=fr console-setup/variantcode=oss
[ 86.016927] Memory: 3621848k/4194304k available (2164k kernel code, 45784k reserved, 1007k data, 364k init, 2751360k highmem)
[ 86.016935] .data : 0xc031d1bd - 0xc0418dc4 (1007 kB)
[ 89.211303] libata version 3.00 loaded.
[ 90.988408] ata1: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102100 irq 219
[ 90.988411] ata2: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102180 irq 219
[ 90.988412] ata3: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102200 irq 219
[ 90.988414] ata4: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102280 irq 219
[ 90.988416] ata5: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102300 irq 219
[ 90.988418] ata6: SATA max UDMA/133 abar m2048@0xfa102000 port 0xfa102380 irq 219
[ 91.307190] ata1: SATA link down (SStatus 0 SControl 300)
[ 91.945515] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 91.958141] ata2.00: HPA unlocked: 1250261615 -> 1250263728, native 1250263728
[ 91.958145] ata2.00: ATA-7: SAMSUNG HD642JJ, 1AA01109, max UDMA7
[ 91.958147] ata2.00: 1250263728 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 91.964528] ata2.00: configured for UDMA/133
[ 92.280638] ata3: SATA link down (SStatus 0 SControl 300)
[ 92.599800] ata4: SATA link down (SStatus 0 SControl 300)
[ 93.238126] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 93.250737] ata5.00: HPA unlocked: 1250261615 -> 1250263728, native 1250263728
[ 93.250740] ata5.00: ATA-7: SAMSUNG HD642JJ, 1AA01109, max UDMA7
[ 93.250743] ata5.00: 1250263728 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 93.257139] ata5.00: configured for UDMA/133

I have numerous dmesg lines like this one:

[ 195.966487] ...


Sebounet (sebastien-grand3) wrote :

I read here:

http://ubuntuforums.org/showthread.php?p=4859992

that loading the libata module this way:

modprobe libata ignore_hpa=0

may help.
But how can I add this option while booting from the live CD?

Sebounet (sebastien-grand3) wrote :

Is there no other solution than extracting Filesystem.squashfs, modifying it, and recreating the ISO?
(A long way to go for just a try...)

Phillip Susi (psusi) wrote :

That's what I was just trying to figure out, Sebounet... but in your case, your actual array is also broken, since both disks were given different signatures when created. You might try recreating the array and see if they get the same signature this time. To work properly they need the same signature, AND you will need the HPA fix. I would suggest that you contact your motherboard manufacturer, open a trouble ticket with them, and see if you can figure out why the BIOS is even protecting part of the disk in the first place, and whether there is a way to tell it not to.

jbfoley (jbfoley) wrote :

The kernel parameter does not appear to have had the intended effect. I added this entry to menu.lst in /boot/grub and cold-booted to it:

title Ubuntu 8.04, kernel 2.6.24-16-generic (RAID fix)
root (hd0,4)
kernel /boot/vmlinuz-2.6.24-16-generic root=UUID=ec648d6c-188d-4513-b3d5-0a534bd5e377 ro splash ata_ignore_hpa=0
initrd /boot/initrd.img-2.6.24-16-generic

I just appended ata_ignore_hpa=0 to the end of the kernel line. It looks like the parameter was picked up. From dmesg:

[ 0.000000] Kernel command line: root=UUID=ec648d6c-188d-4513-b3d5-0a534bd5e377 ro splash ata_ignore_hpa=0

but when it got to my second disk (which is the only one it was doing hpa unlocking on before):

[ 47.817330] ata3.00: HPA unlocked: 976771055 -> 976773168, native 976773168

I have included a partial dmesg from this attempt. Only one of the drives seems to get the HPA on it. I also tried several other ways to add the parameter, but none have prevented the line above from appearing.

Also, Sebounet, I'd like to try your method. I'm not booting from a live CD, but I'm a bit new to Ubuntu, so on a hard disk install, how and where would one add the modprobe libata ignore_hpa=0 line? /etc/modprobe.d has many files in it, and I don't want to just throw it in anywhere.

masa (masa-betcher-online) wrote :

Hello, me again with the bad English :-)

jbfoley, I tried adding this to the boot line in /boot/grub/menu.lst,
but after a cold reboot it was gone. Something rewrote /boot/grub/menu.lst, but what?

OK, looking in the dmesg,
it appears there:
[ 0.000000] Command line: root=UUID=c62d8a18-ea63-45d2-83c7-44a6d197f299 ro quiet splash ata_ignore_hpa=0

Here is a cut of the dmesg: http://ubuntuusers.de/paste/225160/
Have a look at line 32 => Faking ... ?

I tried adding this line to /etc/modules, no change :-(
libata ignore_hpa=0

How can I change this parameter before libata gets loaded?
Regards, masa

Phillip Susi (psusi) wrote :

Ok, because libata is loaded as a module rather than built into the kernel, you have to add the line:

options libata ignore_hpa=0

to /etc/modprobe.d/libata-options. After adding the line, you will need to run sudo update-initramfs -u.

If you are booting from the livecd, then you need to add break=top to the kernel command line and when you get to the prompt, type:

echo options libata ignore_hpa=0 > /etc/modprobe.d/libata-options
exit
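
The edit described above amounts to making sure one `options` line for libata is present in a modprobe configuration file. Below is a minimal Python sketch of just that text manipulation. The helper name `with_libata_option` is hypothetical; it works on a string only, so it neither writes under /etc nor runs update-initramfs, both of which still have to be done as described in this comment.

```python
def with_libata_option(config_text, value=0):
    """Return config_text with 'ignore_hpa=<value>' set for libata.

    An existing 'options libata ...' line is updated in place (other
    libata options on that line are kept). If there is none, a new
    'options libata ignore_hpa=<value>' line is appended.
    """
    wanted = "ignore_hpa=%d" % value
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split()
        if fields[:2] == ["options", "libata"]:
            # Drop any old ignore_hpa setting, keep everything else.
            rest = [f for f in fields[2:] if not f.startswith("ignore_hpa=")]
            if not found:
                rest.append(wanted)
                found = True
            if rest:
                out.append(" ".join(["options", "libata"] + rest))
            continue
        out.append(line)
    if not found:
        out.append("options libata " + wanted)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    before = "# example content only\noptions libata ignore_hpa=1\n"
    print(with_libata_option(before), end="")
```

Starting from an empty string, the function returns the single line that this comment says to put in /etc/modprobe.d/libata-options.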

Sebounet (sebastien-grand3) wrote :

And the winner is: Phillip Susi :)

Thanks, it works for me!

I have other problems with the installer: it displays a bizarre partition table, but nothing alarming. It can't format the ext3 partition. I did it manually, without a problem, and asked the installer not to format... But now it complains about system files that cannot be removed, even when starting with an empty, freshly formatted partition.
But that is another... story.
Because the trick makes dmraid work properly.

One more time: thanks, Phillip!

jbfoley (jbfoley) wrote :

Thanks Phillip! That worked for me as well!

I can now sudo mount /dev/dm-1 /media/RAID and read and write to it without breaking the array.

The issue where the RAID is shown as broken after a soft reboot is resolved as well.

Some of the research that I did on this issue since my last post indicates that this HPA could have been created by the "Xpress recovery" feature of this motherboard. This is supposed to create a system restore partition at the end of a hard disk, much like is found on some laptops, and some sources on forums indicate that these partitions are commonly hidden by the BIOS using HPA. I was trying to find a way to use the feature to change which drive it uses to apply HPA. The funny thing is that this board does not support the feature at all when the RAID controller is turned on, so I could not even load the configuration utility to see if it had such an option. Other people with similar features on their motherboards might need to use this same workaround.

Thanks again, Phillip, for your excellent detective work and problem solution.

Pinocheckio (jannesl) wrote :

I've got the same problem with my Abit IP35 Pro, ICH9R.
I want to make a RAID 5 array, and after hours I managed, with the mdraid45 module (which appears to work only in kernel 2.6.24.16), to build the array and format it as NTFS.
So basically I want it to be accessible in Linux and Windows, but it doesn't have to be bootable.
So I tried some of the things above, but I guess libata ignore_hpa=0 will not change anything because I'm not booting from it, am I right?
But the problem seems the same: I tried formatting it in Windows with Partition Magic and in Ubuntu, but as soon as I shut down and power up again those drives are offline.
So how do I fix this? I can boot to Windows and Ubuntu without problems; sometimes I have to manually edit grub because the hd numbers have changed or something...
But I really want this to work so I can share media files between Windows and Linux.

My dmraid -ay output:
ERROR: isw device for volume "Data" broken on /dev/sdb in RAID set "isw_ddfgjgjdhh_Data"
ERROR: isw: wrong # of devices in RAID set "isw_ddfgjgjdhh_Data" [1/3] on /dev/sdb
ERROR: isw device for volume "Data" broken on /dev/sda in RAID set "isw_cdjicdbgha_Data"
ERROR: isw: wrong # of devices in RAID set "isw_cdjicdbgha_Data" [2/3] on /dev/sda
ERROR: isw device for volume "Data" broken on /dev/sdc in RAID set "isw_cdjicdbgha_Data"
ERROR: isw: wrong # of devices in RAID set "isw_cdjicdbgha_Data" [2/3] on /dev/sdc

Pinocheckio (jannesl) wrote :

OK, it works now!! I just had to read more carefully :) Thanks all!
But I have another question: what happens if I upgrade to a new kernel version? Because I think this dmraid 45 patch is only available for kernel 2.6.24.16?

Phillip Susi (psusi) wrote :

It seems that this will not be fixed. The old IDE driver was always broken in that it ignored the HPA. When this was fixed in libata, it caused breakage for people who had already formatted their disks with the old driver, then upgraded, and were no longer able to access the whole disk. For this reason, Ubuntu has diverged from upstream and reverted to the old broken behavior by default. The workaround is to set the ignore_hpa parameter to explicitly direct the driver to respect the HPA, or to use a tool to permanently remove the HPA from the drive.
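
To see which behaviour a running system actually ended up with, the current value of the parameter can be read back. Below is a sketch that assumes the kernel exposes libata's parameters under /sys/module/libata/parameters/ (usual on kernels with sysfs, but not confirmed anywhere in this report). The meanings follow the description above: 0 respects the HPA, 1 ignores it and uses the full disk.

```python
import os

# Assumed location of the parameter on a running system.
PARAM_PATH = "/sys/module/libata/parameters/ignore_hpa"

MEANING = {
    0: "keep BIOS limits (HPA respected)",
    1: "ignore limits (HPA unlocked, full disk used)",
}


def read_ignore_hpa(path=PARAM_PATH):
    """Return the integer value of ignore_hpa, or None if it is unavailable."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())


def describe(value):
    """Turn a value from read_ignore_hpa() into a short explanation."""
    if value is None:
        return "libata parameter not found (module not loaded, or no sysfs)"
    return MEANING.get(value, "unexpected value %r" % value)


if __name__ == "__main__":
    print("ignore_hpa:", describe(read_ignore_hpa()))
```

If this reports that the HPA is being ignored after the options line was added, the likely cause in this thread was a missing update-initramfs run.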

Changed in dmraid:
importance: Undecided → Low
status: In Progress → Won't Fix

Pinocheckio (jannesl) wrote :

Mhm, well, my array is broken again, after updating the initramfs for something that had nothing to do with it :(
So when I try to rebuild, it takes only 1 or 2 disks into the array; also, Ubuntu has problems reading the right partition table from one disk. So I did a very thorough format this time that took a few hours for each disk. I hope it will be better after rebuilding.
But I have another problem: I can't find libata-options in /etc/modprobe.d anymore after doing modprobe libata ignore_hpa=0, so it writes this line to an empty file, but I guess there must be other things in it for it to work?

Too bad nobody is providing decent RAID drivers for Linux.

Bambi (bamthecute) wrote :

Hi,

I actually have the same problem as Pinocheckio: I can't find the libata-options file anywhere...
Does anyone have another idea on how to bypass the problem?

jbfoley (jbfoley) wrote :

I didn't have the file, either. All you have to do is create the empty (and writable) file libata-options in /etc/modprobe.d/ and then open it with your favorite text editor and add "options libata ignore_hpa=0". That's the only line that needs to be in there.

Once you've done that, the "sudo update-initramfs -u" command is also required to make this work.

masa (masa-betcher-online) wrote :

Hello,
I'm running the Ubuntu 8.04.1 live CD (hardy, 64-bit).

How I am booting into the live CD:

1. break=top (kernel startup line)
2. echo options libata ignore_hpa=0 > /etc/modprobe.d/libata-options
3. Enabling the universe repository in Synaptic
4. Refreshing the packages
5. Installing dmraid
6. Starting dmraid

I remember that dmraid wouldn't start in Ubuntu 8.04; now it finally starts the RAID, but...

1. Not all the partitions appear in /dev/mapper (the devices are there).
2. /etc/modprobe.d/libata-options doesn't exist.

isw_cbjdigdega_vb is a RAID 0 (partitions?)
isw_dgeabjbggg_vc is a RAID 1 array (looks OK)

See the attachment for the output.

Does somebody have a suggestion?

Regards,
masa

masa (masa-betcher-online) wrote :

It's really mysterious.

I just hot-unplugged the two HDDs (RAID 0) of the device /dev/mapper/isw_dgeabjbggg_vc:

1. dmraid -an
2. unplug
3. dmraid -ay

Now isw_cbjdigdega_vb is back!

root@ubuntu:~# ls /dev/mapper/
control isw_cbjdigdega_vb1 isw_cbjdigdega_vb3 isw_cbjdigdega_vb6
isw_cbjdigdega_vb isw_cbjdigdega_vb2 isw_cbjdigdega_vb5

The unplugged array was repaired by the Matrix Storage software (Windows).

Now I will try changing the SATA ports; maybe it helps.

Regards,
masa

Phillip Susi (psusi) wrote :

Masa, your issue does not appear to be related to this bug.

masa (masa-betcher-online) wrote :

Hello Phillip Susi,
I posted it because of an email from "fincan":

[quote]
Well, Ubuntu 8.04.1 is here, could you try it to check for the bug please? I
don't have the chance to try it now. Indeed, I wonder about the result.
[/quote]

It was just a test under 8.04.1.

If I boot with only one RAID combination, it seems to work.

@Pinocheckio, do you have multiple RAIDs?
Would you be good enough to unplug the two RAIDs in turn (to see which one works)?

I currently land in busybox if the two RAIDs are plugged in. :-(

Regards,
masa

jbfoley (jbfoley) wrote :

I have found a utility that will avoid you having to use the workaround for HPA. This utility will allow you to remove the HPA from your hard drive permanently in most cases. You have to have the hard drive on a SATA port that is in AHCI mode, not in RAID mode when you run this bootdisk and utility. My RAID was one with redundancy, and my motherboard has an extra separate SATA controller to add two more ports, so what I did was move the drive with the HPA over to that extra port and put that in AHCI mode, then used the utility to remove the HPA. After that, I put the drive back, and let the RAID rebuild, without any loss of data. This would also work if you put the drive on a separate computer with an AHCI controller. I don't know what would happen if you changed your RAID controller to AHCI mode, but I suspect it would break the array permanently, so make a backup if you have to do this.

Here is the utility:

http://www.hdat2.com/

This solved all my problems with Linux and unmodified Live CDs, too. Now it all just works. Anyone else with a Gigabyte board that has Xpress Recovery like mine should be careful never to turn that on again, or you'll have the HPA back.

I have read that there are some Dell computers out there that also have code in their boot sector to make them re-create the HPA even if you delete it. If you have one of those, you will have to do more work to make this permanent.

Pino (giuseppesantaniello) wrote :

Hi, sorry for my English.
I have the same problem: it won't recognize the RAID 0 array on my ICH9R (Gigabyte GA-P35-DS3R). A warm reboot gives "member offline" for both member disks, but a cold reboot restores them to normal operational status. With 8.10 Alpha 4, dmraid works fine.

In the official release, the libata workaround (adding the line options libata ignore_hpa=0 and running sudo update-initramfs -u) doesn't work for me. I've also tried adding break=top with the LiveCD, but at the initramfs prompt in busybox my USB keyboard doesn't work (with the BIOS legacy USB option activated).
Is it possible to obtain the same result just by modifying /etc/modprobe.d/options as below?

cat /etc/modprobe.d/options
# Enable double-buffering so gstreamer et. al. work
options quickcam compatible=2

# Default hostap to managed mode
options hostap_pci iw_mode=2
options hostap_cs iw_mode=2

# Stop auto-association.
# LP: #264104
options ipw2200 associate=0

# XXX: Ignore HPA by default. Needs to be revisted in jaunty
options libata ignore_hpa=1 --> 0

I've changed this value to 0 but after system reboot nothing changed....

Phillip Susi (psusi) wrote :

Did you run update-initramfs after editing the options?

Pino (giuseppesantaniello) wrote :

Oops, I forgot! Now it all just works :-) Thanks

P3P (p3p) wrote :

System:
* Ubuntu 8.10 amd64 (not installed yet)
* Intel Software Raid from ICH9R southbridge, ASUS P5K-E motherboard
* 4 hard disks with two raid arrays (one RAID0 array and one RAID5 array).

I am trying to install Ubuntu Intrepid on the fakeraid, but none of the solutions worked for me.
Dmraid fails with this error (repeated 8 times, once per disk and array):
ERROR: isw device for volume "zerovol" broken on /dev/sda in RAID set "isw_baeaijeeda_zerovol"
ERROR: isw: wrong # of devices in RAID set "isw_baeaijeeda_zerovol" [8/4] on /dev/sda

I have tried these:

* Solution for the live CD. I have tried booting with break=top and adding /etc/modprobe.d/libata-options, and also with /etc/modprobe.d/options, but dmraid prints the same error message.

* Using a USB persistent startup disk. I have changed /etc/modprobe.d/libata-options and /etc/modprobe.d/options and rebooted; the changes in those files are permanent, but dmraid fails. So I have followed a workaround to be able to execute "update-initramfs" without the "You are executing from a live CD" limitation (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/292159).
I have copied the output initrd.gz to the USB flash stick as /casper/initrd.gz, but dmraid still fails.

Any ideas on how to install Ubuntu to a RAID5 without any other hard disk outside the RAID arrays?
I have an older Ubuntu 8.04 already installed and working on the RAID0 and able to read/write the RAID5 array; it could help.

Regards.

NOTE: with the Ubuntu 8.04 release candidate dmraid works well, but the raid5 kernel module had an incorrect name, so I could not install. With Ubuntu 8.04.1 the module had the correct name, but dmraid did not work well, so I could not install. With Ubuntu 8.10 the kernel module is correct, but dmraid has another problem. I cannot remember which stupid thing failed in Ubuntu 7.10.
Conclusion: fakeraid support is too poor, while RAID is widespread even among desktop users. We need better support please!!

P3P (p3p) wrote :

I have installed the Ubuntu 8.10 system in the RAID5 with debootstrap from the Ubuntu installed on the RAID0.

But Ubuntu 8.10 Intrepid cannot boot because dmraid has this error:
ERROR: isw device for volume "zerovol" broken on /dev/sda in RAID set "isw_baeaijeeda_zerovol"
ERROR: isw: wrong # of devices in RAID set "isw_baeaijeeda_zerovol" [8/4] on /dev/sda

I have added libata hpa option and updated initramfs, but dmraid has the same error message.

My Ubuntu 8.04 install can read/write the arrays and dmraid works well, but Ubuntu 8.10 cannot.

Phillip Susi (psusi) wrote :

P3P, your issue appears to have nothing to do with the Host Protected Area. It looks like your problem is bug #292302.

Matthias (dupont-matthias) wrote :

Hello,

I've got the same issue with Jaunty Alpha 5. I tried the solution of adding "options libata ignore_hpa=0" to /etc/modprobe.d/options or libata-options, without success.

Before that I did the same thing on Intrepid Ibex and it worked without problems.

Has someone resolved the issue on Jaunty?

Thanks in advance...

PS: I did not forget to run update-initramfs -u each time.

Matthias (dupont-matthias) wrote :

Hello,

I found a solution for the issue on Jaunty, as the previous trick didn't work.

I added "libata.ignore_hpa=0" to the kernel command line and it seems to work; my RAID array isn't broken on shutdown.
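
Earlier in this thread `ata_ignore_hpa=0` showed up on the kernel command line yet the HPA was still unlocked, while `libata.ignore_hpa=0` is reported to work. Below is a small Python sketch to check which spelling a boot actually used. It assumes /proc/cmdline is readable and only inspects the text of the command line; it does not prove what the driver did with it. Both function names are made up for illustration.

```python
import os


def find_hpa_params(cmdline):
    """Pick out HPA-related parameters from a kernel command line.

    Returns a dict mapping each parameter name ending in 'ignore_hpa'
    (e.g. 'libata.ignore_hpa', 'ata_ignore_hpa') to its value.
    """
    found = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name.endswith("ignore_hpa"):
            found[name] = value
    return found


def libata_cmdline_value(cmdline):
    """Return the value given as libata.ignore_hpa, or None if absent."""
    return find_hpa_params(cmdline).get("libata.ignore_hpa")


if __name__ == "__main__":
    if os.path.exists("/proc/cmdline"):
        with open("/proc/cmdline") as handle:
            text = handle.read()
        print("HPA-related parameters:", find_hpa_params(text))
        print("libata.ignore_hpa:", libata_cmdline_value(text))
```

On the command lines quoted earlier in this report, the second function returns None, because they only contain the `ata_ignore_hpa` spelling.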

Kjow (antispammoni) wrote :

@Matthias:
"I added "libata.ignore_hpa=0" to the kernel command line"

Great! It worked very well for me on Ubuntu 9.04b (but grub fails to start after install).

masa (masa-betcher-online) wrote :

@all: it works by adding "libata.ignore_hpa=0" to the kernel command line.
After this you have access to the RAID in the live CD, so you can install...

After installing, you have to install dmraid in your new system (chrooting) and add the hpa=0 setting to the kernel line and in modprobe.d.

http://forum.ubuntuusers.de/topic/dmraid-ich9-gigabyte-ga-ep35-ds4/#post-1400431

magnum696 (plamannajr) wrote :

Hello,

I am currently using Ubuntu 10.04 using the P35-DS3R raid as described above. I have 2 drives in RAID 0 config and ubuntu installed on another IDE drive. I don't care about seeing the raid, but I am being affected by the HPA problem. It always seems to unlock one of my RAID drives which causes it to be seen as member offline on my next warm reboot. A cold reboot fixes the problem. I tried the method above to create libata-options and use the update methods, I have tried setting ignore_hpa=1 and 0 but the system still unlocks the drive. Am I missing something? I don't know how to add libata.ignore_hpa to the kernel command line and can't seem to find good documentation.

Jun 23 17:28:18 peter-desktop kernel: [ 1.336070] ata3.00: HPA unlocked: 312579695 -> 312581808, native 312581808
Jun 23 17:28:18 peter-desktop kernel: [ 1.336074] ata3.00: ATA-7: ST3160815AS, 4.AAA, max UDMA/133
Jun 23 17:28:18 peter-desktop kernel: [ 1.336077] ata3.00: 312581808 sectors, multi 0: LBA48 NCQ (depth 31/32)
Jun 23 17:28:18 peter-desktop kernel: [ 1.369000] ata3.00: configured for UDMA/133

EagleDM (eagle-maximopc) wrote :

As I already stated before, and again: look for the HDAT2 utility. It can be put
on a bootable CD (DOS based).

Configure your hard disk as "IDE" (for the moment), boot with a bootable CD
and run HDAT2 (I think HDAT2 comes with a bootable CD now).

Run HDAT2, and disable the HPA on both disks.

Your problem with RAID will be resolved forever, trust me. BTW, this
information will be written in the hard disk's firmware, so it does not reset
with a power off.

NeCod (necro-cod) wrote :

I have tried EagleDM's solution with HDAT2, and it doesn't work for me.
If I disable the HPA, I cannot boot to the RAID partitions: first member offline.

My motherboard is a GA-P35-DS4.

EagleDM (eagle-maximopc) wrote :

Wait.
If you actually disable the HPA you will lose your "current" RAID configuration,
since the BIOS actually STORES the RAID0 array information in the HPA.

What you HAVE to do to get rid of this nightmare is to back up the entire
array, disassemble the RAID setup, go to HDAT2, REMOVE the HPA, and create the
RAID0 again. This time, if you CREATE the RAID WITH the HPA permanently disabled,
it will no longer create the metadata in the same place, so you
will never have this HPA problem again with Ubuntu... ever. Trust me, it is
painful, but as long as you have the metadata inside the HPA, nightmare
is guaranteed.

EagleDM (eagle-maximopc) wrote :

You get the "member offline" when you disable the HPA because the metadata
for the RAID0 is actually being stored in the HPA of that disk. When you
disable it, you lose the configuration; that's why it won't work until you
actually remove the HPA and start a fresh RAID0. Once you do that, the metadata will no
longer be stored in the HPA, and since the HPA is non-existent your problems with
RAID will be gone forever.

NeCod (necro-cod) wrote :

Thanks EagleDM, i will try it.

Maybe I can avoid reinstalling everything, if i follow this post
http://forums.extremeoverclocking.com/showpost.php?p=3329132&postcount=6

jbfoley (jbfoley) wrote :

NeCod, do be careful not to let the mobo re-write that HPA. From my (very bad) experience with that Gigabyte board, a reset of the CMOS, a BIOS flash, a bad stick of RAM, or just playing around with the BIOS backup tools in the wrong way can cause it to do this without warning you. (the driver CD that came with my board was bootable and would do this.) If you have an option in BIOS to disable the backup BIOS to HDD feature, (I didn't, but have seen later boards that did) definitely disable it before using HDAT2, so that you won't have to go through this twice.

If that testdisk util doesn't work, look up R-studio. It's commercial software (and expensive,) but might be able to recover the data on your array if you're really hurting for it. You should still have an external storage device to which to dump the data, but this might save you a lot of reinstalls. Either way, watch for UUIDs to change when you rebuild. Check /etc/fstab and replace disk-by-uuid references to the array before trying to boot to a recovered drive.

Also, HDAT2 is now included on the Ultimate Boot CD, which is an extremely useful resource in its own right, and may contain other utilities to help you recover your array once HDAT2 is run.

http://www.ultimatebootcd.com/
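
The /etc/fstab check suggested above can be sketched in Python. This is only an illustration under assumptions: it treats the first fstab field as either `UUID=<id>` or a `/dev/disk/by-uuid/<id>` path, and it takes the UUIDs that currently exist from a directory listing of /dev/disk/by-uuid, which is present on udev-based systems but is not mentioned in this report. The function names are hypothetical.

```python
import os
import re

# First fstab field written either as UUID=<id> or as a by-uuid device path.
UUID_RE = re.compile(r"^(?:UUID=|/dev/disk/by-uuid/)(\S+)$")


def fstab_uuid_entries(fstab_text):
    """Return (uuid, mount_point) pairs for fstab lines that refer to a UUID."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        match = UUID_RE.match(fields[0])
        if match:
            entries.append((match.group(1), fields[1]))
    return entries


def stale_entries(fstab_text, present_uuids):
    """Return the entries whose UUID is not among present_uuids."""
    present = {u.lower() for u in present_uuids}
    return [(u, m) for (u, m) in fstab_uuid_entries(fstab_text)
            if u.lower() not in present]


if __name__ == "__main__":
    by_uuid = "/dev/disk/by-uuid"  # assumed location of UUID symlinks
    present = os.listdir(by_uuid) if os.path.isdir(by_uuid) else []
    if os.path.exists("/etc/fstab"):
        with open("/etc/fstab") as handle:
            for uuid, mount in stale_entries(handle.read(), present):
                print("not found:", uuid, "mounted at", mount)
```

Any entry it prints is one to compare against the rebuilt array before trying to boot from it.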
