Do NOT disable HPA by default -> leads to data loss

Bug #380138 reported by Kano on 2009-05-25
This bug affects 8 people
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Andy Whitcroft

Bug Description

Binary package hint: linux-image-2.6.30-6-generic

Unlocking HPA for everybody is bad because:

a) The BIOS, and therefore the bootloader, cannot access data behind the HPA. This will lead to boot problems when the Linux partition holding the kernel/bootloader lies in the unlocked area.

b) In most cases HPA problems are caused by the users themselves. There are jumpers on the hd which can be used to activate HPA, usually called a 32 GB limit. That is not needed in most cases. Unskilled users place the jumpers mirror-inverted, and that often leads to the problem. Just tell the user to check the jumpers.

c) Virtual BIOS solutions, like those GigaByte motherboards use

This is absolutely CRITICAL. When the hd is unlocked and fully used, new GigaByte boards tend to write 1.5 MB at the end of IDE or SATA drives in legacy mode as a BIOS backup. Then HPA is activated, and every OS has to respect this, otherwise it will definitely overwrite data. That is usually not directly visible to the user, but it definitely happens. Depending on the filesystem used for the partition covering that last MB, it will either kill data immediately or take time until the area is filled, but corruption is inevitable.

d) Fake RAID boot problems

Usually, when one or more hds that have HPA activated (because of c) or for other reasons) are used to create a RAID, the RAID will fail after booting because the HPA unlock status persists across a reboot. The only solution is to completely power down the system. That happens with my RAID setup, which can be accessed using dmraid without problems as long as HPA is NOT unlocked. When you look at dmraid errors involving HPA, you will read about users with GigaByte boards.

Even if you do not write anything onto the hd, the next reboot fails just because an Ubuntu live image was booted. Can you imagine what a person would think on seeing "RAID failed" at the next boot? I am sure it is nothing positive.

-> Leave the kernel default for libata and do NOT disable HPA. For the minority of users who are not in cases a) to d) and who cannot use a tool to disable HPA, tell them the kernel option to deactivate it.

Btw., if you wonder why there are systems out there with a 128 GB HPA limit, that's a Win2000 install restriction. If your hd were bigger, you would get data corruption. Those users can easily use software tools to remove the HPA if they do not want to install that old system again.

Kano (master-kanotix) on 2009-05-29
security vulnerability: yes → no
visibility: private → public
Andy Whitcroft (apw) wrote :

This report is referring to the commit below:

  commit abc55974c570b037c04d33b32f269cd9e9f11bee
  Author: Scott James Remnant <email address hidden>
  Date: Tue Mar 3 14:20:01 2009 +0000

    UBUNTU: SAUCE: libata: Ignore HPA by default.

    This was previously changed by using an "options" line in a modprobe.d
    file, however that practice is now deprecated. This is because module
    names, option names, their values and even their current defaults can
    all change inside the kernel and module-init-tools has never been kept
    in sync.

    In addition, changing the kernel means that the option change will apply
    if the module is built in by users or the OEM team.

    Signed-off-by: Scott James Remnant <email address hidden>
    Signed-off-by: Tim Gardner <email address hidden>

Andy Whitcroft (apw) on 2009-06-03
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
status: New → In Progress
importance: Undecided → Medium
Andy Whitcroft (apw) wrote :

The patch in question ended up in the kernel as part of the move of all configuration options out of modprobe.d/options to better support a portable udev/kernel configuration. In discussions with those who remember the history, the main issue is that this option probably should never have been enabled. That it has been enabled in the past makes it hard to turn off without breaking those users who took advantage of it being enabled in the first place. To turn this off we need a transitional plan to prevent us from breaking those affected users.

For the kernel side it looks like we need a third mode for the ignore_hpa option which leaves HPA enabled in the default case, but disables HPA if the partition table on this disk would require it. This would likely trigger when we have partitions which straddle the HPA boundary, and probably only for partition types we are capable of mounting.
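The decision such a third mode would make can be sketched as follows. This is only an illustration of the idea described above, not the kernel implementation; the `Partition` type, the function name, and the choice to ignore partitions that extend past the native size are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Partition:
    """Hypothetical partition record: first sector and size in sectors."""
    start: int
    length: int


def needs_hpa_unlock(partitions: Iterable[Partition],
                     visible_sectors: int,
                     native_sectors: int) -> bool:
    """Return True if some partition reaches past the HPA-limited size.

    visible_sectors is the size reported with the HPA in effect,
    native_sectors is the full size of the drive.
    """
    if native_sectors <= visible_sectors:
        return False  # no HPA in effect, nothing to unlock
    for part in partitions:
        end = part.start + part.length  # first sector after the partition
        # Only unlock for partitions that need the hidden area and still
        # fit on the native disk (assumption: anything larger is bogus).
        if visible_sectors < end <= native_sectors:
            return True
    return False
```

With a drive that reports 625140335 sectors under the HPA and 625142448 natively, a partition ending at sector 625140335 would leave the HPA alone, while one ending at 625142448 would trigger the unlock.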

The first step here will be to try to determine just how widespread this issue is, by enabling some direct report when the HPA is present and indicating whether we would have needed to break it for this disk, with a view to targeted testing with this patch.

EagleDM (eagle-maximopc) wrote :

HPA ignore by default is a DISGRACEFUL decision. Basically, if you are already using DMRAID for MatrixRAID, NVRAID, etc., it simply destroys the HPA information on boot, and this cannot be fixed.

After tedious testing I found out that once a single boot is made with the kernel ignoring HPA, the HPA information is totally destroyed. The only way the RAID can be used is by booting into Ubuntu every time you start the PC and THEN going to Windows to recover the RAID. With Intel Matrix RAID, once you boot a Linux kernel in the HPA-ignoring state you never get the HPA information back, and the PC will always fail with "Offline Member" even after powering down and up; I have already tried this numerous times. So one little error on your part, not setting libata ignore_hpa=0 on the kernel command line, and that's it: you have to rebuild your entire array for it to work again without having to boot into Linux every time you want to bring the disks online.

This is a disaster. I've been reporting this problem, which could be considered a SERIOUS BUG for us users with RAID, and to this day there is no solution, even on Karmic.

A simple "Boot into RAID system" entry on the welcome screen at boot could fix this horrendous problem; at least give us a choice without users having to resort to typing kernel parameters.

Before Jaunty it was moved from the kernel back to a libata module option in modprobe.d; from Jaunty on, it was moved back to the kernel as the libata ignore default. This situation cannot be delayed any further. With each change, the user has to be aware of Ubuntu's internal decisions prior to booting a LiveCD, or else deactivate their RAID system and make sure the option is set correctly BEFORE booting the LiveCD: checking the option once the LiveCD is loaded, rebooting with the option enabled and then installing. This hassle should NOT be there for the majority of us RAID users, who are ever increasing in the desktop space.

Could my post be considered? Do you realize how far this situation has escalated, and that it could be so easily fixed with a simple "kernel option" at boot?

Could anyone tell me how to do it myself and propose it back to the ubuntu guys for consideration?

I do not want to suffer anymore from this, and I'm prepared to help in any way possible.

PLEASE DO SOMETHING about it :(

Phillip Susi (psusi) wrote :

The HPA is not removed from the drive, just temporarily disabled, so it will return after a hard power cycle. This does, however, have the undesired result of breaking Windows if you warm boot. You would think that the reboot would reset the drive, but alas...

EagleDM (eagle-maximopc) wrote :

I do not agree.

If you use the program HDAT2, boot from that and DISABLE HPA, the program will disable it on the Firmware Level.

At least in my case, with a Velociraptor RAID0 Array, both HPA's on both disks were permanently disabled, I no longer have problems with Ubuntu since I disabled it.

Phillip Susi (psusi) wrote :

On 4/7/2010 12:14 PM, EagleDM wrote:
> I do not agree.
>
> If you use the program HDAT2, boot from that and DISABLE HPA, the
> program will disable it on the Firmware Level.
>
> At least in my case, with a Velociraptor RAID0 Array, both HPA's on both
> disks were permanently disabled, I no longer have problems with Ubuntu
> since I disabled it.

Yes, and what does HDAT2 have to do with libata? It has an option to
make the change permanent, libata does not.

Kano (master-kanotix) wrote :

You don't need that HDAT2 tool; hdparm can set the HPA permanently (or disable it when set to the full size). By default it is a temporary change, like the disabling that is done now:

hdparm -N /dev/sdX

shows HPA

hdparm -N xxx /dev/sdX

sets HPA to xxx temp. or

hdparm -N pxxx /dev/sdX

for permanent change. Just set it to max value shown in the first step.

But look at my comment c), which becomes critical as soon as somebody boots the system with a reset GigaByte BIOS that runs the hds in IDE mode, not AHCI. Then the HPA will be added again. When that happens with RAID drives you have to recreate the RAID; if you are lucky you can find the partitions again with gpart/testdisk and only need to reinstall a bootloader, but do nothing without a backup.
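For anyone scripting the first step, here is a minimal sketch that pulls the two sector counts out of the `hdparm -N` report. It assumes the report contains a line of the form `max sectors = <current>/<native>`; the function names and the sample line are made up for illustration, and the wording printed by a given hdparm version may differ.

```python
import re
from typing import Optional, Tuple

# Assumed report shape: "max sectors = <current>/<native>, HPA is ..."
_MAX_SECTORS = re.compile(r"max sectors\s*=\s*(\d+)/(\d+)")


def parse_max_sectors(report: str) -> Optional[Tuple[int, int]]:
    """Return (current, native) sector counts, or None if not found."""
    match = _MAX_SECTORS.search(report)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def hidden_sectors(report: str) -> Optional[int]:
    """Return how many sectors the HPA hides, or None if the report
    is missing or inconsistent (native smaller than current)."""
    parsed = parse_max_sectors(report)
    if parsed is None:
        return None
    current, native = parsed
    if native < current:
        return None  # the drive or driver reported nonsense
    return native - current


if __name__ == "__main__":
    sample = " max sectors   = 625140335/625142448, HPA is enabled"
    print(hidden_sectors(sample))
```

A result of 0 means the drive is already at its full size; None means the report should not be trusted as input for a new `-N` value.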

Yes, you're right, and HPA disabled in firmware is useful as long as you
don't move the HDD to another machine, but I suppose that as long as you
keep things the same, this solution is far simpler than messing with
libata.

Just my 2 cents.

On Wed, Apr 7, 2010 at 2:08 PM, Phillip Susi <email address hidden> wrote:

> On 4/7/2010 12:14 PM, EagleDM wrote:
> > I do not agree.
> >
> > If you use the program HDAT2, boot from that and DISABLE HPA, the
> > program will disable it on the Firmware Level.
> >
> > At least in my case, with a Velociraptor RAID0 Array, both HPA's on both
> > disks were permanently disabled, I no longer have problems with Ubuntu
> > since I disabled it.
>
> Yes, and what does HDAT2 have to do with libata? It has an option to
> make the change permanent, libata does not.

Phillip Susi (psusi) wrote :

So Andy, it has been some time now, how is this coming? Think we can stop diverging from upstream on this with maverick?

NeCod (necro-cod) wrote :

I have the following configuration:

Gigabyte P35-DS4 Bios F14c
2 x Seagate 320 GB RAID 0 Intel Matrix Storage:
    Dual Boot XP + Windows 7
1 x Seagate 160 GB
    Ubuntu 10.04

When I boot into Ubuntu and reboot, the first RAID member fails (it says Non-Raid Disk) and I cannot boot the Windows partitions.
I need to power off the computer for the first member to reappear.

With HDAT2 I disabled the HPA, and the same problem occurs: the first RAID member fails (it says Non-Raid Disk) and I cannot boot the Windows partitions. Obviously, Intel Matrix Storage needs the HPA.
I have reactivated the HPA to recover the RAID.

The hdparm command doesn't show the HPA:

$ sudo hdparm -N /dev/sda
/dev/sda:
 max sectors = 625142448/4385456(625142448?), HPA setting seems invalid (buggy kernel device driver?)

but kern.log shows it:

$ grep -i HPA /var/log/kern.log
Jul 4 12:58:42 gubi kernel: [ 1.373053] ata3.00: HPA unlocked: 625140335 -> 625142448, native 625142448

What should I do?

I prefer to keep the RAID, but I need Ubuntu.
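The kern.log line above already says how large the hidden area is. A small sketch of the arithmetic, assuming 512-byte sectors (the log does not state the sector size) and using a hypothetical helper name:

```python
import re
from typing import Optional, Tuple

# Matches the libata message quoted above, e.g.
# "ata3.00: HPA unlocked: 625140335 -> 625142448, native 625142448"
_HPA_UNLOCKED = re.compile(
    r"HPA unlocked:\s*(\d+)\s*->\s*(\d+),\s*native\s*(\d+)"
)

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def hpa_size_from_log(line: str) -> Optional[Tuple[int, int]]:
    """Return (hidden_sectors, hidden_bytes) for an 'HPA unlocked' line,
    or None if the line does not contain that message."""
    match = _HPA_UNLOCKED.search(line)
    if match is None:
        return None
    limited, unlocked, _native = (int(g) for g in match.groups())
    sectors = unlocked - limited
    return sectors, sectors * SECTOR_BYTES


if __name__ == "__main__":
    line = ("Jul 4 12:58:42 gubi kernel: [ 1.373053] ata3.00: "
            "HPA unlocked: 625140335 -> 625142448, native 625142448")
    print(hpa_size_from_log(line))  # (2113, 1081856)
```

For this drive that is 2113 sectors, a little over 1 MiB at the assumed sector size, sitting at the very end of the disk.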

WiLLiTo (victor-gg83) wrote :

All I did was...

1- Disable BIOS backup to disk in the GigaByte BIOS, if enabled

2- Disable HPA with HDAT2 (the disk must be in SATA mode, not RAID)

3- Switch back to RAID mode and recreate the RAID

5- Partition the RAID with another OS. (Lucid has a bug partitioning RAID disks; I used a Fedora 13 live USB)

6- Now install Ubuntu (I had fewer problems installing Karmic and upgrading than doing a clean Lucid install)

I hope that works for you too

regards

Phillip Susi (psusi) wrote :

It appears that upstream has implemented a change where the HPA is automatically unlocked if needed to access partitions described in the partition table, rendering the original patch obsolete. Can we please get this dropped now?

Kjow (antispammoni) wrote :

With Ubuntu 9.04 (and thereabouts), to solve the RAID corruption problem, I simply added this to the boot string (F6):

libata.ignore_hpa=0

With Ubuntu 10.04 (and now with 10.10) this parameter doesn't work anymore, and every time I restart the PC from Ubuntu the RAID is corrupted, so I have to power off the PC to get everything back to normal.

A O (aodrzywolek) wrote :

Any known workaround for 10.04? This is so annoying. I have a P35-DS4 with two HDDs in RAID1, with ALL my important data on it. Fortunately, Windows 7 is installed on a separate physical SSD, but previously many system files (TEMP, hiberfile, pagefile) were on the RAID. It caused numerous surprising errors before I realized that the RAID was no longer there...

Phillip Susi (psusi) wrote :

sudo bash -c 'echo options libata ignore_hpa=0 > /etc/modprobe.d/libata.conf' should do the trick.
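To confirm which setting is actually in effect after a change like this, the value can be read back at runtime. The sketch below assumes libata exposes the parameter under `/sys/module/libata/parameters/ignore_hpa`; that path and the helper name are assumptions of this example, so verify them on your own system.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the libata module parameter.
DEFAULT_PARAM = "/sys/module/libata/parameters/ignore_hpa"


def read_ignore_hpa(param_path: str = DEFAULT_PARAM) -> Optional[int]:
    """Return the current ignore_hpa value as an int, or None if the
    file is missing, unreadable, or does not contain a number."""
    try:
        return int(Path(param_path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_ignore_hpa()
    if value is None:
        print("ignore_hpa not readable (libata parameter not exposed?)")
    elif value == 0:
        print("HPA limits are being respected (ignore_hpa=0)")
    else:
        print(f"HPA limits are being ignored (ignore_hpa={value})")
```

If this still reports a non-zero value after a reboot, the option was probably not picked up, for example because libata is built into the kernel rather than loaded as a module.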

Andy Whitcroft (apw) wrote :

It seems that the desired functionality has hit mainline (v2.6.35 and later), that HPA will start enabled but will be unlocked should the partitions require it:

  commit d8d9129ea28e2177749627c82962feb26e8d11e9
  Author: Tejun Heo <email address hidden>
  Date: Sat May 15 20:09:34 2010 +0200

    libata: implement on-demand HPA unlocking

This should mean we can drop the patch below:

  commit eaea8ccbd3adf0b36942ae834eaa825094772c95
  Author: Scott James Remnant <email address hidden>
  Date: Tue Mar 3 14:20:01 2009 +0000

    UBUNTU: SAUCE: (no-up) libata: Ignore HPA by default.

Propose we drop this in Natty and see what happens. A call for testing will likely be appropriate.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.37-12.26

---------------
linux (2.6.37-12.26) natty; urgency=low

  [ Andy Whitcroft ]

  * rebase to v2.6.37-rc8
  * [Config] armel -- reenable omap flavour
  * [Config] disable CONFIG_MACH_OMAP3517EVM to fix FTBS on armel omap
  * [Config] disable CONFIG_GPIO_VX855 to fix FTBS on omap armel
  * [Config] disable CONFIG_WESTBRIDGE_ASTORIA to fix FTBS on omap armel
  * [Config] disable CONFIG_TI_DAVINCI_EMAC to fix FTBS on omap armel
  * rebase to mainline 989d873fc5b6a96695b97738dea8d9f02a60f8ab
  * [Config] track missing modules
  * rebase to v2.6.37 final

  [ Chase Douglas ]

  * SAUCE: (drop after 2.6.37) HID: magicmouse: Don't report REL_{X, Y} for
    Magic Trackpad

  [ Stefan Bader ]

  * Revert "SAUCE: blkfront: default to sd devices"
    - LP: #684875

  [ Tim Gardner ]

  * Revert "SAUCE: (no-up) libata: Ignore HPA by default."
    - LP: #380138
  * [Config] Added autofs4.ko to -virtual flavour
    - LP: #692917

  [ Upstream Kernel Changes ]

  * Add support for Intellimouse Mode in ALPS touchpad on Dell E2 series
    Laptops
    - LP: #632884

  [ Upstream Kernel Changes ]

  * rebase to v2.6.37-rc8
  * rebase to mainline 989d873fc5b6a96695b97738dea8d9f02a60f8ab
  * rebase to v2.6.37 final
 -- Andy Whitcroft <email address hidden> Thu, 23 Dec 2010 18:34:13 +0000

Changed in linux (Ubuntu):
status: In Progress → Fix Released