hdparm's IDENTIFY DEVICE command breaks firewire devices

Bug #548513 reported by Jerone Young on 2010-03-26
This bug affects 13 people
Affects Status Importance Assigned to Milestone
OEM Priority Project
Critical
Unassigned
hdparm (Ubuntu)
Undecided
Unassigned
Lucid
Undecided
Unassigned
linux (Ubuntu)
Medium
Unassigned
Lucid
Medium
Unassigned

Bug Description

Firewire hard disks are broken under 10.04. I have tried attaching a firewire hard disk to several different firewire controllers and it does not work.

Sometimes it will show up in the kernel messages and you will see the partitions in /proc/partitions. But when you try to access the disk in any way, it fails.

This is a requirement for Ubuntu Certification for machines with Firewire and needs to be fixed by release.

Jerone Young (jerone) wrote :
Changed in oem-priority:
status: New → In Progress
Jerone Young (jerone) on 2010-03-26
Changed in oem-priority:
status: In Progress → New
importance: Undecided → Critical
Changed in linux (Ubuntu):
importance: Undecided → Critical
status: New → Confirmed
tags: added: regression-potential
Jeremy Kerr (jk-ozlabs) wrote :

Looks like this may not affect all controllers/disks; it seems to work here with a generic SBP2 disk interface. I've tested with both an OHCI and a PCILynx firewire controller.

Jerone - could you try with a different FW disk and see if the problem still occurs? It may help to isolate the root cause.

Jerone Young (jerone) wrote :

@Jeremy
      I'll have others give it a try with their firewire disks. I only have a Western Digital My Book 1.5gb disk, which is the most common consumer case today.

      Works fine under 9.10 on the same machine(s). Under 10.04, no go.

tags: added: kernel-series-unknown
Jeremy Kerr (jk-ozlabs) wrote :

Just a stab in the dark, but can you try this with the disk plugged in to an external power source (if you're not doing that already)? It looks like the FW bus is being reset, and so the SCSI layer is creating multiple devices when the requests time out.

Brian Murray (brian-murray) wrote :

For what it's worth, my firewire iPod works fine with the following controller:

05:02.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)

dmesg output:

[90320.260054] ieee1394: The root node is not cycle master capable; selecting a new root node and resetting...
[90320.581099] ieee1394: Node changed: 0-00:1023 -> 0-01:1023
[90329.010092] ieee1394: Node changed: 0-01:1023 -> 0-00:1023
[90329.648654] ohci1394: fw-host0: SelfID received, but NodeID invalid (probably new bus reset occurred): 0800FFC0
[90330.880082] ieee1394: Error parsing configrom for node 0-00:1023
[90330.880155] ieee1394: Node changed: 0-00:1023 -> 0-01:1023
[90336.890615] ieee1394: Node added: ID:BUS[0-00:1023] GUID[000a270002558f57]
[90336.893321] scsi7 : SBP-2 IEEE-1394
[90336.893430] ieee1394: sbp2: Workarounds for node 0-00:1023: 0x9 (firmware_revision 0x0a2700, vendor_id 0x000a27, model_id 0x000000)
[90338.015424] ieee1394: sbp2: Logged into SBP-2 device
[90338.015484] ieee1394: sbp2: Node 0-00:1023: Max speed [S400] - Max payload [2048]
[90338.017806] scsi 7:0:0:0: Direct-Access-RBC Apple iPod 1.51 PQ: 0 ANSI: 2
[90338.018107] sd 7:0:0:0: Attached scsi generic sg8 type 14
[90338.442552] sd 7:0:0:0: [sdg] Adjusting the sector count from its reported value: 39063024
[90338.442561] sd 7:0:0:0: [sdg] 39063023 512-byte logical blocks: (20.0 GB/18.6 GiB)
[90338.444521] sd 7:0:0:0: [sdg] Test WP failed, assume Write Enabled
[90338.446321] sd 7:0:0:0: [sdg] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[90338.449551] sd 7:0:0:0: [sdg] Adjusting the sector count from its reported value: 39063024
[90338.451926] sd 7:0:0:0: [sdg] Test WP failed, assume Write Enabled
[90338.453762] sdg: sdg1 sdg2
[90338.471450] sd 7:0:0:0: [sdg] Adjusting the sector count from its reported value: 39063024
[90338.473488] sd 7:0:0:0: [sdg] Test WP failed, assume Write Enabled
[90338.475289] sd 7:0:0:0: [sdg] Attached SCSI removable disk

summary: - Firewire disks not working under 10.04
+ Some firewire disks not working under 10.04

@Jeremy
             The drive requires an external power source, so I tried that case. I also played around with powering it up and down.

@Brian
            Actually, are you able to mount and write to the disk? I saw this too with one system .. but when it tried to access the disk, it was actually unable to.

Chris Wayne (cwayne18) wrote :

I've seen this issue with a Ricoh e832 firewire card. Same behavior (I can see the disk in /proc/partitions, though gparted can't see it and it can't be mounted). Attaching dmesg.

Chris Wayne (cwayne18) wrote :

@Jerone
Unfortunately, I think the firewire disk we have here is the same one you have, so I'd expect the results to be similar

Brian Murray (brian-murray) wrote :

@Jerone - I was able to write to the disk.

Manoj Iyer (manjo) wrote :

@jerone can you please try the kernel in http://people.ubuntu.com/~manjo/lp548513-lucid/ and report back here?

Changed in linux (Ubuntu Lucid):
status: Confirmed → Incomplete
Jerone Young (jerone) wrote :

@Manoj

            Tried it out. Did not work. I've attached the dmesg.

             Still the same issue: I can see the device .. but when something tries to access it, it fails.

             Manoj, also if you need a machine or device, call me or email me. We can meet up and I can let you borrow a device to help get to the problem faster.

Changed in linux (Ubuntu Lucid):
status: Incomplete → Confirmed
Manoj Iyer (manjo) wrote :

Looks like we are using the older firewire stack; the older stack is reported not to work as well with the newer kernel (2.6.32).

https://ieee1394.wiki.kernel.org/index.php/Release_Notes#Linux_2.6.32 (scroll up for the 2.6.33 notes).

I tried blacklisting the old stack and using the new stack, but I can't see the disks anymore.

I will continue to look at this some more.

Jerone Young (jerone) on 2010-04-01
Changed in oem-priority:
status: New → In Progress
Jerone Young (jerone) wrote :

@Manoj
            I have a theory after some other partially related issues with a USB 3.0 drive, found in a discussion on the udev mailing list. Can you remove /lib/udev/85-hdparm.rules & reboot and see if it works?

           It may be hdparm at fault?

Jeremy Teale (jteale) wrote :

sudo mv /lib/udev/rules.d/85-hdparm.rules /lib/udev/rules.d/85-hdparm.disabled fixes this issue for me. I am using the new firewire subsystem.

jeremy@hephaestus:~$ lsmod | grep firewire
firewire_sbp2 15009 2
firewire_ohci 25343 0
firewire_core 51537 2 firewire_sbp2,firewire_ohci
crc_itu_t 1715 1 firewire_core

Jeremy Teale (jteale) wrote :

Please see bug 515023

Jerone Young (jerone) wrote :

@Jeremy
               Ok, this looks to be an issue with "hdparm" in user space. Can you test on a clean install of Lucid with the kernel currently shipping?

Matthias Klose (doko) wrote :

Renaming the file mentioned in #15 works for me as well:
05:00.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 04)

Jerone Young (jerone) wrote :

The issue here is that the "hdparm" command is launched by udev. It then attempts to activate ATA passthrough, which causes the drive to fail. See bug 515023.

But disabling hdparm from being launched by udev gets around the issue. Then the drive works fine.

I tested the latest hdparm (9.28) and still see it trying to do ata pass through.

Jerone Young (jerone) wrote :

I've found, what I think, is a good solution to the problem for now.

The hdparm udev rule is what launches hdparm. What hdparm is trying to do is enable ATA passthrough for drives that claim to support it (most newer drives do). The problem is there doesn't seem to be proper kernel support, or many drives really don't support it.

What I have done is find a way to identify firewire & usb drives and tell the udev rule not to run hdparm to activate ATA passthrough.

So changing /lib/udev/rules.d/85-hdparm.rules to:
ACTION=="add", SUBSYSTEM=="block", KERNEL=="[sh]d[a-z]", \
 ENV{ID_PATH}!="pci-*-ieee1394-*|pci-*-usb-*", \
 RUN+="/lib/udev/hdparm"

* This fix does expose a separate issue in udev for firewire drives: udev tries to create a second symbolic link, causing a kernel warning from sysfs. The drive works, though. I will get this 2nd issue resolved also.

Also here is a clip from the udev db of how the drive is exposed in udev:
P: /devices/pci0000:00/0000:00:1c.4/0000:0d:00.3/fw-host0/0090a9236e69e657/0090a9236e69e657-0/host6/target6:0:0/6:0:0:0/block/sdb
N: sdb
W: 53
S: block/8:16
S: disk/by-id/ieee1394-0090a9236e69e657:0:0
S: disk/by-path/pci-0000:0d:00.3-ieee1394-0x0090a9236e69e657:0:0
E: UDEV_LOG=3
E: DEVPATH=/devices/pci0000:00/0000:00:1c.4/0000:0d:00.3/fw-host0/0090a9236e69e657/0090a9236e69e657-0/host6/target6:0:0/6:0:0:0/block/sdb
E: MAJOR=8
E: MINOR=16
E: DEVNAME=/dev/sdb
E: DEVTYPE=disk
E: SUBSYSTEM=block
E: ID_SCSI=1
E: ID_VENDOR=WD
E: ID_VENDOR_ENC=WD\x20\x20\x20\x20\x20\x20
E: ID_MODEL=My_Book
E: ID_MODEL_ENC=My\x20Book\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_REVISION=1028
E: ID_TYPE=disk
E: ID_BUS=scsi
E: ID_PATH=pci-0000:0d:00.3-ieee1394-0x0090a9236e69e657:0:0
E: ID_PART_TABLE_TYPE=dos
E: UDISKS_PRESENTATION_NOPOLICY=0
E: UDISKS_PARTITION_TABLE=1
E: UDISKS_PARTITION_TABLE_SCHEME=mbr
E: UDISKS_PARTITION_TABLE_COUNT=1
E: DEVLINKS=/dev/block/8:16 /dev/disk/by-id/ieee1394-0090a9236e69e657:0:0 /dev/disk/by-path/pci-0000:0d:00.3-ieee1394-0x0090a9236e69e657:0:0
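The glob match the rule above relies on can be sketched as a plain shell `case` over the ID_PATH value shown in the udev db (a sketch only; the real matching is done by udev itself, and `should_run_hdparm` is a hypothetical helper name, not part of any package):

```shell
#!/bin/sh
# Sketch of the test the modified udev rule performs: skip hdparm for
# devices whose ID_PATH marks them as firewire (ieee1394) or usb.
should_run_hdparm() {
    case "$1" in
        pci-*-ieee1394-*|pci-*-usb-*) return 1 ;;  # removable bus: skip
        *)                            return 0 ;;  # e.g. plain SATA: run
    esac
}

# ID_PATH taken from the udev db dump above (WD My Book over firewire)
if should_run_hdparm "pci-0000:0d:00.3-ieee1394-0x0090a9236e69e657:0:0"; then
    echo "would run hdparm"
else
    echo "skipping hdparm"
fi
```

Run against the My Book's ID_PATH this prints `skipping hdparm`; a plain SATA path such as `pci-0000:00:1f.2-scsi-0:0:0:0` would take the other branch.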

Changed in hdparm (Ubuntu Lucid):
status: New → Confirmed
Jerone Young (jerone) wrote :

Here is a patch for the hdparm package.

Colin Watson (cjwatson) wrote :

Yes, this is probably an appropriate workaround (assuming it works, but it does look plausible) - but hdparm is (potentially) doing other things as well as enabling ATA passthrough, so I'd like to allow the active upstream discussion of this problem to proceed a bit longer before we slam a workaround into place. If there is no alternative by 10.04 final, then we can apply this.

tags: added: patch
Klaus Doblmann (moviemaniac) wrote :

Confirming that the workaround by Jerone works, thanks!

Jerone Young (jerone) wrote :

Just to let everyone know: the comment I made about a 2nd issue with udev was invalid. It was caused by something in my other install.

Though on my new install apport is giving a false error about the kernel; no errors are reported in dmesg, and the firewire drive is working fine (with the workaround fix).

Huygens (huygens-25) wrote :

I had reported back in Bug #543488 that with Lucid it was not possible to mount some firewire hard disks with either the old or the new stack. I made that test with Lucid Beta 1.

Today, I'm updating the status with Lucid Beta 2:
It is working out-of-the-box (without patch or other configuration), however Lucid Beta 2 is still using the old firewire stack by default.

So I have tried to switch to the new kernel stack (just as I tried on Karmic, Bug #529524). However, this is still not working on Lucid Beta 2. See my report on Bug #529524.

N.B.: My firewire external HDD is a Western Digital MyBook with an external power supply. I'm plugging it into a Lenovo Thinkpad T500 using a 6-pin to 4-pin firewire cable (the T500 has only a 4-pin plug).

Huygens (huygens-25) wrote :

Another test with Lucid Beta 2...
I have migrated to the new firewire kernel stack and applied the above-mentioned patch. In this case, I can use the firewire HDD.

So here is the whole summary:
 1. with Lucid Beta 2, a firewire HDD uses the old kernel stack and works out-of-the-box
 2. if one wants to use the new stack, one has to apply the above-mentioned patch as well as follow the Juju Migration Guide (http://ieee1394.wiki.kernel.org/index.php/Juju_Migration)

Jeff (jdorenbush) wrote :

I too was experiencing this problem on Ubuntu 10.04 with my Western Digital 500GB My Book external hard drive connected via firewire.

I can confirm that after manually patching the file my external drive is recognized by Ubuntu. (Note: installing Jerone's patch via the command line would not work for me.)

Matt Zimmerman (mdz) on 2010-04-13
Changed in linux (Ubuntu Lucid):
status: Confirmed → Triaged
Andy Whitcroft (apw) on 2010-04-13
tags: added: lucid
removed: kernel-series-unknown

Bug 562903 has been marked as a duplicate of this bug and is resolved by the workaround patch I posted.

Not sure why hdparm is run. But it causes many usb & firewire devices that have already been set up fine by the kernel to get screwed up.

summary: - Some firewire disks not working under 10.04
+ Some firewire & usb disks not working under 10.04

So this is failing for me too; the device is being offlined.

It would help to know what hdparm is actually doing. This is on 10.04 on my T510:

<10:30:50>udev$ diff -u hdparm.orig hdparm
--- hdparm.orig 2010-04-14 10:26:37.233760588 -0400
+++ hdparm 2010-04-14 10:27:40.262576808 -0400
@@ -27,6 +27,7 @@

 OPTIONS=$(hdparm_options $DEVNAME)
 if [ -n "$OPTIONS" ]; then
+ /usr/bin/logger -p info "Executing hdparm with options $OPTIONS on $DEVNAME"
  /sbin/hdparm $OPTIONS $DEVNAME 2>/dev/null
 fi

<10:27:40>udev$ tail -f /var/log/messages | grep hdparm

Apr 14 10:28:08 nautilus logger: Executing hdparm with options -B254 on /dev/sdc

According to the hdparm manpage, -B configures power management:

"... The highest degree of power management is attained with a setting of 1, and the highest I/O performance with a setting of 254. A value of 255 tells hdparm to disable Advanced Power Management altogether on the drive (not all drives support disabling it, but most do)"

This value is being determined by /lib/hdparm/hdparm-functions ...

    egrep -v '^[[:space:]]*(#|$)' /etc/hdparm.conf |
    {
        # set our default global apm policy here.
        if hdparm_is_on_battery; then
            hdparm_set_option -B128
        else
            hdparm_set_option -B254
        fi

So you might get different results depending on whether your laptop is on ac power or not.

It looks like we could completely work around this by adjusting the defaults in /etc/hdparm.conf.

We'll see how this behavior differs from 9.10.

So 9.10 doesn't do any of this APM configuration; in fact hdparm does nothing at all in this instance.
When I configure /etc/hdparm.conf to use the same value found in 10.04, it does *not* fail.

It's a little naughty to set a global policy in a functions file on 10.04. I can only override the new global default; I can't completely disable it from even attempting the operation. A close substitute to skipping apm config is to edit /etc/hdparm.conf and set "apm = off", which is an illegal value and will cause hdparm to exit harmlessly before it does anything to the drive. With this workaround I'm able to mount and use the WD My Book without any issues.
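For reference, the workaround described above amounts to a single line in /etc/hdparm.conf (a sketch; "off" is deliberately not a valid APM level, which is what makes the udev helper bail out):

```
# /etc/hdparm.conf
# "off" is not a valid APM level (valid values are 1-255), so the udev
# hdparm helper exits before issuing any command to the drive.
apm = off
```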

This doesn't seem like a scsi mid-layer issue to me and is likely confined to the firewire storage driver.

@Jerone

Take a closer look at your patch; we don't want to disable hdparm as a whole from running, we just want to be able to "opt out" of this global APM config, which you can't do without editing the function file /lib/hdparm/hdparm-functions. A good compromise would be to add smarts to the config file parser that allow one to disable the global apm policy. It could be as simple as setting apm = skip and then dropping it completely from the argument list. hdparm is a very necessary utility for things like disabling the cache.

Your approach would leave the above-average user scratching their head, because they would make changes in /etc/hdparm.conf and those changes would never take effect.

Jerone Young (jerone) wrote :

@Peter
           The issue is we do not know if apm is the sole cause of all the issues. We would need testing from users whose firewire & usb drives are affected. There is also the issue that hdparm is trying to make drives do ATA passthrough and they are failing with that also ... see this bug:

https://bugs.launchpad.net/bugs/515023

@Jerone

Actually, we do. If you examine how /lib/udev/hdparm operates you'll find that it does nothing unless the config file tells it to, except for this global apm policy, which was embedded in /lib/hdparm/hdparm-functions. The default hdparm.conf file does nothing at all except set the 'quiet' flag.

There's nothing evil about ATA passthrough; these *are* ATA drives using firewire or usb as the transport.

Finally, we don't have to guess at why usb is having problems either; I've shown how easy it is to instrument the udev scripts. Let's get some more data from those users.

Jerone Young (jerone) wrote :

@Peter
        The problem is that it is breaking these drives, not that it is evil. The only thing your comment says is to turn off apm. The problem with that is it does it for ALL drives, and this works fine for regular sata & esata drives.

         You still have not found an overall solution for the problems.

Jerone Young (jerone) wrote :

This is a post on the linux-hotplug mailing list that goes deep into issues with usb drive controllers & hdparm. The thread even includes the hdparm maintainer .. who has some fixes to make .. though this identifies yet another issue caused by hdparm (it's a really long thread):

http://www.spinics.net/lists/hotplug/msg03547.html


@Jerone
We already discussed this offline, but for everyone else's benefit: I was commenting on the feasibility of the workaround and how best to optimize it. Nowhere did I ever state that this was an acceptable alternative to seeking the root cause.

So my initial stab at instrumenting this has been interesting; the passthrough commands are nothing exotic, and other ATA passthrough commands are failing too, like configuring the cache.
...

This was plain jane "identify device" dropping dead after the drive was up and stable.

Apr 15 17:22:00 ubuntu kernel: [12081.321383] ieee1394: sbp2: aborting sbp2 command
Apr 15 17:22:00 ubuntu kernel: [12081.321389] sd 22:0:0:0: [sdb] CDB: ATA command pass through(16): 85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00

The second byte from the end is the actual ATA command being executed.
IDENTIFY DEVICE = ECh
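That decoding step can be reproduced with a few lines of shell on the logged CDB (a sketch; the CDB string is copied from the kernel message above):

```shell
#!/bin/sh
# Pull the ATA opcode out of a logged ATA PASS-THROUGH(16) CDB.
# For the 16-byte form, the ATA command is the 15th byte
# (second from the end); here 0xEC = IDENTIFY DEVICE.
cdb="85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00"
set -- $cdb      # split the CDB into positional parameters
shift 14         # drop the first 14 bytes
opcode=$1
echo "ATA command: 0x$opcode"
```

This prints `ATA command: 0xec`.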

http://www.spinics.net/lists/hotplug/msg03552.html

Alan makes a good point regarding the ordering of the commands executed. I can reproduce
something similar here using sg_sat_set_features

sg_sat_set_features -f 82 /dev/sdb

which is trying to disable the cache on the drive.

The logs are showing a SCSI inquiry timing out, which should be translated to an identify device; I never get to the actual "set features" command.

I'm working on some more targeted instrumentation to see how the ordering affects things. It'd
be great if I had an SAS/SATA analyzer here.

Thing is, if this were a general SATL issue we'd be seeing it everywhere; sg_sat_identify works fine for the host drive. This firewire stuff must be implementing its own version. Sigh...

Reading further into their analysis, it's this specific USB to SATA bridge chip that's having problems.
We might have the same bridge in our firewire drive. These external drives have USB, eSATA, and FW for
connection options. I wouldn't be surprised if they're translating everything to one protocol before sending
it to the drive. I'd have to take it apart to find out which bridge it's using.

Mark Lord's observation is proving correct so far
http://www.spinics.net/lists/hotplug/msg03610.html

Issuing a 12 byte IDENTIFY DEVICE to the drive is proving successful.

root@ubuntu:~# lsscsi
[0:0:0:0] disk ATA Hitachi HTS54502 PB2O /dev/sda
[1:0:0:0] cd/dvd Optiarc DVD+-RW AD-7585H KD03 /dev/sr0
[28:0:0:0] disk WD My Book 1028 /dev/sdb
[29:0:1:0] enclosu WD My Book Device -

root@ubuntu:~# sg_sat_identify --len 12 /dev/sdb
Response for IDENTIFY DEVICE ATA command:
 00 427a 3fff c837 0010 0000 0000 003f 0000 Bz ?. .7 .. .. .. .? ..
 08 0000 0000 2020 2020 2057 442d 5743 4154 .. .. W D- WC AT
 10 3134 3139 3230 3639 0000 4000 0032 3031 14 19 20 69 .. @. .2 01
 18 2e30 3341 3031 5744 4320 5744 3332 3030 .0 3A 01 WD C WD 32 00
 20 4141 4a53 2d30 3042 3441 3020 2020 2020 AA JS -0 0B 4A 0
 28 2020 2020 2020 2020 2020 2020 2020 8010 ..
 30 0000 2f00 4001 0000 0000 0007 3fff 0010 .. /. @. .. .. .. ?. ..
 38 003f fc10 00fb 0101 ffff 0fff 0000 0007 .? .. .. .. .. .. .. ..
 40 0003 0078 0078 0078 0078 ...


Colin Watson (cjwatson) wrote :

Pursuant to Peter's comments, does the attached patch do the job? It should disable the default APM policy for these devices when running from udev, without interfering with other uses of hdparm.conf or manual runs of hdparm.

(I realise that this probably doesn't fix everything, but at this point I would prefer a minimal fix that addresses problems found on default installations.)

Jerone Young (jerone) wrote :

@Colin
      This would only solve the problem for firewire drives(?), or at least just Western Digital. Also, as mentioned, APM is not the only issue .. just one of many. I would suggest not running hdparm at all for firewire & usb drives .. as my patch to udev does; this ensures full compatibility.

      Since the kernel is setting everything up correctly .. then hdparm comes in and messes these drives up. Until the root cause for all the issues can be solved, it's best not to run hdparm on removable drives.

       Peter is doing a great job getting closer to the root causes.

@Colin

That's a good compromise, it documents the issue, and it works well. I thought
 I had found a glimmer of hope in the hdparm manpage.

       --prefer-ata12
              When using the SAT (SCSI ATA Translation) protocol, hdparm nor‐
              mally prefers to use the 16-byte command format whenever possi‐
              ble. But some USB drive enclosures don't work correctly with
              16-byte commands. This flag can be used to force use of the
              smaller 12-byte command format with such drives. hdparm will
              still revert to 16-byte commands for things that cannot be done
              with the 12-byte format (eg. sector accesses beyond 28-bits).

But it doesn't live up to its claims; what we need is "force-ata12", not "prefer".

The next version of hdparm has yet to be officially released. We should
also consider adding a kernel quirk for this drive wrt 16 byte ID cmds.

Thanks for testing. I'll go ahead and upload this, then, pending future
hdparm/kernel improvements which are presumably unlikely to make 10.04
LTS at this point.

Colin Watson (cjwatson) wrote :

On Fri, Apr 16, 2010 at 02:11:27PM -0000, Jerone Young wrote:
> This would only solve the problem for firewire drives(?) or at
> lest just Western digital.

Why just Western Digital? There's nothing WD-specific here.

> Also as mentioned APM is not the only issue .. just one of many.

As also mentioned, APM is the only one that's configured on by default
(for other excellent reasons - there was an EXTREMELY long bug about
this a while back). I think it's not unreasonable to assume that
explicit user configuration should override our WAGs?

> Peter is doing a great job getting closer to the root causes.

Peter seems to agree with my approach, so I'm going to go with this for
now. This does not at all preclude further root-cause work, of course.

@Colin
      Not sure if you saw the duplicate bugs of this bug. But BlackBerry phones & other USB drives are also affected by hdparm. Users have reported that the workaround I gave fixed it for them. These seem unrelated to APM.

Martin Pitt (pitti) wrote :

For the record, this was just discussed in #u-release:

pitti cjwatson: it looks like you just remove the -B option, while Jerone's change disables hdparm altogether (hdparm-functions has other bits); but I believe -B is the only thing we actually ship by default
cjwatson that was my assessment from reading the code
cjwatson double-checking wouldn't hurt
pitti cjwatson: it seems that the bug doesn't happen on that particular operation, but on the initial IDENTIFY DEVICE
pitti (incidentally that's very similar to the recent udisks-probe-ata-smart bug)
pitti cjwatson: so from what I can see, if a user customized /etc/hdparm.conf to set options other than -B, those would again be run for firewire drives, too, and again break them?
pitti (conversely, with Jerone's variant you couldn't ever run custom hdparm commands on any firewire drive)
cjwatson I think so
cjwatson though hdparm.conf can be disk-specific
pitti sounds like being between a rock and a hard place :(
pitti ah, it can?
pitti ok, then I prefer your fix indeed :)
pitti ah, right, these come in /dev/foo { } blocks
cjwatson such is my understanding
cjwatson a release note mightn't hurt

I don't think that either of the two approaches is "obviously" right, but I think Colin's approach is more flexible for actually being able to control some fw drives with hdparm (with customized configuration), and it will stop the corruption on a default installation.
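The per-device blocks mentioned in the IRC discussion look roughly like this (a sketch based on the discussion; the device path is the My Book symlink from the udev db earlier in this bug, and the option value is illustrative):

```
# /etc/hdparm.conf: options inside a /dev/... { } block apply only to
# that device, so a specific firewire drive can still be tuned
# explicitly even though the default APM policy now skips such drives.
/dev/disk/by-id/ieee1394-0090a9236e69e657:0:0 {
    apm = 254
}
```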

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hdparm - 9.15-1ubuntu8

---------------
hdparm (9.15-1ubuntu8) lucid; urgency=low

  * Don't apply default APM policy to Firewire or USB devices when running
    from udev (LP: #515023, #548513).
 -- Colin Watson <email address hidden> Fri, 16 Apr 2010 16:19:29 +0100

Changed in hdparm (Ubuntu Lucid):
status: Confirmed → Fix Released
Martin Pitt (pitti) wrote :

With the workaround in place, the linux task isn't release-critical for lucid any more (and not realistic in the slightest either).

Still keeping it open, since the hdparm fix is just a workaround.

Changed in linux (Ubuntu Lucid):
importance: Critical → Medium
status: Triaged → Won't Fix
summary: - Some firewire & usb disks not working under 10.04
+ hdparm's IDENTIFY DEVICE command breaks firewire devices

On Fri, Apr 16, 2010 at 03:56:02PM -0000, Jerone Young wrote:
> Not sure if you saw the duplicate bugs of this bug. But BlackBerry
> phones & other USB drives are also affected by hdparm. Users have
> reported that the workaround I gave fixed it for them.

My workaround checks the same device types as yours does.

> These seem unrelated to APM.

I think we're talking at cross-purposes here. The point is not that APM
is the thing that breaks, but rather that APM is the only thing that is
enabled by default in Ubuntu's hdparm. If we cause the default APM
policy not to be applied for these device types, then hdparm will not be
run for those device types by default because there'll be no options to
pass to it.

The reason I think it's better to do it this way is that it means that a
user who has some specific reason to configure hdparm in a particular
way, for a device type that happens to be covered by this udev rule but
doesn't actually break, has a way to do so that isn't completely
obscure.

Does that make more sense now?
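The mechanism described here (no options means hdparm is never run) can be sketched like this, with `hdparm_options` as a simplified, hypothetical stand-in for the real function in /lib/hdparm/hdparm-functions:

```shell
#!/bin/sh
# Sketch: the udev hdparm helper only execs hdparm when the config
# yields options. With the fix, firewire/usb devices get no default
# -B option, so OPTIONS is empty and hdparm is never run for them.
# hdparm_options here is a stand-in, not the real implementation.
hdparm_options() {
    case "$1" in
        pci-*-ieee1394-*|pci-*-usb-*) ;;   # no default options
        *) echo "-B254" ;;                 # default APM policy
    esac
}

OPTIONS=$(hdparm_options "pci-0000:0d:00.3-ieee1394-0x0090a9236e69e657:0:0")
if [ -n "$OPTIONS" ]; then
    echo "run: hdparm $OPTIONS"
else
    echo "hdparm not run"
fi
```

For the firewire ID_PATH above this prints `hdparm not run`; a plain SATA ID_PATH would yield `-B254` and hdparm would be invoked.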

Jerone Young (jerone) wrote :

@Colin
                That makes sense.

Jerone Young (jerone) on 2010-04-18
Changed in oem-priority:
status: In Progress → Fix Released
No_OnE (no-one) wrote :

Still having the same problem with a WD MyBook 1TB/FireWire(/Single disk).

I'm running Kubuntu Lucid i686, 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:28:05 UTC 2010 x86_64 GNU/Linux.

sudo dpkg -l | grep hdparm tells me the fix should be in place:
ii hdparm 9.15-1ubuntu9 tune hard disk parameters for high performan

I've tried blacklisting either FW stack one at a time, moving /lib/udev/rules.d/85-hdparm.rules, changing them according to
https://bugs.launchpad.net/oem-priority/+bug/548513/comments/20, and every combination of the aforementioned "solutions".

Hardware-wise I'm running an Asus P6T Deluxe V2, Intel i7 920, Kingston DDR3 1333 / 5* 2gb ECC, and an Nvidia GTX 260.

FireWire was fine under Karmic i686, 2.6.32-21-generic until last night when I decided to do a clean install of lucid.

Please advise.

No_OnE (no-one) wrote :

lsscsi seems to find the drive though:

The first (WD) drive is a WD MyBook World 2TB configured as RAID1.
The second one, I suppose, is the WD MyBook 1TB FW-only, which is a single-disk model, fully encrypted at the partition level.

I'll probably dismantle the device and attach the drive to internal sata in order to avoid falling back to Karmic, which I "ran over" with the clean install.

sudo lsscsi
[1:0:0:0] cd/dvd Optiarc DVD RW AD-5240S 1.03 /dev/sr0
[1:0:1:0] disk ATA WDC WD5000AAKS-0 12.0 /dev/sda
[6:0:0:0] disk WD My Book 1012 /dev/sdb
[6:0:0:1] enclosu WD My Book Device 1012 -

@No_OnE

I'm hoping there isn't some packaging issue between kubuntu
and ubuntu. I have the same version of hdparm as you quoted running
lucid/ubuntu and have verified that the workaround is indeed here.

$ grep hdparm_try_apm /lib/hdparm/hdparm-functions
hdparm_try_apm()
         if hdparm_try_apm "$WANTED_DISK"; then

If it's not returning one, or it's not there, that would explain
why you're seeing the hang. Could you please verify that these
bits are indeed there and if so, please apply the following
modifications and retest so we can see what's going on.

# diff -u /lib/hdparm/hdparm-functions.orig /lib/hdparm/hdparm-functions
--- /lib/hdparm/hdparm-functions.orig 2010-05-13 09:12:43.034523157 -0400
+++ /lib/hdparm/hdparm-functions 2010-05-13 09:16:21.603306128 -0400
@@ -49,13 +49,17 @@
  {
      # set our default global apm policy here.
      if [ -z "$ID_PATH" ]; then
+ logger -p info -t XXX "device id_path ${ID_PATH}"
          local ID_PATH="$(udevadm info -n "$1" -q property 2>/dev/null | sed -n 's/^ID_PATH=//p')" || true
      fi
      case $ID_PATH in
          pci-*-ieee1394-*|pci-*-usb-*)
+ logger -p info -t XXX "hdparm_try_apm returning 1"
              return 1
              ;;
      esac
+
+ logger -p info -t XXX "hdparm_try_apm returning 0"
      return 0
  }

Just grep for XXX in /var/log/messages and post the results. Thanks.

Peter

Discovery isn't an issue, it's this particular firewire bridge
that gets lobotomized when it receives the larger ident device cmd,
because it fails to implement the spec correctly.

If you run lsscsi -l you'll find that the block device state is
offline.

No_OnE (no-one) wrote :

I'll try compiling hdparm with those changes shortly. Just an observation: I put the disk in a more flexible (connection-wise) WD external controller and used eSATA, which, from what I've read on bugs.launchpad.net, should work out of the box. Well, it does not.
I'm beginning to suspect that the full-disk encryption on the disk has something to do with the lack of discovery.

I'll probably rescue the data next under a different OS and see if the disk can be made to work with the patch above. And next time, I'll use an encrypted container like I've done with the other WD (which is a 2TB Mirror, even though I have a 2TB World as well).

PS. Funny, WD no longer lists the exact model of the drive I'm having problems with. Neither do they list the one I moved to...

bg (boterog) wrote :

This worked for me:

Jerone's patch was the solution for my eSATA external hard disk.

So changing /lib/udev/rules.d/85-hdparm.rules to:
ACTION=="add", SUBSYSTEM=="block", KERNEL=="[sh]d[a-z]", \
 ENV{ID_PATH}!="pci-*-ieee1394-*|pci-*-usb-*", \
 RUN+="/lib/udev/hdparm"

but the problem continued with my USB external disk.
I found a solution for this by putting the parameter acpi=off at the end of the kernel options:

/boot/grub/grub.cfg

menuentry "Ubuntu 10.04 LTS (10.04) (on /dev/sdb5)" {
 insmod ext2
 set root='(hd1,5)'
 search --no-floppy --fs-uuid --set de5a18c4-1a61-4f0b-aab0-941623a0cb01
 linux /boot/vmlinuz-2.6.32-22-generic-pae root=/dev/sdb5 ro quiet splash acpi=off
 initrd /boot/initrd.img-2.6.32-22-generic-pae
}

https://bugs.launchpad.net/ubuntu/+bug/575646/comments/6

[0:0:0:0] disk WDC WD16 00JS-22MHB0 1C03 /dev/sdc
[1:0:0:0] disk ATA WDC WD3200BEKT-6 12.0 /dev/sda
[2:0:0:0] cd/dvd hp CDDVDW TS-L633M 0301 /dev/sr0
[6:0:0:0] disk ATA ST3500320AS SD15 /dev/sdb

uglyjunkmail (uglyjunkmail) wrote :

I've never signed into a site like this before. I guess I'm desperate. I cannot control my professional DVCAM DV camera, a Sony DSR-PDX10p, over the firewire connection. I am using a Toshiba Satellite Pro computer (I know, but it works well). I know it is not the PC or the camera, because I booted into Windows XP (the PC dual-boots Lucid Lynx 10.04 and Windows) and the camera works perfectly under Windows. I like Ubuntu a lot and want to leave Windows. All I do is edit film and animation, and Ubuntu has been great for this. Great design, and I am more creative with the superbly written software. I hope someone can solve the bug and cheer me up. Thx.

Jeremy Foshee (jeremyfoshee) wrote :

I'm interested to know if this is still an issue. I see that there have been several fixes for this, but I wonder about Peter Petrakis' last comment that there may be something needed for this particular device. Let me know so we can triage and work this appropriately. If you could test against the latest Natty, that would be helpful.

Thanks!

~JFo

tags: removed: regression-potential
Brad Figg (brad-figg) wrote :

Are people still seeing this issue with Oneiric or later installs?

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Tim Gardner (timg-tpi) on 2012-10-02
Changed in linux (Ubuntu):
status: Incomplete → Invalid