Hard disk I/O randomly freezes when hald is running and optical drive is empty

Reported by Keenan Pepper on 2007-02-11
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
Linux | Invalid | High | |
linux (Ubuntu) | | Undecided | Unassigned |
linux-source-2.6.20 (BOSS) | | Undecided | Unassigned |
linux-source-2.6.20 (Ubuntu) | | Medium | Unassigned |
linux-source-2.6.22 (Ubuntu) | | Medium | Unassigned |

Bug Description

Binary package hint: hal

When I upgraded the kernel on my System76 Gazelle (basically an ASUS Z62FP without the Microsoft tax) from 2.6.17 to 2.6.20, the hard disk began freezing for 30 seconds every few minutes whenever the CD/DVD drive was empty. When there is a disc in the optical drive, the freezes occur much less often, but I'm sure there's been at least one even with a CD in.

The relevant part of the dmesg is:

[ 188.960000] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 188.960000] ata1.01: cmd a0/00:00:00:00:20/00:00:00:00:00/b0 tag 0 cdb 0x0 data 0
[ 188.960000] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
[ 195.964000] ata1: port is slow to respond, please be patient (Status 0xd0)
[ 218.980000] ata1: port failed to respond (30 secs, Status 0xd0)
[ 218.980000] ata1: soft resetting port
[ 219.332000] ata1.00: configured for UDMA/100
[ 219.516000] ata1.01: configured for UDMA/33
[ 219.516000] ata1: EH complete
[ 219.532000] SCSI device sda: 78140160 512-byte hdwr sectors (40008 MB)
[ 219.540000] sda: Write Protect is off
[ 219.540000] sda: Mode Sense: 00 3a 00 00
[ 219.900000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 219.904000] SCSI device sda: 78140160 512-byte hdwr sectors (40008 MB)
[ 219.904000] sda: Write Protect is off
[ 219.904000] sda: Mode Sense: 00 3a 00 00
[ 219.908000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

I'm filing this bug against HAL because kernel developer Tejun Heo says HAL is poking the CD/DVD drive and confusing it, and indeed when I kill hald the problem goes away. On the other hand, I only noticed the problem after upgrading the kernel, and when I force the old ide_generic driver to be used by blacklisting ata_piix, the problem also goes away, so maybe it should be filed against the kernel package instead.

dmesg and lspci -vvx attached
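
Editor's note: the first byte of the "cmd" field in the libata lines above is the ATA command opcode, which hints at which kind of device timed out. For instance, a0 is the PACKET command used to talk to ATAPI devices such as optical drives. Below is a minimal shell sketch of a helper that names the opcode. The function name ata_cmd_name is made up for illustration, and only a handful of opcodes are covered.

```shell
# Name the ATA command opcode found in a libata "cmd xx/..." log line.
# Hypothetical helper: only a few common opcodes are listed, and anything
# else (or a line without a "cmd xx/" field) is reported as "unknown".
ata_cmd_name() {
    opcode=$(printf '%s\n' "$1" | sed -n 's/.*cmd \([0-9a-f][0-9a-f]\)\/.*/\1/p')
    case "$opcode" in
        20) echo "READ SECTOR(S)" ;;
        a0) echo "PACKET (ATAPI)" ;;
        c8) echo "READ DMA" ;;
        ca) echo "WRITE DMA" ;;
        *)  echo "unknown" ;;
    esac
}
```

A timeout on PACKET points at the optical drive, while a timeout on READ DMA or WRITE DMA points at a hard disk request.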

Keenan Pepper (keenanpepper) wrote :

I tried running hald in verbose mode with "hald --daemon=yes --verbose=yes --use-syslog", but no messages from hald appear in /var/log/syslog at the same time as the freeze. There are plenty of messages when hald starts up, but then it's silent when the freeze actually happens.

Keenan Pepper (keenanpepper) wrote :

Here's the output of "lshal" though, that might be useful.

Hi,
there's a process hald-addon-storage which, surprise, polls removable media. You can try to kill the one associated with your CD/DVD drive. But as you said it's surely a kernel bug and hald only triggers it.
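
Editor's note: the suggestion above can be sketched in shell as a helper that picks out the poller for one drive. Assumptions: the poller's process title names the device it polls (something like "hald-addon-storage: polling /dev/scd0"), the path /dev/scd0 is only an example, and find_storage_pollers is a hypothetical name.

```shell
# Print the PIDs of hald-addon-storage pollers that mention a given device.
# Reads "PID COMMAND..." lines on stdin, e.g. from `ps -eo pid,args`.
# Assumption: the command line or process title contains the device path;
# check your own ps output before relying on this.
find_storage_pollers() {
    device="$1"
    awk -v dev="$device" '
        # Match on the command name itself so that unrelated processes
        # which merely mention the poller in their arguments are skipped.
        ($2 ~ /^hald-addon-storage/ || $2 ~ /\/hald-addon-storage/) &&
        index($0, dev) { print $1 }
    '
}
```

Typical use would be "ps -eo pid,args | find_storage_pollers /dev/scd0", followed by a manual kill of the PID it prints. The poller will likely come back when hald is restarted.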

Kevin P (kevin-cybercolloids) wrote :

I have seen a similar issue with my up-to-date Feisty installation (AMD64, two SATA drives). The problem appears to be with a new 250GB Maxtor; an older 80GB Samsung Spinpoint seems OK.

uname = Linux kryton 2.6.20-5-generic #2 SMP Sat Jan 6 09:44:32 UTC 2007 x86_64 GNU/Linux

dmesg = [ 1410.976979] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1419.402905] ata1.00: limiting speed to PIO0
[ 1419.402910] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 1419.402917] ata1.00: cmd 20/00:68:86:20:6f/00:00:00:00:00/e2 tag 0 cdb 0x0 data 53248 in
[ 1419.402919] res 50/01:01:01:00:00/01:00:00:00:00/00 Emask 0x202 (HSM violation)

I also get
[ 236.863319] ata1.00: limiting speed to UDMA/100
[ 236.863323] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 236.863580] ata1.00: cmd ca/00:08:76:3a:b0/00:00:00:00:00/e3 tag 0 cdb 0x0 data 4096 out
[ 236.863582] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 236.863909] ata1: soft resetting port
[ 237.035354] ata1.00: configured for UDMA/100

Florian Schmid (annaeus) wrote :

Same here on Feisty after upgrading to the latest kernel, 2.6.20-13-generic:

[ 5249.792000] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
[ 5249.792000] sda: Write Protect is off
[ 5249.792000] sda: Mode Sense: 00 3a 00 00
[ 5249.792000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5519.372000] ata1.01: qc timeout (cmd 0xa0)
[ 5519.372000] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 5519.372000] ata1.01: cmd a0/00:00:00:00:20/00:00:00:00:00/b0 tag 0 cdb 0x0 data 0
[ 5519.372000] res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x5 (timeout)
[ 5526.372000] ata1: port is slow to respond, please be patient (Status 0xd1)
[ 5549.388000] ata1: port failed to respond (30 secs, Status 0xd1)
[ 5549.388000] ata1: soft resetting port
[ 5549.740000] ata1.00: configured for UDMA/33
[ 5549.920000] ata1.01: configured for UDMA/33
[ 5549.920000] ata1: EH complete

My problem began after a Feisty update as well.

Try booting into recovery mode and running e2fsck on each partition. Then add the kernel options acpi=off pci=bios. That seems to have got my system working again. I did some tests last night and could boot with acpi=off, but without the option I had problems. I have also noticed some hard disk corruption now as well. The same disk worked with no problems when I booted using an old Knoppix disk. So I conclude there is a problem somewhere between the libata driver and the hardware.

Kevin P (kevin-cybercolloids) wrote :

Ran some tests last night:

1. Adding acpi=off as a boot parameter appears to make the system more stable. I tried various BIOS and kernel parameter options, but the best combination was leaving the BIOS at its default settings and adding acpi=off to the kernel command line. This could be a red herring, as the problem seems to be intermittent anyway.

2. I only get problems with a 250GB Maxtor MaxLine III drive. An 80GB Samsung Spinpoint works with no issues.

3. Booting the system from an old Knoppix disk I got no issues. The issue appears to lie with how recent kernels interact with the maxline disk.

4. I am seeing some corruption and had to e2fsck my root partition.

[ 1565.181909] ata4.00: speed down requested but no transfer mode left
[ 1565.181914] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 1565.181920] ata4.00: cmd 20/00:80:46:a4:76/00:00:00:00:00/e2 tag 0 cdb 0x0 data 65536 in
[ 1565.181922] res 50/01:01:01:00:00/01:00:00:00:00/00 Emask 0x202 (HSM violation)
[ 1565.181932] ata4: soft resetting port
[ 1565.266976] ATA: abnormal status 0x7F on port 0xD007
[ 1565.272358] ATA: abnormal status 0x7F on port 0xD007
[ 1565.285001] ata4.00: configured for PIO0
[ 1565.285008] ata4: EH complete
[ 1565.306696] SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
[ 1565.307302] sda: Write Protect is off
[ 1565.307304] sda: Mode Sense: 00 3a 00 00
[ 1565.308249] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Joey Stanford (joey) wrote :

Confirming this bug. It's happening on my laptop as well. I passed this to Kyle.

Changed in linux-source-2.6.20:
status: Unconfirmed → Confirmed
Ben Collins (ben-collins) wrote :

Please let the devs confirm bugs, or read the policies at https://wiki.ubuntu.com/KernelTeamBugPolicies for how to correctly handle kernel bugs.

Thanks

Changed in linux-source-2.6.20:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
Marc Tardif (cr3) wrote :

I am experiencing the same problem on a System76 Darter Z35F running Feisty beta installed from the alternate CD. I am attaching information from the machine below.

Kevin P (kevin-cybercolloids) wrote :

After the previous editor's comment about optical disks, I have run my machine with a CD in both CD drives. However, I triggered another 30s halt by plugging in a USB camera. Here is the dmesg.

[ 123.506818] usb 4-1: new full speed USB device using uhci_hcd and address 2
[ 123.691727] usb 4-1: configuration #1 chosen from 1 choice
[ 141.849608] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 141.849642] ata1.00: cmd c8/00:10:fe:1b:76/00:00:00:00:00/e2 tag 0 cdb 0x0 data 8192 in
[ 141.849644] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 145.349321] ata1: port is slow to respond, please be patient (Status 0xd0)
[ 156.865767] ata1: port failed to respond (30 secs, Status 0xd0)
[ 156.865797] ata1: soft resetting port
[ 156.949132] ATA: abnormal status 0x7F on port 0x000000000001d007
[ 156.954547] ATA: abnormal status 0x7F on port 0x000000000001d007
[ 156.966548] ata1.00: configured for UDMA/133
[ 156.966555] ata1: EH complete
[ 156.983921] SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
[ 156.983974] sda: Write Protect is off
[ 156.983976] sda: Mode Sense: 00 3a 00 00
[ 156.985481] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Curtis Hovey (sinzui) wrote :

Well, I can see I'm in good company here. My System76 Pangolin began throwing 'exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen' in the log, followed by a 'soft resetting port' message, a few weeks ago. This last upgrade appears to have made the problem worse.

Curtis Hovey (sinzui) wrote :

Linux autumn.annrky-sinzui.local 2.6.20-13-generic #2 SMP Sun Mar 25 00:21:25 UTC 2007 i686 GNU/Linux

Kevin P (kevin-cybercolloids) wrote :

Some notes from my testing.

DMESG after a reboot - The system locks 6 times, here are the lock ups plus what immediately precedes them.

[ 42.595211] input: USB HID v1.00 Mouse [Microsoft Microsoft Wheel Mouse Optical®] on usb-0000:00:10.1-2
[ 42.595225] usbcore: registered new interface driver usbhid
[ 42.595228] drivers/usb/input/hid-core.c: v2.6:USB HID core driver
[ 68.032891] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

[ 106.221210] skge eth0: enabling interface
[ 107.888688] skge eth0: Link is up at 100 Mbps, half duplex, flow control none
[ 136.306699] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

[ 170.722289] lp0: using parport0 (polling).
[ 170.763665] ieee1394: Initialized config rom entry `ip1394'
[ 201.006966] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

[ 201.260754] EXT3 FS on sda3, internal journal
[ 231.478370] ata1.00: limiting speed to UDMA/100:PIO4
[ 231.478375] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

[10889.413852] ISO 9660 Extensions: Microsoft Joliet Level 3
[10889.415648] ISO 9660 Extensions: RRIP_1991A
[10917.763403] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

[10947.987814] SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
[10948.003947] sda: Write Protect is off
[10948.003950] sda: Mode Sense: 00 3a 00 00
[10948.024838] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[10983.654841] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
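
Editor's note: listings like the one above can be produced mechanically. Here is a rough shell sketch, assuming the kernel log is piped in on stdin (for example from dmesg). The name freeze_context is hypothetical.

```shell
# Print every libata "exception ... frozen" line together with the line
# that immediately preceded it, then a total count.
# Reads kernel log text on stdin.
freeze_context() {
    awk '
        /ata[0-9]+(\.[0-9]+)?: exception Emask .* frozen/ {
            if (prev != "") print prev   # what happened just before
            print                        # the freeze line itself
            print "--"                   # separator between events
            count++
        }
        { prev = $0 }
        END { printf "%d freeze(s) found\n", count }
    '
}
```

For example, "dmesg | freeze_context" would show each freeze with one line of context, which makes it easier to see whether the freezes follow any particular event.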

Alan Ferrier (alan-ferrier) wrote :

Workaround is to kill the hald-addon-storage process

Scott Henson (scotth) wrote :

I'd like to add a nice little me too to this bug. I have a System76 Gazelle as well. I can say that 2.6.20-11-generic worked fine, while -12 and -13 exhibit the behavior above, though I believe -11 uses the older non-libata driver. For now I have -11 set to be my primary kernel, and I boot into the latest whenever I see an update to check if it's fixed.

I also tried the CD-in-drive thing and it seemed to work for me. Once I took the CD out, I got a freeze within a few minutes.

Another me too on a Fujitsu-Siemens Amilo M7440G
Dmesg:
[ 2817.976000] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 2817.976000] ata3.00: cmd c8/00:80:d7:58:44/00:00:00:00:00/e8 tag 0 cdb 0x0 data 65536 in
[ 2817.976000] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 2824.976000] ata3: port is slow to respond, please be patient (Status 0xd0)
[ 2847.992000] ata3: port failed to respond (30 secs, Status 0xd0)
[ 2847.992000] ata3: soft resetting port
[ 2848.172000] ata3.00: configured for UDMA/100
[ 2848.172000] ata3: EH complete
[ 2848.604000] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
[ 2848.604000] sda: Write Protect is off
[ 2848.604000] sda: Mode Sense: 00 3a 00 00
[ 2848.604000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Alan Ferrier (alan-ferrier) wrote :

It would appear that this bug still exists in kernel 2.6.20-14-generic

Apr 5 18:56:35 localhost kernel: [ 2880.464000] ata1.01: qc timeout (cmd 0xa0)
Apr 5 18:56:35 localhost kernel: [ 2880.464000] res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x5 (timeout)
Apr 5 18:56:42 localhost kernel: [ 2887.468000] ata1: port is slow to respond, please be patient (Status 0xd0)
Apr 5 18:57:05 localhost kernel: [ 2910.484000] ata1: soft resetting port
Apr 5 18:57:06 localhost kernel: [ 2910.844000] ata1.00: configured for UDMA/100
Apr 5 18:57:06 localhost kernel: [ 2911.024000] ata1.01: configured for UDMA/33
Apr 5 18:57:06 localhost kernel: [ 2911.024000] ata1: EH complete
Apr 5 18:57:06 localhost kernel: [ 2911.032000] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
Apr 5 18:57:06 localhost kernel: [ 2911.044000] sda: Write Protect is off
Apr 5 18:57:06 localhost kernel: [ 2911.064000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 5 18:57:06 localhost kernel: [ 2911.084000] SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
Apr 5 18:57:06 localhost kernel: [ 2911.084000] sda: Write Protect is off
Apr 5 18:57:06 localhost kernel: [ 2911.088000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Scott Henson (scotth) wrote :

2.6.20-15-generic exhibits the same errors

[ 283.180000] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 283.180000] ata1.01: cmd a0/00:00:00:00:20/00:00:00:00:00/b0 tag 0 cdb 0x1e data 0
[ 283.180000] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
[ 290.184000] ata1: port is slow to respond, please be patient (Status 0xd0)
[ 313.200000] ata1: port failed to respond (30 secs, Status 0xd0)
[ 313.200000] ata1: soft resetting port
[ 313.544000] ata1.00: ata_hpa_resize 1: sectors = 78140160, hpa_sectors = 78140160
[ 313.552000] ata1.00: ata_hpa_resize 1: sectors = 78140160, hpa_sectors = 78140160
[ 313.552000] ata1.00: configured for UDMA/100
[ 313.732000] ata1.01: configured for UDMA/33
[ 313.732000] ata1: EH complete
[ 313.740000] SCSI device sda: 78140160 512-byte hdwr sectors (40008 MB)
[ 313.740000] sda: Write Protect is off
[ 313.740000] sda: Mode Sense: 00 3a 00 00
[ 314.156000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 314.160000] SCSI device sda: 78140160 512-byte hdwr sectors (40008 MB)
[ 314.160000] sda: Write Protect is off
[ 314.160000] sda: Mode Sense: 00 3a 00 00
[ 314.160000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Marc Tardif (cr3) wrote :

The same problem still occurs after installing 20070415.

Marc Tardif (cr3) wrote :

Here are the steps to work around the problem during the installation process:

  1. Add break=top to the kernel cmdline
  2. At the prompt, run: modprobe piix
  3. Then exit

And here are the steps to make the workaround permanent after an installation:

  echo blacklist ata_piix | sudo tee -a /etc/modprobe.d/blacklist-ata
  echo piix | sudo tee -a /etc/initramfs-tools/modules
  sudo update-initramfs -u
  sudo reboot

The problem should be fixed for the first SRU release of Feisty.
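
Editor's note: after rebooting, one way to check which of the two drivers actually got loaded is to look at the lsmod output. This is a small shell sketch, assuming both drivers are built as modules (a built-in driver would not be listed). The name piix_driver_in_use is hypothetical.

```shell
# Report whether the libata driver (ata_piix) or the old IDE driver (piix)
# shows up in lsmod output supplied on stdin.
# Assumption: the drivers are modules; built-in drivers never appear in lsmod.
piix_driver_in_use() {
    awk '
        $1 == "ata_piix" { print "ata_piix (libata driver)"; found = 1 }
        $1 == "piix"     { print "piix (old IDE driver)";    found = 1 }
        END { if (!found) print "neither ata_piix nor piix is loaded as a module" }
    '
}
```

It would be used as "lsmod | piix_driver_in_use". If it still reports ata_piix after the steps above, the blacklist entry or the initramfs update did not take effect.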

Tried Marc's workaround, but now I can't boot...

Carl Richell (carlrichell) wrote :

VERY IMPORTANT: The workaround at the bottom of the report is only for NEW INSTALLATIONS. If the workaround is used on an existing install, your system will not boot!

OK... I'm very interested in this workaround, but I don't understand when to do what...

What does "during the installation process" mean? Do I have to boot from the live CD and follow the first three steps?

And what about the remaining four steps? When do I have to follow them?

Thanks for the patience

Keenan Pepper (keenanpepper) wrote :

Carl, I did pretty much the same thing on my existing installation and it works fine. It's just using the old piix driver instead of the new ata-piix driver. What's supposed to be the problem?

holycow (mik-mars) wrote :

confirmed, i posted in the wrong bug report.

same issue, kernel 2.6.20-15-generic on an asus z96f laptop.

i won't be trying the workarounds, i'll wait for a fix.

thx for the heads up on this bug.

Goldenear (goldenear) wrote :

Same bug here on my asus a8jc. (feisty kernel 2.6.20-15.27)

[220073.844000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
[220073.844000] sda: Write Protect is off
[220073.844000] sda: Mode Sense: 00 3a 00 00
[220073.848000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[220643.280000] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[220643.280000] ata1.01: cmd a0/01:00:00:00:00/00:00:00:00:00/b0 tag 0 cdb 0x25 data 8 in
[220643.280000] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
[220650.284000] ata1: port is slow to respond, please be patient (Status 0xd0)
[220673.300000] ata1: port failed to respond (30 secs, Status 0xd0)
[220673.300000] ata1: soft resetting port
[220673.648000] ata1.00: ata_hpa_resize 1: sectors = 195371568, hpa_sectors = 195371568
[220673.656000] ata1.00: ata_hpa_resize 1: sectors = 195371568, hpa_sectors = 195371568
[220673.660000] ata1.00: configured for UDMA/100
[220673.844000] ata1.01: configured for UDMA/25
[220673.844000] ata1: EH complete
[220673.848000] SCSI device sda: 195371568 512-byte hdwr sectors (100030 MB)
[220673.848000] sda: Write Protect is off
[220673.848000] sda: Mode Sense: 00 3a 00 00
[220673.848000] SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Carl Richell (carlrichell) wrote :

More of a precaution than anything else. We wanted to test across more machines before customers started using the fix (we had mixed reports). We're putting together a .deb to automatically apply the fix and will push it down through our repo.

Scott Henson (scotth) wrote :

I'm curious where this deb is. I consider myself to be a reasonably clueful user/developer and I'd love to provide some testing of the deb if at all possible.

holycow (mik-mars) wrote :

same here.

how do i access the deb to test this out?

kshitiz (kshitiz-saxena) wrote :

I am also facing same issue on connoi laptop. Hopefully we will get the patch soon.

Changed in linux:
status: Unknown → Rejected
Changed in linux:
status: Unknown → Confirmed
Changed in linux:
status: Confirmed → In Progress
Changed in linux:
status: In Progress → Incomplete
Changed in linux-source-2.6.20:
status: Confirmed → Triaged
Changed in linux:
status: Incomplete → Confirmed
Changed in linux-source-2.6.22:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: New → Triaged
Papamatti (matti-lx) wrote :

I have the same bug as described in the bug description, on an nVidia 590 based mainboard (ASUS M2N32 WS Professional).
There are two Samsung 250GB SATA hard disks connected, plus one Samsung SH-182M DVD writer on the IDE port of the board.
With this issue I cannot create a working software RAID.
Can I use the same workaround as described above (blacklist ata_piix), or is this for Intel boards only?

K (kkumar) wrote :

I have this problem with my new SATA disk. The OS runs on an ATA disk, so Ubuntu itself runs fine, but MythTV and all my other media write to the SATA disk, and it keeps freezing. My motherboard and hard drive are as follows:
GIGABYTE GA-K8N Pro-SLI 939 NVIDIA nForce4 SLI ATX AMD Motherboard with 1394b
Western Digital SE16 500GB SATA2 7200RPM 16MB Cache 8.9MS NCQ Hard Drive OEM

Papamatti (matti-lx) wrote :

I've solved the bug for me: one of the SATA cables was bad. I've changed all cables to original ASUS cables (which were delivered with my ASUS board).
I didn't believe that a cable could cause so much trouble... Now it has been working for about two weeks, 24 hours a day, without an error.

etrustco:
"I truly hope there's less people affected by these libata issues than it seems, or 7.10 will be a fiasco :-(
...
It is also unfortunate that Ubuntu removed (most of?) the old IDE modules/drivers, thus impeding the option to simply blacklist the (corresponding) libata driver.."

I can confirm this. I'm not even using cutting-edge hardware. Ubuntu 5.10 has fewer disk issues than 7.04 and 7.10; 7.10 is the worst. I get these freeze-up events, and after a lot of Googling and reading I tried to force the system back to the old IDE driver, only to realize that the piix module isn't part of Gutsy any more, and ide-generic doesn't give me DMA :-(

K (kkumar) wrote :

It is a real pain for me.

I have two disks. The first is a 10GB ATA disk loaded with the OS, and it runs fine.

The second is a 500GB SATA disk that holds all my files, media, etc. This disk keeps freezing, which is really annoying when I download a file from the internet, when MythTV is recording automatically, when I'm watching TV, and so on.

I would be glad if someone could tell me how to bring the drive back online without rebooting my machine. dmesg logs attached.

Db0 (db0) wrote :

I can confirm I have something quite similar.

http://ubuntuforums.org/showthread.php?p=3784478

I am going to try playing with the cables a bit and see if I can disable that piix.

suoko (suoko) wrote :

Problem is still there.
As I posted here http://ubuntuforums.org/showthread.php?t=598580, feisty kernel under gutsy + blacklisting some modules is a way to solve this problem, although this causes problems with usb devices (i.e. I can't mount camera anymore although feisty could do it with no problems) and cpu scaling.
I guess we'd need a new gutsy kernel or a customized one.

pdm (patrice-sancey) wrote :

Hi,

This solution is for people who have a TSSTcorp DVD drive. I have a Matshita one.

Wolfram Arnold wrote that the piix module is not in Gutsy. What does that mean? Can't we add piix to /etc/initramfs-tools/modules?

What can we do to make Gutsy (simply) usable?

K (kkumar) wrote :

Would it help to disconnect my DVD drive from the motherboard? I use the DVD drive very rarely.

Laurent (laurent-goujon) wrote :

I'm still using the ide_cd module (sata_nv is disabled on Gutsy) and also have the problem so disabling ata-piix or sata modules won't have any effect.

Igor Lautar (igorl) wrote :

I confirm this bug on HP nc8430 with upgraded drive.

Summary:
I just upgraded the existing drive (Fujitsu MHV2100BH) to a Seagate 7200.2 (ST9200420ASG). The previous drive worked fine with Edgy and Feisty (I did not try it with Gutsy).
After upgrade, I went and installed feisty. The same lockups (as described here) appear at random.

Igor Lautar (igorl) wrote :

Typo in my previous comment:
"After upgrade, I went and installed feisty"
I've actually installed gutsy.

mezhaka (mezhaka) wrote :

I confirm that the problem went away after updating the firmware of the CD drive. It is an ASUS laptop (6000 series) with the aforementioned TSST drive. I used the blah-blahSC04 version of the firmware with the sfdndos utility mentioned above. I tried to make a bootable USB stick but could not boot from it, so I burned a boot CD from http://pioneerdvd.rpc1.org/index.html#BOOTISO

thanks to all of those who contributed.

Kevin P (kevin-cybercolloids) wrote :

A few more notes from testing. A lot of people talking about CD-ROM drives. I have two optical drives, one Samsung and one NEC.

I disconnected the Samsung drive - same problem.

Connect both optical drives plus Samsung SATA hard drive - problem

Connect both optical drives plus Maxtor SATA hard drive - problem

Disconnect SATA hard drive - no problem

From what I can tell, the issue is with the SATA hard drive and not with any of the optical drives. Currently I am using the system every day with both optical drives and an IDE hard drive with no problems. As soon as I reconnect the SATA hard drive - problems.

The SATA hard drive has been used before in an Ubuntu box with no problems - these issues developed sometime around an upgrade to Feisty. Currently I am using Gutsy with exactly the same problems.

Another point - the piix workaround doesn't seem to be the key to the issue - I have a VIA board and I get the problem.

Changed in linux-source-2.6.20:
status: Triaged → Won't Fix
Changed in linux-source-2.6.22:
status: Triaged → Won't Fix
Changed in linux-source-2.6.20:
assignee: nobody → phillip-lougher
status: New → Invalid
arjanhs (arjan-advance) wrote :

I'm having the same problem after updating the Gutsy kernel to 2.6.22-14. After a reboot I got the following errors:

ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.00: cmd c8/00:20:88:d3:d7/00:00:00:00:00/e1 tag 0 cdb 0x0 data 16384 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3: port is slow to respond, please be patient (Status 0xd0)
ata3: device not ready (errno=-16), forcing hardreset
ata3: soft resetting port
ata3.00: configured for UDMA/133
ata3: EH complete
488397168 512-byte hardware sectors (250059 MB)
Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

It's a new drive which I've been using for three months now.

K (kkumar) wrote :

I have the same problem with my SATA disk; it works fine in Windows. I have another question: if the disk fails in the middle of copying, I have to reboot to bring it back online. Is there any way to re-attach the disk without rebooting?

--
With Regards,
Kiran Kamsetti

Luka Renko (lure) wrote :

This is also reproducible with latest Hardy version on HP nw8440 with Seagate disk ST980825AS

Luka Renko (lure) wrote :

Maybe we should implement hal-info quirk that would disable polling on such HW, as it was done for Dell laptops - see bug 48499

Luka,

Care to quickly verify the issue exists against 2.6.24-11 and reattach your dmesg as well as lspci -vvnn output? Thanks.
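
Editor's note: the information requested above can be gathered in one go. This is a hedged shell sketch, not an official Ubuntu procedure. The function name collect_bug_info and the output file names are made up for illustration, and lspci is assumed to come from the pciutils package.

```shell
# Collect the kernel version, dmesg and "lspci -vvnn" output into files
# under the given directory (default: current directory).
# Hypothetical helper; the file names are arbitrary choices.
collect_bug_info() {
    outdir="${1:-.}"
    mkdir -p "$outdir" || return 1
    uname -a > "$outdir/uname-a.log"
    # dmesg may need root on systems that restrict the kernel log.
    dmesg > "$outdir/dmesg.log" 2>&1 || echo "dmesg failed; try with sudo" >&2
    if command -v lspci >/dev/null 2>&1; then
        # Run as root to get the full capability listing.
        lspci -vvnn > "$outdir/lspci-vvnn.log" 2>&1
    else
        echo "lspci not found (usually in the pciutils package)" >&2
    fi
    return 0
}
```

Running "collect_bug_info ~/bug-84603" after booting the kernel under test would leave the files ready to attach to the report.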

Changed in linux:
status: New → Incomplete
richard philippe (rifi58) wrote :

Ubuntu runs normally only when a CD/DVD is inside...

Why the hell isn't this one year old bug marked as critical ?!?

Kevin P (kevin-cybercolloids) wrote :

I agree this bug seems to be serious - it renders your computer useless. A 30s delay in hard drive access is pretty useless in my books. Also the CD in a CD-Drive trick doesn't seem to work for me. The bug seems to be present for both Maxtor and Samsung hard drives I have tried. I have a small network of Ubuntu desktops and servers in our company and currently have an order for new equipment on hold because of this bug. New motherboards tend to be mainly SATA and I am not confident enough to buy a load of new SATA boards only to find they don't run with Ubuntu.

Luka Renko (lure) wrote :

I have noticed that during these hiccups, "htop" shows at least one CPU (in most cases both) fully loaded (100%), even though the top process on the list occupies only about 10%. htop also shows that CPU load in red, so I suspect this is CPU time used in the kernel. I suspect the kernel is spinning on something in this case.

I use amd64 kernel on HP nw8440 laptop with Seagate 7200 RPM disk (ST980825AS)

Can anybody still experiencing this problem test the following possible solution?
http://linux-ata.org/faq.html#combined
The slow down as a result of the PATA/SATA combination may be causing at least some of the above mentioned problems.
This will probably only affect those with Intel chipsets.

Changed in linux:
assignee: nobody → rifi58
Brian Murray (brian-murray) wrote :

I'm unassigning this bug as bugs should only be assigned when someone is working on a fix for the bug and this doesn't seem to be the case.

Changed in linux:
assignee: rifi58 → nobody
Melekai (mgeuken) wrote :

As of today I'm experiencing similar issues. I'm really unsure whether this is the right place, but basically, when I have a SATA drive plugged in, the system won't boot.
Previously, with just one SATA drive and an IDE drive as the main disk, it would boot but I would get random 10-30 second freezes.
When the SATA drive was unplugged, those errors were gone.

When ONLY one SATA drive is plugged in, it will randomly boot the live CD and randomly crash (Hardy 64/beta/RC).
When more than one SATA drive is plugged in, it will never boot; there is always some kind of error and the live environment just won't load.
With any other version on IDE drivers, this does not happen.

Kevin P (kevin-cybercolloids) wrote :

The problem is solved on my machine - I replaced the SATA cabling with better quality Akasa latch cables instead of the cheap cables that come with the motherboards. So far I have installed and run Ubuntu/Hardy and installed Gentoo (on another partition) and compiled X/Gnome with no problems. The machine has been rebooted several times and has not frozen once.

So I can confirm that one possible cause of the error

[ 231.478375] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

could be poor-quality cabling.

Googling around, problems with cheap SATA cabling appear to be very common.

Jochen Garcke (jochen-bugs) wrote :

I also had problems like this on my nc6220 laptop starting with Gutsy. Often when moving the laptop, the system would hang for a bit and activity in the (empty) CD drive could be heard.

Since upgrading to Hardy this seems to be over. There are only very short pauses, if any, when moving the laptop causes the CD drive to be accessed.

weks (na18) wrote :

I also have this problem with my ASUS notebook with an ATA HDD. The PC locks up randomly, and here is the system log. Is there any fix for this?

Changed in linux:
status: Incomplete → Confirmed
kiev1 (sys-sys-admin) wrote :

This is a kernel bug, and the problem has existed for a whole year already.

For me it showed up about once every half hour. However, as a result of this problem I lost a MySQL database: InnoDB would not start ("Assertion error"), and even "innodb_force_recovery = 4" did not help. The backup was a week old, so the whole department lost several days of work. Management is simply in shock, and I am probably going to lose my job (((

This problem has existed for a whole year already:
-----------
I'm stumped trying to track down the below intermittent problem.....
I've confirmed this problem on 2.6.19, 2.6.20 and 2.6.21.
http://lkml.org/lkml/2007/6/14/154
http://kerneltrap.org/mailarchive/linux-kernel/2007/6/14/103765
http://kerneltrap.org/node/16175
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/217920
https://bugs.launchpad.net/ubuntu/+bug/164183
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/229747
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/159521
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/187146
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/221437
https://bugs.launchpad.net/ubuntu/+bug/226600

SUSE:

ata errors, system freeze
https://bugzilla.novell.com/show_bug.cgi?id=393675

System lockup with concurrent acces to SATA disks on Promise PDC20378
http://lists.opensuse.org/opensuse-bugs/2008-02/msg03458.html

Kernel panic / system hang / sata_promise
https://bugzilla.novell.com/show_bug.cgi?id=350907

DELL Poweredge 2970 hangs sometimes (ata1)
https://bugzilla.novell.com/show_bug.cgi?id=359333

Fedora:
ata device crashing system in Fedora 8
http://www.experts-exchange.com/OS/Linux/Distributions/Fedora/Q_23125450.html

problème de mise à jour (update problem)
http://forums.fedora-fr.org/viewtopic.php?pid=253930

Kernel 2.6.24.x boot problem - Anyone , Any idea
http://fcp.surfsite.org/modules/newbb/viewtopic.php?viewmode=flat&order=ASC&topic_id=54760&forum=10

I thought that with the newest hard drives supporting NCQ this would not occur, but it is the same there too:

"With this kernel I’m getting frequent temporary freezes (system comes back responsive after a minute or so…)."
http://kerneltrap.org/mailarchive/linux-kernel/2008/1/8/546296

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Kevin P (kevin-cybercolloids) wrote :

I no longer have this problem after replacing the SATA data cable with a
better quality latch cable. Recently I rebuilt the computer with the
problem and used a cheap cable again, how quickly you forget - the
problem recurred. The computer is now running OK with a set of good
quality cables. The computer is in an office and is used every day, all
day so it is getting a good workout - and it's stable.

Kevin.

Leann Ogasawara wrote:
> The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the
> upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would
> appreciate it if you could please test this newer 2.6.27 Ubuntu kernel.
> There are two ways you should be able to test:
>
> 1) If you are comfortable installing packages on your own, the linux-
> image-2.6.27-* package is currently available for you to install and
> test.
>
> --or--
>
> 2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer
> 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4.
> Please watch http://www.ubuntu.com/testing for Alpha5 to be announced.
> You should then be able to test via a LiveCD.
>
> Please let us know immediately if this newer 2.6.27 kernel resolves the
> bug reported here or if the issue remains. More importantly, please
> open a new bug report for each new bug/regression introduced by the
> 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please
> specifically note if the issue does or does not appear in the 2.6.26
> kernel. Thanks again, we really appreciate your help and feedback.
>
> ** Tags added: cft-2.6.27
>
>

Adam (adam.russell) wrote :

I am unable to contribute any further to this bug, as I am no longer using the hardware in question. I will be unsubscribing.

Christopher Berner (cberner) wrote :

I just started experiencing this bug after I upgraded from Hardy to Intrepid. Let me know if there is any information I can provide.

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.


I've read through this and countless other posts about this issue (which I am also experiencing). It truly does render the computer useless.

The only big thing that jumps out at me in this thread is that everyone is mentioning problems with their CD/DVD drives, and upgrading firmware, etc. So... I don't even HAVE an optical drive installed on the system, and this problem happens every time I boot.

I do have an Intel chipset, and I was thinking I'd try blacklisting the ata_piix module, but currently I am remote (over ssh), and the computer takes several minutes to respond even to shell commands. So I'd venture to say this isn't only a problem with optical drives. All I have is 3 internal SATA drives. No external USB, no optical, no floppy. I also tried stopping the HAL daemon, and initially it seemed JUST A LITTLE better, although the messages still show up in the syslog. I don't know if I should try to blacklist ata_piix while I'm remote, as I've noticed in reading through posts that it might render the machine unbootable.

Anything else I can try? This is running Intrepid with all recent updates, 3 internal SATA drives, two of which are using mdadm for 3 RAID1 partitions and 1 RAID0 partition. I've run countless diagnostics and all hardware appears normal. Also, I've removed the third (non-RAID) drive and the problem still happens. The errors:

Jan 5 10:10:15 kaya kernel: [ 6132.945046] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 5 10:10:15 kaya kernel: [ 6132.945062] ata3.00: cmd c8/00:08:61:36:db/00:00:00:00:00/e5 tag 0 dma 4096 in
Jan 5 10:10:15 kaya kernel: [ 6132.945066] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 5 10:10:15 kaya kernel: [ 6132.945074] ata3.00: status: { DRDY }
Jan 5 10:10:15 kaya kernel: [ 6132.945090] ata3: soft resetting link
Jan 5 10:10:15 kaya kernel: [ 6133.181465] ata3.00: configured for UDMA/100
Jan 5 10:10:15 kaya kernel: [ 6133.181487] ata3: EH complete
Jan 5 10:10:15 kaya kernel: [ 6133.194973] sd 2:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
Jan 5 10:10:15 kaya kernel: [ 6133.195229] sd 2:0:0:0: [sda] Write Protect is off
Jan 5 10:10:15 kaya kernel: [ 6133.195236] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
Jan 5 10:10:15 kaya kernel: [ 6133.221708] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 5 10:10:46 kaya kernel: [ 6163.908049] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 5 10:10:46 kaya kernel: [ 6163.908063] ata3.00: cmd c8/00:20:89:30:db/00:00:00:00:00/e5 tag 0 dma 16384 in
Jan 5 10:10:46 kaya kernel: [ 6163.908065] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 5 10:10:46 kaya kernel: [ 6163.908071] ata3.00: status: { DRDY }
Jan 5 10:10:46 kaya kernel: [ 6163.908083] ata3: soft resetting link
Jan 5 10:10:46 kaya kernel: [ 6164.144466] ata3.00: configured for UDMA/100
Jan 5 10:10:46 kaya kernel: [ 6164.144487] ata3: EH complete
Jan 5 10:10:46 kaya kernel: [ 6164.161017] sd 2:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
Jan 5 10:10:46 kaya kernel: [ 6164.161303] sd 2:0:0:0: [sda] Write Pr...
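
For anyone trying to tell how often these resets happen and on which port, here is a minimal sketch (not part of the original report) that counts the "exception ... frozen" lines per ATA device in a saved kernel log. The function name and the regular expression are assumptions based only on the log format pasted in this thread; other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Matches libata error-handler lines such as:
#   ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
# The pattern is an assumption derived from the logs quoted in this thread.
EXCEPTION_RE = re.compile(r"\b(ata\d+(?:\.\d+)?): exception Emask .* frozen\s*$")


def count_frozen_exceptions(lines):
    """Count 'frozen' exception lines per ATA device (e.g. 'ata3.00')."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the log excerpts in this thread.
sample = [
    "Jan 5 10:10:15 kaya kernel: [ 6132.945046] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen\n",
    "Jan 5 10:10:15 kaya kernel: [ 6132.945090] ata3: soft resetting link\n",
    "Jan 5 10:10:46 kaya kernel: [ 6163.908049] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen\n",
    "[ 231.478375] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
]

counts = count_frozen_exceptions(sample)
for device, n in counts.most_common():
    print(f"{device}: {n} frozen exception(s)")
```

To run it against a real log, pass an open file object instead of the sample list. The usual location is /var/log/syslog, but that path is an assumption and varies by distribution. A count concentrated on one device points at that port, its cable, or its power feed rather than at the system as a whole.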


So it would appear that this is, in fact, a hardware issue (at least in my case). Like I said, my memory, motherboard, processor, and HDDs all tested good. The one thing I never considered: the power supply.

It would appear that one of my drives was consistently receiving too little power. It just occurred to me out of the blue when I was sitting there dealing with this problem and heard what I thought were my fans spinning down, then back up again, within 2 seconds.

I had been using a splitter that splits one 4-pin Molex connector into two SATA power connectors. I rearranged the wiring so that each 4-pin Molex connector powers only one HDD. Since then I have had 2 days of system uptime with no recurrence of this problem.

I've since ordered a new power supply, as this one is most certainly about to die.

Thanks for the responses, everyone. Sorry to waste time and effort!

Changed in linux:
status: Confirmed → Invalid
Bryan Wu (cooloney) wrote :

As sideshowmel reported, this is not a valid bug, so I am closing it as Invalid.

-Bryan

Changed in linux (Ubuntu):
status: Confirmed → Invalid
kamahat (kamahat) wrote :

I had the same problem, and it took me a long time to find a solution: flashing the firmware of my laptop's CD-ROM drive.

I've got an Acer Aspire 9410; the CD-ROM drive is a TSSTCorp TS-L632D.
Acer only provides one firmware: AC01.

As stated in the other post, cross-flashing is okay. I flashed with "SC04 - Original Samsung Computer Firmware"
and all my problems went away.

a source to find firmwares : http://backfire.rpc1.org/tsstcorp/index.php?path=TS-L632D/

PS: I did the cross-flash under Windows with the supplied binary "sfdnwin" and the option "-nocheck".
Some more information here: http://forum.rpc1.org/viewtopic.php?p=37412#37417

Changed in linux:
importance: Unknown → High