Gutsy kernel is causing data loss, somehow related to SATA (media error)

Bug #151938 reported by ubuntu_demon
Affects: linux-source-2.6.22 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

I was dual-booting Feisty and Gutsy during the last week of July. After a couple of days I discovered this kind of error in my Feisty syslog:

Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: (BMDMA stat 0x25)
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] ata1.00: cmd c8/00:40:b8:29:b8/00:00:00:00:00/e5 tag 0 cdb 0x0 data 32768 in
Jul 30 21:43:37 ubuntu kernel: [ 882.592000] res 51/40:40:b8:29:b8/00:00:00:00:00/e5 Emask 0x9 (media error)
Jul 30 21:43:37 ubuntu kernel: [ 882.600000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1.00: configured for UDMA/133
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] sd 0:0:0:0: SCSI error: return code = 0x08000002
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] sda: Current [descriptor]: sense key: Medium Error
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Additional sense: Unrecovered read error - auto reallocate failed
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Descriptor sense data with sense descriptors (in hex):
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] 05 b8 29 b8
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] end_request: I/O error, dev sda, sector 95955384
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442652
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442653
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442654
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442655
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442656
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442657
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442658
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442659
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442660
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] Buffer I/O error on device sda3, logical block 28442661
Jul 30 21:43:37 ubuntu kernel: [ 882.608000] ata1: EH complete
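The `res` line in the log above encodes the address of the failing sector. A minimal sketch of how to decode it, assuming the usual libata field order `status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device` (the helper name `decode_res_lba` is made up for illustration). The command here is `c8` (READ DMA), a 28-bit command, so bits 24-27 of the LBA are taken from the low nibble of the device register:

```python
def decode_res_lba(res, lba48=False):
    """Decode the LBA from a libata 'res' taskfile string.

    Assumed layout (illustrative):
    status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    _status, low, hob, device = res.split("/")
    _error, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, _hnsect, hlbal, hlbam, hlbah = (int(x, 16) for x in hob.split(":"))

    lba = lbal | (lbam << 8) | (lbah << 16)
    if lba48:
        # 48-bit commands carry the upper 24 bits in the "hob" registers.
        lba |= (hlbal << 24) | (hlbam << 32) | (hlbah << 40)
    else:
        # 28-bit commands carry bits 24-27 in the low nibble of the device register.
        lba |= (int(device, 16) & 0x0F) << 24
    return lba


# 'res' value copied from the Jul 30 log; the result equals the sector
# reported by "end_request: I/O error, dev sda, sector 95955384".
print(decode_res_lba("51/40:40:b8:29:b8/00:00:00:00:00/e5"))
```

The decoded value is the same address that appears in the last four sense bytes (`05 b8 29 b8`), so the ATA layer, the SCSI sense data and the block layer all point at one sector.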

fsck was reporting all kinds of problems, and files turned up in lost+found. I was afraid my hard drive was dying.

To be sure this problem was not related to Gutsy in some way, I stopped booting into Gutsy and set fsck to check my hard disk daily.
The problems stopped! I continued to fsck my hard disk regularly for two months, until I was sure there was nothing wrong with the drive.

A couple of days ago (Saturday) I installed Gutsy again (Gutsy Beta), on the same partition I had run Gutsy on two months before. Again I was dual-booting between Feisty and Gutsy. This time I made sure to mount only the Gutsy partition in Gutsy and no other partitions.

The same type of error showed up in my Feisty syslog (I had /dev/sda1, my Gutsy partition, mounted):

Oct 11 11:50:03 T-2500 kernel: [ 15.756000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 17.088000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 18.400000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 19.720000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 21.032000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 22.352000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 22.376000] sd 0:0:0:0: SCSI error: return code = 0x08000002
Oct 11 11:50:03 T-2500 kernel: [ 22.376000] Additional sense: Unrecovered read error - auto reallocate failed
Oct 11 11:50:03 T-2500 kernel: [ 22.376000] end_request: I/O error, dev sda, sector 24437528
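When a log contains many `res` lines, it helps to check whether they refer to many different sectors or to one sector being retried. A rough sketch that groups the error lines by their result taskfile; the regular expression and the function name `count_failures` are assumptions made for this example, and the sample lines are copied from the Oct 11 excerpt above:

```python
import re
from collections import Counter

# Matches e.g. "res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)"
RES_RE = re.compile(r"res (\S+) Emask 0x[0-9a-f]+ \(([^)]+)\)")


def count_failures(lines):
    """Count how often each (result taskfile, error class) pair occurs."""
    counts = Counter()
    for line in lines:
        match = RES_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


SAMPLE = """\
Oct 11 11:50:03 T-2500 kernel: [ 15.756000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 17.088000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 18.400000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 19.720000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 21.032000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 22.352000] res 51/40:08:18:e3:74/00:00:00:00:00/e1 Emask 0x9 (media error)
Oct 11 11:50:03 T-2500 kernel: [ 22.376000] end_request: I/O error, dev sda, sector 24437528
"""

for (taskfile, kind), n in count_failures(SAMPLE.splitlines()).items():
    print(n, kind, taskfile)
```

For this excerpt all six media errors share one result taskfile, which is consistent with a single read being retried before the block layer reports one failed sector.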

Maybe this bug is related to the following one:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/145691

ubuntu_demon (ubuntu-demon) wrote :

Feisty syslog of July 30th

ubuntu_demon (ubuntu-demon) wrote :

Feisty syslog of October 11th

ubuntu_demon (ubuntu-demon) wrote :

smartmontools disk health (Gutsy):

$ sudo smartctl -H /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

ubuntu_demon (ubuntu-demon) wrote :

smartmontools disk information (Gutsy):

$ sudo smartctl -i /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HM080JI
Serial Number: S082J10YA48024
Firmware Version: YC100-04
User Capacity: 80,060,424,192 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Fri Oct 12 12:44:11 2007 CEST

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for
details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

ubuntu_demon (ubuntu-demon) wrote :

Gutsy kern.log

ubuntu_demon (ubuntu-demon) wrote :

Gutsy kern.log.0

ubuntu_demon (ubuntu-demon) wrote :

Gutsy lspci -vvnn

ubuntu_demon (ubuntu-demon) wrote :

Gutsy
cat /proc/version_signature

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg

ubuntu_demon (ubuntu-demon) wrote :

Gutsy uname -a

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg.0

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg.1.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg.2.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg.3.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy dmesg.4.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.0

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.1.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.2.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.3.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.4.gz

ubuntu_demon (ubuntu-demon) wrote :

Gutsy syslog.5.gz

ubuntu_demon (ubuntu-demon) wrote :

More information about my laptop (I haven't updated it in a while) : https://wiki.ubuntu.com/LaptopTestingTeam/AhtecStyleX30Duo

ubuntu_demon (ubuntu-demon) wrote :

I heard that Gutsy has a new SATA subsystem. Maybe that's related to this problem ?

ubuntu_demon (ubuntu-demon) wrote :

I'm not sure whether I get these media-error bugs in Gutsy too.

Is there any relevant additional information I should add ?

I will be on #ubuntu-devel , #ubuntu-bugs and #ubuntu-kernel for a bit tonight

ubuntu_demon (ubuntu-demon) wrote :

I'm not sure whether I get these media-error bugs in Gutsy too.

My guess about what's happening :

Gutsy interacts with SATA drives differently. It puts my SATA drive into some "mode" that works fine for Gutsy and persists across soft resets of the machine. Feisty can't handle this "mode" and corrupts the filesystem.

Matthew Garrett (mjg59) wrote :

Additional sense: Unrecovered read error - auto reallocate failed

indicates that your drive is claiming that it has physical errors. Can you reproduce this with another drive?

ubuntu_demon (ubuntu-demon) wrote :

to Matthew Garrett :

It's my laptop that is dual booting Feisty and Gutsy so I can't easily switch drives. (I might be voiding warranty and I don't have any 2.5" drives lying around)

I'm totally sure that this problem is related to Gutsy. During the roughly 2 months when I didn't boot Gutsy at all, I didn't encounter any problems.

fsck didn't produce any errors this time. Turning the laptop off and on again (hard reset) made the media errors disappear.

ubuntu_demon (ubuntu-demon) wrote :

/sda1 Gutsy
/sda2 Feisty
/sda3 home

All three partitions experienced media errors in Feisty around July 30th.

ubuntu_demon (ubuntu-demon) wrote :

I have seen the media errors in Feisty (I can't remember whether I have also seen them in Gutsy). As a result of these media errors, fsck was reporting problems and files were ending up in lost+found.

ubuntu_demon (ubuntu-demon) wrote :

Both Gutsy and Feisty see the disk as the same size

Gutsy (I catted the dmesg from Feisty) :

$ cat dmesg | grep sectors
[ 5.692000] ata1.00: 156368016 sectors, multi 16: LBA48 NCQ (depth 0/32)
[ 6.216000] sd 0:0:0:0: [sda] 156368016 512-byte hardware sectors (80060 MB)
[ 6.216000] sd 0:0:0:0: [sda] 156368016 512-byte hardware sectors (80060 MB)

Feisty :

$ dmesg | grep sectors
[ 4.028000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.028000] ata1.00: 156368016 sectors, multi 16: LBA48 NCQ (depth 0/32)
[ 4.036000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.524000] SCSI device sda: 156368016 512-byte hdwr sectors (80060 MB)
[ 4.524000] SCSI device sda: 156368016 512-byte hdwr sectors (80060 MB)
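The comparison above can also be done mechanically. A small sketch, assuming the two dmesg outputs are available as text; the regular expression and the helper name `sector_counts` are illustrative. It also checks that the sector count times 512 bytes matches the "User Capacity" that smartctl reported for this drive:

```python
import re

# Gutsy prints "512-byte hardware sectors", Feisty prints "512-byte hdwr sectors".
SECTORS_RE = re.compile(r"(\d+) 512-byte (?:hardware|hdwr) sectors")


def sector_counts(dmesg_text):
    """Return the set of distinct sector counts mentioned in a dmesg dump."""
    return {int(n) for n in SECTORS_RE.findall(dmesg_text)}


gutsy = "[ 6.216000] sd 0:0:0:0: [sda] 156368016 512-byte hardware sectors (80060 MB)"
feisty = "[ 4.524000] SCSI device sda: 156368016 512-byte hdwr sectors (80060 MB)"

print(sector_counts(gutsy) == sector_counts(feisty))  # same size seen by both kernels
print(156368016 * 512)  # bytes; compare with smartctl's "User Capacity"
```

If the two kernels disagreed about the disk size (for example because of a host protected area), the two sets would differ.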

ubuntu_demon (ubuntu-demon) wrote :

I can find hpa in my Feisty dmesg but not in my Gutsy dmesg.

Feisty :
$ dmesg | grep ata_hpa
[ 4.028000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.036000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016

ubuntu_demon (ubuntu-demon) wrote :

According to Ben Collins hpa isn't my problem because I don't have a "host protected area".

About hpa not showing up in my Gutsy dmesg:
"BenC: ubuntu_demon: just a missing printk in gutsy's stock hpa patch"

ubuntu_demon (ubuntu-demon) wrote :

Personally I don't think my harddrive is dying because :

* smartctl says that my harddisk health is okay
* When only booting Feisty for 2 months I didn't experience any problems
* My laptop and harddrive are about 1 year old.

Tobias Heinemann (theine) wrote :

I'm getting the same error messages. smartctl output:

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MK1237GSX
Serial Number: 67JGTG1IT
Firmware Version: DL140D
User Capacity: 120,034,123,776 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Oct 13 13:38:29 2007 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Tobias Heinemann (theine) wrote :

I'm using the AHCI interface by the way.

ubuntu_demon (ubuntu-demon) wrote :

to Tobias Heinemann :

What version of Ubuntu are you running ?

Are you dual-booting Feisty and Gutsy ? Do you think your problem is related to dual-booting Feisty and Gutsy ?

What's the output of (replace /dev/sda with your harddrive) :
$ sudo smartctl -H /dev/sda

Thanks

ubuntu_demon (ubuntu-demon) wrote :

to Tobias Heinemann :

I'm ubuntu_demon on IRC

Tobias Heinemann (theine) wrote :

I'm running Gutsy. Not dual-booting.

Output of smartctl -H /dev/sda:

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Cheers

ubuntu_demon (ubuntu-demon) wrote :

to Tobias Heinemann :

How old is your harddrive ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after running Gutsy ? If that's the case how long were you running Gutsy before these media errors were appearing ?
Did you have the problem already in Feisty ?

Gerardo Cruz (gcruz) wrote :

I was having the same problem as you guys. It started with Feisty, like two months ago or so. Everything would slow down or even freeze (not always the mouse).

I stopped using the computer as I thought the hard drive was dying, but then, one week ago I used some app in the Ultimate Boot CD that found some errors on my hard drive and it let me erase it all in order to repair it (this tool was specific for my Hitachi hard disk). It took a while, but now everything is working fine. I had Feisty/WinXp before (never booted Windows, though) and now Gutsy/WinXp (idem).

Smartctl says the test is PASSED, too.

Tobias Heinemann (theine) wrote :

My hard drive is pretty new. I got this laptop about 3 months ago.

I've been running Gutsy since around Tribe 2 I guess. That's just a rough estimate though.

These error messages appeared when I switched on AHCI mode in BIOS, which I did about two weeks ago.

Before that, I never saw these error messages -- neither with Gutsy nor with Feisty.

ubuntu_demon (ubuntu-demon) wrote :

to Gerardo Cruz :

So when you first started having your problem you were running Feisty and you had run Feisty already for some time without any problems ? And you didn't run Gutsy or dual boot with Gutsy when your problems started ?

If that's the case, the application you used probably marked some bad sectors as not usable. Your hard drive might be slowly dying, and other bad sectors will show up eventually. If so, you should probably switch hard drives to prevent data loss.

My problem seems to correlate with running Gutsy, which is why I think Gutsy might be the reason. Maybe Tobias Heinemann has the same problem.

Maybe all three of us are unlucky and all our drives are dying. Let's try to gather some more information first.

Gerardo Cruz (gcruz) wrote :

-- So when you first started having your problem you were running Feisty and you had run Feisty already for some time without any problems ?
Yes, that's right.

-- And you didn't run Gutsy or dual boot with Gutsy when your problems started ?
No, I didn't.

My hard drive is two and a half years old. I wouldn't be surprised if it was dying. I'll keep using it as long as it is stable. Then I'll think about replacing it or buying a new laptop. I'm subscribing to this bug, in case you need more info. Good luck.

Marius Gedminas (mgedmin) wrote :

I would suggest looking at the SMART error log and running the SMART self-tests instead of just relying on the overall self-assessment
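As a rough illustration of that suggestion, the sketch below builds the relevant smartctl invocations. It assumes smartmontools is installed and the drive is /dev/sda; the helper name `smartctl_commands` is made up for this example. `-l error`, `-t long`, `-l selftest` and `-A` are standard smartctl options; the commands are only printed here, since running them needs root:

```python
import subprocess


def smartctl_commands(device):
    """Return smartctl invocations that go beyond the overall -H verdict."""
    return {
        "error_log": ["smartctl", "-l", "error", device],        # logged ATA errors
        "start_long_test": ["smartctl", "-t", "long", device],   # full surface self-test
        "selftest_log": ["smartctl", "-l", "selftest", device],  # results of past self-tests
        "attributes": ["smartctl", "-A", device],                # e.g. reallocated/pending sectors
    }


def run(cmd):
    """Run one command and return its output (requires root and smartmontools)."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    for name, cmd in smartctl_commands("/dev/sda").items():
        print(name + ":", " ".join(cmd))
```

The long self-test runs in the background on the drive itself, so its result only appears in the self-test log after the test has finished.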

Loïc Minier (lool) wrote :

Hi,

I was the victim of similar error messages when my hard disk was /starting/ to die, and even after this had been happening for some weeks, the hard disk would report no errors in smartctl (well, some, but about as many as when the hard disk was new, and this is supposedly normal as even brand new hard disks have defective sectors).

When it finally died, the hard disk test program of my disk manufacturer (Hitachi) reported this clearly, but the program didn't find any errors before I got some serious I/O errors.

Could you please try running your hard disk manufacturer's test program? For Samsung this might be:
http://www.samsung.com/global/business/hdd/support/utilities/Support_HUTIL.html

Thanks,

Dread Knight (dread.knight) wrote :

I have this sort of problem with Gutsy too.

ubuntu_demon (ubuntu-demon) wrote :

to Dread Knight :

How old is your harddrive ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after running Gutsy ? If that's the case how long were you running Gutsy before these media errors were appearing ?
Did you have the problem already in Feisty ?
Are you dual-booting with Feisty ?
What is the output of smartctl -H /dev/sda ?

Josip Lazic (josip) wrote :

I have this problem too. My English is not very good; I hope you will understand.

My old Seagate ST3250820AS (XFS) started to fail, so I ran xfs_repair -L. It could not finish because of I/O errors, so I bought a new disk (Seagate-Maxtor STM3250310AS) and ran:

dd_rescue -v /dev/sde1 /dev/sdd1

After that I managed to rescue all of my data. Then I formatted the old disk as ext3 and copied all of my data to it (just to have a backup). After just one day the same errors appeared again, but this time on the brand new disk! I thought the problem was in XFS, so I formatted the new disk as ext3 and copied all data from the old disk to the new one with cp -a. This did not end well, because of a lot of I/O errors. OK, I said, I will clone the old disk to the new one with dd, so I did that. But fsck.ext3 finishes without errors on the old disk, while on the new disk it reports a lot of errors. Shouldn't a cloned disk be identical?
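One way to answer the question of whether a clone is identical is to compare source and copy directly instead of relying on fsck. A minimal sketch, assuming both devices (or image files) are readable; the function name `first_difference` and the chunk size are illustrative choices:

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                # Narrow the mismatch down to the exact byte.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # One input is a prefix of the other: they differ where it ends.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None  # both inputs ended without any difference
            offset += len(block_a)


# Example call with the partitions from the dd_rescue command (needs root):
# print(first_difference("/dev/sde1", "/dev/sdd1"))
```

A read error on either device surfaces as an `OSError`, which is itself a hint about where the problem lies. A byte mismatch right after cloning would suggest that data is being corrupted on the way to or from the disk rather than by the filesystem.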

Motherboard is Abit NF7-S (RAID bus controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02))
DDR RAM (2x512MB) is OK (I ran Memtest)

Before Gutsy I had Gentoo, and everything worked OK for a long time.

ubuntu_demon (ubuntu-demon) wrote :

to Josip Lazic :

How old is your new harddrive (Seagate-Maxtor STM3250310AS) ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after running Gutsy ? If that's the case how long were you running Gutsy before these media errors were appearing ?
Are you dual-booting with Feisty ?
What is the output of smartctl -H /dev/sda ?

Josip Lazic (josip) wrote :

to ubuntu_demon:

After long testing I came to the conclusion that the problem was my Sil3112A SATA controller, and after replacing the motherboard everything works fine.

Sorry for the false alarm.

ubuntu_demon (ubuntu-demon) wrote :

I have tested my SAMSUNG HM080JI harddrive with HUTIL 2.03 (included on Ultimate Boot CD 4.11). All tests passed (including S.M.A.R.T.) except the last test (entire surface scan) which found a couple of ecc errors.

HUTIL suggested erasing the HDD and scanning again for errors, so I'm going to do that. If I see errors again, I'm going to use my warranty to get a new hard drive. If I don't, I'm going to install Gutsy and use HUTIL regularly to scan for errors. If errors show up then, I'll use my warranty; if they don't, it might still be some strange bug.

ubuntu_demon (ubuntu-demon) wrote :

I'm going to use the HUTIL "Erase HDD" tool which does a low level format.

ubuntu_demon (ubuntu-demon) wrote :

HUTIL 2.03 (included on Ultimate Boot CD 4.11) confirms that the harddisk is slowly dying

Changed in linux-source-2.6.22:
status: New → Invalid

Rafiot (raf51) wrote :

My English is very poor but I hope that I can help.

I had the same problem, but I saw these warnings too late, and after 1 week with Gutsy my disk was really destroyed... I have used Ubuntu since Edgy and had no problems. My laptop (Acer Aspire 1692) was 2 years old.

viper233 (viper233) wrote :

Same error has just appeared

Linux Machine-Name 2.6.22-14-generic #1 SMP Sun Oct 14 23:05:12 GMT 2007 i686 GNU/Linux

It went to do a monthly check on my RAID 1 device

 1 Time(s): [473365.224795] md: data-check of RAID array md0
 1 Time(s): [473365.224800] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
 1 Time(s): [473365.224802] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
 1 Time(s): [473365.224807] md: using 128k window, over a total of 488383936 blocks.
 1 Time(s): [473578.174071] ata2.00: exception Emask 0x2 SAct 0xdc3f SErr 0x0 action 0x2 frozen
 1 Time(s): [473578.174076] ata2.00: (spurious completions during NCQ issue=0x0 SAct=0xdc3f FIS=004040a1:00000040)
 1 Time(s): [473578.174080] ata2.00: cmd 60/80:00:3f:92:e2/00:00:01:00:00/40 tag 0 cdb 0x0 data 65536 in
 1 Time(s): [473578.174081] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174084] ata2.00: cmd 60/80:08:bf:40:df/00:00:01:00:00/40 tag 1 cdb 0x0 data 65536 in
 1 Time(s): [473578.174085] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174088] ata2.00: cmd 60/80:10:3f:61:e2/00:00:01:00:00/40 tag 2 cdb 0x0 data 65536 in
 1 Time(s): [473578.174089] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174092] ata2.00: cmd 60/80:18:bf:e4:df/00:00:01:00:00/40 tag 3 cdb 0x0 data 65536 in
 1 Time(s): [473578.174093] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174097] ata2.00: cmd 60/80:20:3f:8f:e2/00:00:01:00:00/40 tag 4 cdb 0x0 data 65536 in
 1 Time(s): [473578.174098] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174101] ata2.00: cmd 60/80:28:3f:90:e2/00:00:01:00:00/40 tag 5 cdb 0x0 data 65536 in
 1 Time(s): [473578.174102] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174105] ata2.00: cmd 60/80:50:bf:8f:e2/00:00:01:00:00/40 tag 10 cdb 0x0 data 65536 in
 1 Time(s): [473578.174106] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174109] ata2.00: cmd 60/80:58:bf:90:e2/00:00:01:00:00/40 tag 11 cdb 0x0 data 65536 in
 1 Time(s): [473578.174110] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174113] ata2.00: cmd 60/80:60:3f:91:e2/00:00:01:00:00/40 tag 12 cdb 0x0 data 65536 in
 1 Time(s): [473578.174114] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174117] ata2.00: cmd 60/80:70:bf:8e:e2/00:00:01:00:00/40 tag 14 cdb 0x0 data 65536 in
 1 Time(s): [473578.174118] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.174122] ata2.00: cmd 60/80:78:bf:91:e2/00:00:01:00:00/40 tag 15 cdb 0x0 data 65536 in
 1 Time(s): [473578.174123] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
 1 Time(s): [473578.481880] ata2: soft resetting port
 1 Time(s): [473578.653440] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
 1 Time(s): [473578.745461] ata2.00: c...
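A note on reading this log: `SAct` is a bitmask of the NCQ tags that were outstanding when the exception was raised. The sketch below (helper names `sact_tags` and `ncq_tag` are hypothetical) expands the mask and, assuming the READ FPDMA QUEUED convention for opcode `60` where the tag sits in bits 7:3 of the count field (the third field of the `cmd` string), reads the tag back out of a command line:

```python
def sact_tags(sact):
    """Expand an SActive bitmask into the list of outstanding NCQ tags."""
    return [tag for tag in range(32) if sact & (1 << tag)]


def ncq_tag(cmd):
    """Extract the NCQ tag from a 'cmd' taskfile string of an FPDMA command."""
    nsect = int(cmd.split("/")[1].split(":")[1], 16)
    return nsect >> 3  # tag is carried in bits 7:3 of the count field


print(sact_tags(0xDC3F))
# [0, 1, 2, 3, 4, 5, 10, 11, 12, 14, 15]
print(ncq_tag("60/80:50:bf:8f:e2/00:00:01:00:00/40"))
# 10
```

The expanded mask lists exactly the tags that appear in the `cmd` lines above, which fits the message: every queued command was aborted after the spurious completion, and none of them reports a media error.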


hovang (dan-hovang) wrote :

I also have these messages in the log.

My drive is approximately 1 month old. It's part of a RAID5 array of 4 disks of different brands. The disk that reports errors is the first in the array, but the second in the system (/dev/sdb). It's a Seagate Barracuda 320 GB 7,200 RPM SATA/300 disk.

The system is Ubuntu 7.04 server, recently upgraded to 7.10 using apt-get dist-upgrade. I did not see these messages before the upgrade to 7.10. On the other hand, I upgraded only a few weeks ago, in conjunction with installing the RAID system. There are no other operating systems on the same computer.

Here is the output of smartctl:

dan@vodka:~$ sudo smartctl -H /dev/sdb
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART Health Status: OK

I've had the messages at least twice. Both times the same drive (/dev/sdb). Here are the latest log messages.

Nov 16 19:30:06 vodka kernel: [438824.130744] res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:06 vodka kernel: [438824.181477] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:06 vodka kernel: [438824.264642] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:06 vodka kernel: [438824.264651] ata3.00: configured for UDMA/133
Nov 16 19:30:06 vodka kernel: [438824.264661] ata3: EH complete
Nov 16 19:30:08 vodka kernel: [438826.193215] res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:08 vodka kernel: [438826.252267] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:08 vodka kernel: [438826.327114] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:08 vodka kernel: [438826.327124] ata3.00: configured for UDMA/133
Nov 16 19:30:08 vodka kernel: [438826.327137] ata3: EH complete
Nov 16 19:30:10 vodka kernel: [438828.247367] res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:10 vodka kernel: [438828.306421] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:10 vodka kernel: [438828.381263] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:10 vodka kernel: [438828.381270] ata3.00: configured for UDMA/133
Nov 16 19:30:10 vodka kernel: [438828.381280] ata3: EH complete
Nov 16 19:30:12 vodka kernel: [438830.301525] res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:12 vodka kernel: [438830.343941] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:12 vodka kernel: [438830.418787] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:12 vodka kernel: [438830.418794] ata3.00: configured for UDMA/133
Nov 16 19:30:12 vodka kernel: [438830.418804] ata3: EH complete
Nov 16 19:30:14 vodka kernel: [438832.339049] res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:14 vodka kernel: [438832.381464] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:14 vodka kernel: [438832.447993...


ubuntu_demon (ubuntu-demon) wrote :

to : hovang and viper233

Your disks are probably starting to die. Start backing up your data.

You should use the tool recommended by your harddisk manufacturer to make sure. If errors are found then do a low-level format using the tool recommended by your harddisk manufacturer. Then check again for errors. If you can still find errors the second time then it's probably time to start shopping for a new harddrive.

wangrui (cnsdqdwangrui) wrote :

Hi ubuntu_demon ,

I have a similar problem. When I copied a file to another disk, I got an input/output error. Then I found that whenever I run the following command

sudo dd if=/dev/sdb bs=512 skip=1382533521 count=1 | hexdump -C

I get an input/output error. But I bought this hard disk only a few weeks ago, and I did a disk scan when I bought it. The hard disk is also well protected physically. Does this mean I have to return it? In my tests, only sectors 1382533521 to 1382533527 have this problem. Can I just stop using these sectors? Or does this mean other sectors are likely to develop the same kind of problem in the near future, so I should exchange the disk for another one?
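The per-sector dd test described here can be automated over a range. A minimal sketch, assuming 512-byte sectors, a Unix system (for `os.pread`) and read permission on the device; the function name `unreadable_sectors` is made up for illustration:

```python
import os


def unreadable_sectors(path, first, count, sector_size=512):
    """Try to read each sector in [first, first + count) and list the failures."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for sector in range(first, first + count):
            try:
                os.pread(fd, sector_size, sector * sector_size)
            except OSError:
                # The kernel reports an unreadable sector as EIO.
                bad.append(sector)
    finally:
        os.close(fd)
    return bad


# Example call around the sector from the dd command (needs root):
# print(unreadable_sectors("/dev/sdb", 1382533500, 64))
```

Reads through the page cache may be done in larger units than one sector, so neighbours of a bad sector can fail together; the exact boundaries of the reported range should be treated with some caution.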

Thanks

Here are the kernel messages:

[33528.071838] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[33528.071846] ata6.00: BMDMA stat 0x24
[33528.071853] ata6.00: cmd 25/00:08:8c:c9:67/00:00:52:00:00/e0 tag 0 dma 4096 in
[33528.071855] res 51/40:00:91:c9:67/40:00:52:00:00/00 Emask 0x9 (media error)
[33528.071859] ata6.00: status: { DRDY ERR }
[33528.071862] ata6.00: error: { UNC }
[33528.182483] ata6.00: configured for UDMA/133
[33528.182500] sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[33528.182505] sd 5:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
[33528.182511] Descriptor sense data with sense descriptors (in hex):
[33528.182513] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[33528.182523] 52 67 c9 91
[33528.182527] sd 5:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
[33528.182535] end_request: I/O error, dev sdb, sector 1382533521
[33528.182552] ata6: EH complete
[33528.184128] sd 5:0:0:0: [sdb] 1953525168 512-byte hardware sectors (1000205 MB)
[33528.184303] sd 5:0:0:0: [sdb] Write Protect is off
[33528.184307] sd 5:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[33528.184541] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[33528.184751] sd 5:0:0:0: [sdb] 1953525168 512-byte hardware sectors (1000205 MB)
[33528.184915] sd 5:0:0:0: [sdb] Write Protect is off
[33528.184919] sd 5:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[33528.185150] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
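The two lines of hex after "Descriptor sense data" in the log above carry the same information as the decoded text. A rough sketch of a parser for descriptor-format sense data, covering only the fields needed here (sense key, ASC/ASCQ and the information descriptor); the function name `parse_descriptor_sense` is an assumption for this example:

```python
def parse_descriptor_sense(hex_bytes):
    """Parse descriptor-format SCSI sense data given as space-separated hex."""
    data = bytes.fromhex(hex_bytes)
    if data[0] & 0x7F not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")

    result = {
        "sense_key": data[1] & 0x0F,  # 0x3 = Medium Error
        "asc": data[2],               # additional sense code
        "ascq": data[3],              # additional sense code qualifier
        "information": None,
    }
    pos, end = 8, 8 + data[7]         # byte 7 = length of the descriptor area
    while pos + 2 <= min(end, len(data)):
        dtype, dlen = data[pos], data[pos + 1]
        if dtype == 0x00:             # information descriptor: 8-byte value at +4
            result["information"] = int.from_bytes(data[pos + 4:pos + 12], "big")
        pos += 2 + dlen
    return result


sense = "72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 52 67 c9 91"
print(parse_descriptor_sense(sense))
```

For these bytes the information field is 1382533521, the same sector as in the `end_request` line, and ASC/ASCQ 0x11/0x04 corresponds to "Unrecovered read error - auto reallocate failed".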
