Bug #151938 “gutsy kernel is causing data-loss.somehow related to SATA (media-error)” : Bugs : linux-source-2.6.22 package : Ubuntu

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#1

feisty-july30th-syslog (242.2 KiB, text/plain)

Feisty syslog of July 30th

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#2

feist-oct11th-syslog (497.8 KiB, text/plain)

Feisty syslog of October 11th

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#3

smartmontools disk health (Gutsy) :

$ sudo smartctl -H /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#4

smartmontools disk information (Gutsy) :

$ sudo smartctl -i /dev/sda
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HM080JI
Serial Number: S082J10YA48024
Firmware Version: YC100-04
User Capacity: 80,060,424,192 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Fri Oct 12 12:44:11 2007 CEST

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for
details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#5

kern.log (556.8 KiB, text/plain)

Gutsy kern.log

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#6

kern.log.0 (117.9 KiB, text/plain)

Gutsy kern.log.0

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#7

lspci-vvnn.log (16.0 KiB, text/plain)

Gutsy lspci -vvnn

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#8

version.log (28 bytes, text/plain)

Gutsy
cat /proc/version_signature

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#9

dmesg.log (25.7 KiB, text/plain)

Gutsy dmesg

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#10

uname-a.log (82 bytes, text/plain)

Gutsy uname -a

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#11

dmesg (24.2 KiB, text/plain)

Gutsy dmesg

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#12

dmesg.0 (24.4 KiB, text/plain)

Gutsy dmesg.0

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#13

dmesg.1.gz (7.6 KiB, application/octet-stream)

Gutsy dmesg.1.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#14

dmesg.2.gz (7.7 KiB, application/octet-stream)

Gutsy dmesg.2.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#15

dmesg.3.gz (7.6 KiB, application/octet-stream)

Gutsy dmesg.3.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#16

dmesg.4.gz (7.7 KiB, application/octet-stream)

Gutsy dmesg.4.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#17

syslog (185.5 KiB, text/plain)

Gutsy syslog

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#18

syslog.0 (87.8 KiB, text/plain)

Gutsy syslog.0

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#19

syslog.1.gz (48.3 KiB, application/octet-stream)

Gutsy syslog.1.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#20

syslog.2.gz (35.5 KiB, application/octet-stream)

Gutsy syslog.2.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#21

syslog.3.gz (13.5 KiB, application/octet-stream)

Gutsy syslog.3.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#22

syslog.4.gz (28.0 KiB, application/octet-stream)

Gutsy syslog.4.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#23

syslog.5.gz (31.9 KiB, application/octet-stream)

Gutsy syslog.5.gz

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#24

More information about my laptop (I haven't updated it in a while) : https://wiki.ubuntu.com/LaptopTestingTeam/AhtecStyleX30Duo

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#25

I heard that Gutsy has a new SATA subsystem. Maybe that's related to this problem ?

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#26

I'm not sure whether I get these media-error bugs in Gutsy too.

Is there any relevant additional information I should add ?

I will be on #ubuntu-devel , #ubuntu-bugs and #ubuntu-kernel for a bit tonight

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#27

I'm not sure whether I get these media-error bugs in Gutsy too.

My guess about what's happening :

Gutsy interacts with SATA drives differently. It puts my SATA drive into some "mode" that works fine under Gutsy but persists across soft resets of the machine. Feisty can't handle this "mode", and the filesystem gets corrupted.

Matthew Garrett (mjg59) wrote on 2007-10-12:

#28

Additional sense: Unrecovered read error - auto reallocate failed

indicates that your drive is claiming that it has physical errors. Can you reproduce this with another drive?

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#29

to Matthew Garrett :

It's my laptop that is dual booting Feisty and Gutsy so I can't easily switch drives. (I might be voiding warranty and I don't have any 2.5" drives lying around)

I'm totally sure that this problem is related to Gutsy. During the roughly 2 months when I didn't boot Gutsy at all, I didn't encounter any problems.

fsck didn't produce any errors this time. Turning the laptop off and on again (hard reset) made the media errors disappear.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#30

/sda1 Gutsy
/sda2 Feisty
/sda3 home

All three partitions experienced media errors in Feisty around July 30th.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#31

I have seen the media errors in Feisty. (I can't remember whether I have also seen them in Gutsy.) The result of these media errors was that fsck reported problems (and files ended up in lost+found).

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#32

Both Gutsy and Feisty see the disk as the same size

Gutsy (I catted Gutsy's dmesg file while booted into Feisty) :

$ cat dmesg | grep sectors
[ 5.692000] ata1.00: 156368016 sectors, multi 16: LBA48 NCQ (depth 0/32)
[ 6.216000] sd 0:0:0:0: [sda] 156368016 512-byte hardware sectors (80060 MB)
[ 6.216000] sd 0:0:0:0: [sda] 156368016 512-byte hardware sectors (80060 MB)

Feisty :

$ dmesg | grep sectors
[ 4.028000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.028000] ata1.00: 156368016 sectors, multi 16: LBA48 NCQ (depth 0/32)
[ 4.036000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.524000] SCSI device sda: 156368016 512-byte hdwr sectors (80060 MB)
[ 4.524000] SCSI device sda: 156368016 512-byte hdwr sectors (80060 MB)
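
As a cross-check on the two listings above, the sector count can be converted to bytes and compared with the "User Capacity" line from smartctl -i in comment #4. A minimal sketch; sectors_to_bytes is a made-up helper name, and it assumes a shell with 64-bit arithmetic:

```shell
# Convert a count of 512-byte sectors (as printed by the kernel) to bytes.
# sectors_to_bytes is a hypothetical helper name, not part of any tool.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Sector count taken from the ata1.00 / sda lines above.
sectors_to_bytes 156368016
```

On such a shell this prints 80060424192, the same figure smartctl reports as 80,060,424,192 bytes, so both kernels and the drive agree on the size.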

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#33

I can find hpa in my Feisty dmesg but not in my Gutsy dmesg.

Feisty :
$ dmesg | grep ata_hpa
[ 4.028000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016
[ 4.036000] ata1.00: ata_hpa_resize 1: sectors = 156368016, hpa_sectors = 156368016

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#34

According to Ben Collins hpa isn't my problem because I don't have a "host protected area".

About hpa not showing up in my Gutsy dmesg :
"BenC: ubuntu_demon: just a missing printk in gutsy's stock hpa patch"

ubuntu_demon (ubuntu-demon) wrote on 2007-10-12:

#35

Personally I don't think my harddrive is dying because :

* smartctl says that my harddisk health is okay
* When only booting Feisty for 2 months I didn't experience any problems
* My laptop and harddrive are about 1 year old.

Tobias Heinemann (theine) wrote on 2007-10-13:

#36

I'm getting the same error messages. smartctl output:

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MK1237GSX
Serial Number: 67JGTG1IT
Firmware Version: DL140D
User Capacity: 120,034,123,776 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sat Oct 13 13:38:29 2007 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Tobias Heinemann (theine) wrote on 2007-10-13:

#37

I'm using the AHCI interface by the way.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-13:

#38

to Tobias Heinemann :

What version of Ubuntu are you running ?

Are you dual-booting Feisty and Gutsy ? Do you think your problem is related to dual-booting Feisty and Gutsy ?

What's the output of (replace /dev/sda with your harddrive) :
$ sudo smartctl -H /dev/sda

Thanks

ubuntu_demon (ubuntu-demon) wrote on 2007-10-13:

#39

to Tobias Heinemann :

I'm ubuntu_demon on IRC

Tobias Heinemann (theine) wrote on 2007-10-13:

#40

I'm running Gutsy. Not dual-booting.

Output of smartctl -H /dev/sda:

smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Cheers

ubuntu_demon (ubuntu-demon) wrote on 2007-10-13:

#41

to Tobias Heinemann :

How old is your harddrive ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after you began running Gutsy ? If so, how long had you been running Gutsy before these media errors started appearing ?
Did you have the problem already in Feisty ?

Gerardo Cruz (gcruz) wrote on 2007-10-13:

#42

I was having the same problem as you guys. It started with Feisty, like two months ago or so. Everything would slow down or even freeze (not always the mouse).

I stopped using the computer as I thought the hard drive was dying, but then, one week ago I used some app in the Ultimate Boot CD that found some errors on my hard drive and it let me erase it all in order to repair it (this tool was specific for my Hitachi hard disk). It took a while, but now everything is working fine. I had Feisty/WinXp before (never booted Windows, though) and now Gutsy/WinXp (idem).

Smartctl says the test is PASSED, too.

Tobias Heinemann (theine) wrote on 2007-10-13:

#43

My hard drive is pretty new. I got this laptop about 3 months ago.

I've been running Gutsy since around Tribe 2 I guess. That's just a rough estimate though.

These error messages appeared when I switched on AHCI mode in BIOS, which I did about two weeks ago.

Before that, I never saw these error messages -- neither with Gutsy nor with Feisty.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-13:

#44

to Gerardo Cruz :

So when you first started having your problem you were running Feisty and you had run Feisty already for some time without any problems ? And you didn't run Gutsy or dual boot with Gutsy when your problems started ?

If that's the case the application you used probably marked some bad sectors as not usable. Your harddrive might be slowly dying, and other bad sectors will show up eventually. If so, you should probably switch harddrives to prevent data-loss.

My problem seems to correlate with running Gutsy, which is why I think Gutsy might be the reason. Maybe Tobias Heinemann has the same problem.

Maybe all three of us are unlucky and all our drives are dying. Let's try to gather some more information first.

Gerardo Cruz (gcruz) wrote on 2007-10-13:

#45

-- So when you first started having your problem you were running Feisty and you had run Feisty already for some time without any problems ?
Yes, that's right.

-- And you didn't run Gutsy or dual boot with Gutsy when your problems started ?
No, I didn't.

My hard drive is two and a half years old. I wouldn't be surprised if it was dying. I'll keep using it as long as it is stable. Then I'll think about replacing it or buying a new laptop. I'm subscribing to this bug, in case you need more info. Good luck.

Marius Gedminas (mgedmin) wrote on 2007-10-13:

#46

I would suggest looking at the SMART error log and running the SMART self-tests instead of just relying on the overall self-assessment
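
A rough sketch of what that could look like with smartctl, assuming /dev/sda and a root shell. The -l error, -t long, -l selftest and -A options are standard smartctl options; the function name smart_deep_check and the SMARTCTL override variable are made up here so the calls can be previewed without touching the drive:

```shell
# Run the smartctl calls that go beyond the -H summary.
# smart_deep_check and the SMARTCTL variable are hypothetical names;
# set SMARTCTL=echo to only print the arguments instead of running smartctl.
smart_deep_check() {
    dev=${1:-/dev/sda}
    "${SMARTCTL:-smartctl}" -l error "$dev"     # error log kept by the drive itself
    "${SMARTCTL:-smartctl}" -t long "$dev"      # start the extended self-test
    "${SMARTCTL:-smartctl}" -l selftest "$dev"  # self-test log (read again once the test is done)
    "${SMARTCTL:-smartctl}" -A "$dev"           # vendor-specific attribute table
}

# Example (as root): smart_deep_check /dev/sda
```

The extended self-test runs inside the drive in the background; smartctl prints an estimated completion time when it starts the test, and the self-test log only shows the outcome after that.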

Loïc Minier (lool) wrote on 2007-10-13:

#47

Hi,

I saw similar error messages when my hard disk was /starting/ to die, and even after they had been happening for some weeks, the hard disk would report no errors in smartctl (well, some, but about as many as when the hard disk was new, and this is supposedly normal as even brand new hard disks have defective sectors).

When it finally died, the hard disk test program of my disk manufacturer (Hitachi) reported this clearly; the program didn't find any errors until I got some serious I/O errors.

Could you please try running your hard disk manufacturer's test program? For Samsung this might be:
http://www.samsung.com/global/business/hdd/support/utilities/Support_HUTIL.html

Thanks,

Dread Knight (dread.knight) wrote on 2007-10-13:

#48

I have this sort of problem too with Gutsy.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-14:

#49

to Dread Knight :

How old is your harddrive ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after you began running Gutsy ? If so, how long had you been running Gutsy before these media errors started appearing ?
Did you have the problem already in Feisty ?
Are you dual-booting with Feisty ?
What is the output of smartctl -H /dev/sda ?

Josip Lazic (josip) wrote on 2007-10-14:

#50

I have this problem too. My English is not very good; I hope you will understand.

My old Seagate ST3250820AS (XFS) started to fail, so I ran xfs_repair -L. It could not finish because of I/O errors, so I bought a new disk (Seagate-Maxtor STM3250310AS) and ran:

dd_rescue -v /dev/sde1 /dev/sdd1

After that I managed to rescue all of my data. Then I formatted the old disk as ext3 and copied all of my data to it (just so I had a backup). After just one day the same errors appeared again, but this time on the brand-new disk! I thought the problem was in XFS, so I formatted the new disk as ext3 and copied all data from the old disk to the new one with cp -a. This did not end well, because there were a lot of I/O errors. OK, I said, I will clone the old disk to the new one with dd, so I did that. But fsck.ext3 finishes without errors on the old disk, while on the new disk it reports a lot of errors. Shouldn't a cloned disk be identical?

Motherboard is Abit NF7-S (RAID bus controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02))
DDR RAM (2x512MB) is OK (I ran Memtest)

Before Gutsy I had Gentoo, and everything worked OK for a long time.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-16:

#51

to Josip Lazic :

How old is your new harddrive (Seagate-Maxtor STM3250310AS) ?
When did you start running Gutsy ?
When did your problems start ?
Did your problems start after you began running Gutsy ? If so, how long had you been running Gutsy before these media errors started appearing ?
Are you dual-booting with Feisty ?
What is the output of smartctl -H /dev/sda ?

Josip Lazic (josip) wrote on 2007-10-19:

#52

to ubuntu_demon:

After long testing I came to the conclusion that the problem is my Sil3112A SATA controller; after replacing the motherboard everything works fine.

Sorry for the false alarm.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-19:

#53

I have tested my SAMSUNG HM080JI harddrive with HUTIL 2.03 (included on Ultimate Boot CD 4.11). All tests passed (including S.M.A.R.T.) except the last test (entire surface scan), which found a couple of ECC errors.

HUTIL suggested erasing the HDD and scanning again for errors, so I'm going to do that. If that scan finds errors, I'm going to use my warranty to get a new harddrive. If it doesn't, I'm going to install Gutsy and use HUTIL regularly to scan for errors. If errors show up then, I'll use the warranty; if they never do, it might still be some strange bug.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-19:

#54

I'm going to use the HUTIL "Erase HDD" tool which does a low level format.

ubuntu_demon (ubuntu-demon) wrote on 2007-10-19:

#55

HUTIL 2.03 (included on Ultimate Boot CD 4.11) confirms that the harddisk is slowly dying

Changed in linux-source-2.6.22:
status:	New → Invalid

Rafiot (raf51) wrote on 2007-10-28:

#56

My English is very poor but I hope that I can help.

I had the same problem, but I saw these warnings too late, and after 1 week with Gutsy my disk was really destroyed... I have used Ubuntu since Edgy and had no problems. My laptop (Acer Aspire 1692) was 2 years old.

viper233 (viper233) wrote on 2007-11-05:

#57

lspci -vv (20.7 KiB, text/plain)

Same error has just appeared

Linux Machine-Name 2.6.22-14-generic #1 SMP Sun Oct 14 23:05:12 GMT 2007 i686 GNU/Linux

It went to do a monthly check on my RAID 1 device

1 Time(s): [473365.224795] md: data-check of RAID array md0
1 Time(s): [473365.224800] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
1 Time(s): [473365.224802] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
1 Time(s): [473365.224807] md: using 128k window, over a total of 488383936 blocks.
1 Time(s): [473578.174071] ata2.00: exception Emask 0x2 SAct 0xdc3f SErr 0x0 action 0x2 frozen
1 Time(s): [473578.174076] ata2.00: (spurious completions during NCQ issue=0x0 SAct=0xdc3f FIS=004040a1:00000040)
1 Time(s): [473578.174080] ata2.00: cmd 60/80:00:3f:92:e2/00:00:01:00:00/40 tag 0 cdb 0x0 data 65536 in
1 Time(s): [473578.174081] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174084] ata2.00: cmd 60/80:08:bf:40:df/00:00:01:00:00/40 tag 1 cdb 0x0 data 65536 in
1 Time(s): [473578.174085] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174088] ata2.00: cmd 60/80:10:3f:61:e2/00:00:01:00:00/40 tag 2 cdb 0x0 data 65536 in
1 Time(s): [473578.174089] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174092] ata2.00: cmd 60/80:18:bf:e4:df/00:00:01:00:00/40 tag 3 cdb 0x0 data 65536 in
1 Time(s): [473578.174093] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174097] ata2.00: cmd 60/80:20:3f:8f:e2/00:00:01:00:00/40 tag 4 cdb 0x0 data 65536 in
1 Time(s): [473578.174098] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174101] ata2.00: cmd 60/80:28:3f:90:e2/00:00:01:00:00/40 tag 5 cdb 0x0 data 65536 in
1 Time(s): [473578.174102] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174105] ata2.00: cmd 60/80:50:bf:8f:e2/00:00:01:00:00/40 tag 10 cdb 0x0 data 65536 in
1 Time(s): [473578.174106] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174109] ata2.00: cmd 60/80:58:bf:90:e2/00:00:01:00:00/40 tag 11 cdb 0x0 data 65536 in
1 Time(s): [473578.174110] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174113] ata2.00: cmd 60/80:60:3f:91:e2/00:00:01:00:00/40 tag 12 cdb 0x0 data 65536 in
1 Time(s): [473578.174114] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174117] ata2.00: cmd 60/80:70:bf:8e:e2/00:00:01:00:00/40 tag 14 cdb 0x0 data 65536 in
1 Time(s): [473578.174118] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.174122] ata2.00: cmd 60/80:78:bf:91:e2/00:00:01:00:00/40 tag 15 cdb 0x0 data 65536 in
1 Time(s): [473578.174123] res 40/00:00:3f:92:e2/00:00:01:00:00/40 Emask 0x2 (HSM violation)
1 Time(s): [473578.481880] ata2: soft resetting port
1 Time(s): [473578.653440] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
1 Time(s): [473578.745461] ata2.00: c...
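
For anyone reading the log above: SAct is a bitmask with one bit per outstanding NCQ tag, so 0xdc3f can be expanded into the list of tags that were in flight when the port froze. A minimal sketch (sact_tags is a made-up helper name, not a kernel or libata tool):

```shell
# List the NCQ tags whose bits are set in an SAct value.
# sact_tags is a hypothetical helper name.
sact_tags() {
    mask=$(( $1 ))
    tag=0
    out=
    while [ "$tag" -lt 32 ]; do
        # Bit number "tag" set means a command with that tag was outstanding.
        if [ $(( (mask >> tag) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$tag"
        fi
        tag=$((tag + 1))
    done
    echo "$out"
}

sact_tags 0xdc3f
```

The result, 0 1 2 3 4 5 10 11 12 14 15, is exactly the set of tags on the cmd lines above.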

hovang wrote:

#58

I also have these messages in the log.

My drive is approx. 1 month old. It's part of a RAID5 array of 4 disks of different brands. The disk that reports errors is the first in the array, but the second in the system (/dev/sdb). It's a Seagate Barracuda 320 GB 7200 RPM SATA/300 disk.

The system is Ubuntu 7.04 server, recently upgraded to 7.10 using apt-get dist-upgrade. I did not get the messages before the upgrade to 7.10. On the other hand, I upgraded only a few weeks ago, at the same time as installing the RAID system. There are no other operating systems on the same computer.

Here is the output of smartctl:

dan@vodka:~$ sudo smartctl -H /dev/sdb
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART Health Status: OK

I've had the messages at least twice. Both times the same drive (/dev/sdb). Here are the latest log messages.

Nov 16 19:30:06 vodka kernel: [438824.130744]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:06 vodka kernel: [438824.181477] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:06 vodka kernel: [438824.264642] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:06 vodka kernel: [438824.264651] ata3.00: configured for UDMA/133
Nov 16 19:30:06 vodka kernel: [438824.264661] ata3: EH complete
Nov 16 19:30:08 vodka kernel: [438826.193215]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:08 vodka kernel: [438826.252267] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:08 vodka kernel: [438826.327114] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:08 vodka kernel: [438826.327124] ata3.00: configured for UDMA/133
Nov 16 19:30:08 vodka kernel: [438826.327137] ata3: EH complete
Nov 16 19:30:10 vodka kernel: [438828.247367]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:10 vodka kernel: [438828.306421] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:10 vodka kernel: [438828.381263] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:10 vodka kernel: [438828.381270] ata3.00: configured for UDMA/133
Nov 16 19:30:10 vodka kernel: [438828.381280] ata3: EH complete
Nov 16 19:30:12 vodka kernel: [438830.301525]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:12 vodka kernel: [438830.343941] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:12 vodka kernel: [438830.418787] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:12 vodka kernel: [438830.418794] ata3.00: configured for UDMA/133
Nov 16 19:30:12 vodka kernel: [438830.418804] ata3: EH complete
Nov 16 19:30:14 vodka kernel: [438832.339049]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:14 vodka kernel: [438832.381464] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:14 vodka kernel: [438832.447993] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:14 vodka kernel: [438832.448000] ata3.00: configured for UDMA/133
Nov 16 19:30:14 vodka kernel: [438832.448012] ata3: EH complete
Nov 16 19:30:16 vodka kernel: [438834.368253]          res 51/40:00:6a:ba:c5/40:00:16:00:00/e4 Emask 0x9 (media error)
Nov 16 19:30:16 vodka kernel: [438834.427301] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:16 vodka kernel: [438834.510468] ata3.00: ata_hpa_resize 1: sectors = 625142448, hpa_sectors = 625142448
Nov 16 19:30:16 vodka kernel: [438834.510476] ata3.00: configured for UDMA/133
Nov 16 19:30:16 vodka kernel: [438834.510515] sd 2:0:0:0: SCSI error: return code = 0x08000002
Nov 16 19:30:16 vodka kernel: [438834.510519] sdb: Current [descriptor]: sense key: Medium Error
Nov 16 19:30:16 vodka kernel: [438834.510523]     Additional sense: Unrecovered read error - auto reallocate failed
Nov 16 19:30:16 vodka kernel: [438834.510530] Descriptor sense data with sense descriptors (in hex):
Nov 16 19:30:16 vodka kernel: [438834.510533]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
Nov 16 19:30:16 vodka kernel: [438834.510543]         04 c5 ba 6a 
Nov 16 19:30:16 vodka kernel: [438834.510548] end_request: I/O error, dev sdb, sector 80067178
Nov 16 19:30:16 vodka kernel: [438834.510560] ata3: EH complete
Nov 16 19:30:16 vodka kernel: [438834.551793] SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
Nov 16 19:30:16 vodka kernel: [438834.572691] sdb: Write Protect is off
Nov 16 19:30:16 vodka kernel: [438834.577073] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov 16 19:30:16 vodka kernel: [438834.583702] SCSI device sdb: 625142448 512-byte hdwr sectors (320073 MB)
Nov 16 19:30:16 vodka kernel: [438834.591591] sdb: Write Protect is off
Nov 16 19:30:16 vodka kernel: [438834.606848] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov 16 19:30:16 vodka kernel: [438834.725827] raid5:md0: read error corrected (8 sectors at 80067176 on sdb)
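
The last four bytes of the sense dump above (04 c5 ba 6a) are the failing LBA in hex, which can be checked against the end_request line. A minimal sketch; sense_lba is a made-up helper name:

```shell
# Turn the trailing bytes of the sense dump into a decimal sector number.
# sense_lba is a hypothetical helper; pass the bytes exactly as logged.
sense_lba() {
    # Join the bytes into one hex constant and let printf convert it.
    printf '%d\n' "0x$(printf '%s' "$*" | tr -d ' ')"
}

sense_lba 04 c5 ba 6a
```

This prints 80067178, the sector named in the end_request line, and it falls inside the 8 sectors at 80067176 that md reports as corrected.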

ubuntu_demon (ubuntu-demon) wrote on 2007-11-17:

#59

to : hovang and viper233

Your disks are probably starting to die. Start backing up your data.

You should use the tool recommended by your harddisk manufacturer to make sure. If errors are found then do a low-level format using the tool recommended by your harddisk manufacturer. Then check again for errors. If you can still find errors the second time then it's probably time to start shopping for a new harddrive.

wangrui (cnsdqdwangrui) wrote on 2008-09-18:

#60

Hi ubuntu_demon,

I have a similar problem. When I copied a file to another disk, I got an input/output error. Then I found that whenever I run the following command

sudo dd if=/dev/sdb bs=512 skip=1382533521 count=1 | hexdump -C

I get an input/output error. But I bought this hard disk only several weeks ago, I did a disk scan when I bought it, and the hard disk is physically well protected. Does this mean I have to return it? As far as I have tested, only sectors 1382533521 to 1382533527 have this problem. Can I just stop using these sectors? Or does this mean that other sectors will most likely have the same kind of problem in the near future, so I should exchange it for another disk?

Thanks

Here is the kernel message,

[33528.071838] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[33528.071846] ata6.00: BMDMA stat 0x24
[33528.071853] ata6.00: cmd 25/00:08:8c:c9:67/00:00:52:00:00/e0 tag 0 dma 4096 in
[33528.071855]          res 51/40:00:91:c9:67/40:00:52:00:00/00 Emask 0x9 (media error)
[33528.071859] ata6.00: status: { DRDY ERR }
[33528.071862] ata6.00: error: { UNC }
[33528.182483] ata6.00: configured for UDMA/133
[33528.182500] sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[33528.182505] sd 5:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
[33528.182511] Descriptor sense data with sense descriptors (in hex):
[33528.182513]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[33528.182523]         52 67 c9 91
[33528.182527] sd 5:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
[33528.182535] end_request: I/O error, dev sdb, sector 1382533521
[33528.182552] ata6: EH complete
[33528.184128] sd 5:0:0:0: [sdb] 1953525168 512-byte hardware sectors (1000205 MB)
[33528.184303] sd 5:0:0:0: [sdb] Write Protect is off
[33528.184307] sd 5:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[33528.184541] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[33528.184751] sd 5:0:0:0: [sdb] 1953525168 512-byte hardware sectors (1000205 MB)
[33528.184915] sd 5:0:0:0: [sdb] Write Protect is off
[33528.184919] sd 5:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[33528.185150] sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
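
To double-check which sectors in a range are unreadable, the dd command above can be wrapped in a loop. A read-only sketch; probe_sectors is a made-up helper name, and the range in the usage line is only an example around the sectors mentioned above:

```shell
# Read each 512-byte sector in [first, last] once and report the ones that fail.
# probe_sectors is a hypothetical helper; it only reads, never writes.
probe_sectors() {
    dev=$1
    s=$2
    last=$3
    while [ "$s" -le "$last" ]; do
        # dd exits non-zero when the read fails (e.g. input/output error).
        if ! dd if="$dev" of=/dev/null bs=512 skip="$s" count=1 2>/dev/null; then
            echo "unreadable: sector $s"
        fi
        s=$((s + 1))
    done
}

# Example (as root): probe_sectors /dev/sdb 1382533515 1382533530
```

Reads of a block device normally go through the page cache in larger chunks (the log above shows a 4096-byte DMA read), so neighbouring sectors can fail together; with GNU dd, adding iflag=direct to the dd call is one way to bypass the cache.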

