hard disk corruption

Reported by Shirish Agarwal on 2008-12-21
Affects: linux (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

I get hard disk corruption randomly on 2.6.27-10 as well as 2.6.27-11, generally after doing

$ sudo shutdown -r now

ProblemType: Bug
Architecture: i386
DistroRelease: Ubuntu 8.10
LsUsb:
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 002: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Package: linux-image-2.6.27-11-generic 2.6.27-11.22
ProcCmdLine: root=UUID=dcf827c4-c7cb-4d66-9cb9-7baf82a69f3c ro
ProcEnviron:
 SHELL=/bin/bash
 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=en_IN
ProcVersionSignature: Ubuntu 2.6.27-11.22-generic
SourcePackage: linux

Shirish Agarwal (shirishag75) wrote :

The only workaround is to run fsck each time. Here's the output from the fsck I had to do this time around through the Live CD.

Shirish Agarwal (shirishag75) wrote :

Here's the output of dumpe2fs on the fs.

The fs in question is /dev/sdb7 and/or /home.

Shirish Agarwal (shirishag75) wrote :

This is /etc/fstab as it is now.

$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# /dev/sda5
UUID=dcf827c4-c7cb-4d66-9cb9-7baf82a69f3c / ext3 relatime,errors=remount-ro 0 1
# /dev/sda1
UUID=49b2cc28-fa6d-4d0d-ae16-040861151df0 /boot ext3 relatime 0 2
# /dev/sda7
UUID=20168ab6-abce-4804-840f-51a2036c0a5e /home ext3 relatime 0 2
# /dev/sda6
UUID=d69559ab-c9f5-40fd-b048-8f0a98a2a977 none swap sw 0 0
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0

none /proc/bus/usb usbfs defaults 0 0

The only change I made here is the usbfs line. Everything else should be
standard. I do not know whether the errors=remount-ro option should be
there or not.

The other thing I don't know is why it's showing /dev/sda when it should be
showing /dev/sdb, as the disk is in Primary Slave configuration.
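
On the /dev/sda versus /dev/sdb question: the `# /dev/sdaN` lines in the fstab are only comments, and the mounts themselves go by UUID, so the device name the kernel hands out on a given boot should not matter for mounting. A minimal sketch for listing which UUID belongs to which mount point follows. The function name `fstab_uuids` is made up for illustration, and the commented-out loop assumes a blkid that supports `-U`.

```shell
#!/bin/sh
# Sketch: list the UUID-based entries of an fstab-style file read from stdin.
# Prints "<mount point> <uuid>" per entry; comment lines are skipped.
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $2, $1 }'
}

# Sample input: two entries taken from the fstab in this report.
sample='# /dev/sda5
UUID=dcf827c4-c7cb-4d66-9cb9-7baf82a69f3c / ext3 relatime,errors=remount-ro 0 1
UUID=20168ab6-abce-4804-840f-51a2036c0a5e /home ext3 relatime 0 2'

printf '%s\n' "$sample" | fstab_uuids

# On the real machine (assumption: blkid supports -U), each UUID could be
# resolved to whatever device node it currently has:
#   fstab_uuids < /etc/fstab | while read -r mnt uuid; do
#       printf '%s -> %s\n' "$mnt" "$(blkid -U "$uuid")"
#   done
```

Comparing the resolved device nodes with the fdisk output would show whether the comments in the fstab are simply stale.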

Shirish Agarwal (shirishag75) wrote :

This is the output of fdisk

$ sudo fdisk -l
[sudo] password for shirish:

Disk /dev/sda: 80.0 GB, 80060424192 bytes
255 heads, 63 sectors/track, 9733 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0ded0d24

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         250     2008093+  83  Linux
/dev/sda2             251        2682    19535040   83  Linux
/dev/sda3            2683        9551    55175242+  83  Linux
/dev/sda4            9552        9733     1461915   82  Linux swap / Solaris

Disk /dev/sdb: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x48254824

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          18      144553+  83  Linux
/dev/sdb2              19       19457   156143767+   5  Extended
/dev/sdb5              19        2450    19535008+  83  Linux
/dev/sdb6            2451        2574      995998+  82  Linux swap / Solaris
/dev/sdb7            2575       19457   135612666   83  Linux

Shirish Agarwal (shirishag75) wrote :

This is the output of grep -i "jbd\|ext3" /var/log/* as an attachment

Shirish Agarwal (shirishag75) wrote :

Please let me know if anything else is needed for this.

Shirish Agarwal (shirishag75) wrote :

I tested the RAM using memtest86+. I ran two passes, which took about
1:30 hours, and it doesn't show anything. If wanted, I can let it run the
whole night.

Many times I do get this, though. Maybe the HDD is failing?

[284.072041] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[284.072119] ata2.00: cmd a0/00:00:00:00:00/00:00:00:00/a0 tag 0
[284.072122] cdb 1b 00 00 02 00 00 00 00 00 00 00 00
[284.072124] res 40/00:03:00:00:00/00:00:00:00/a0 Emask 0x4 (timeout)
[284.072300] ata2.00: status: { DRDY }

This is from the Live CD. Otherwise it's ata1:10 most of the time when
booting off the kernel, and the status has many more things in it, not
just DRDY.
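
A rough way to see how often this happens is to count the libata exception and timeout lines in the kernel log. The sketch below does that for log text on stdin. The function name `ata_summary` is made up, and the patterns are only guesses based on the lines quoted here. As a side note, `cmd a0` is the ATA PACKET command and `cdb 1b` is the SCSI START STOP UNIT opcode, so this particular timeout may well involve an ATAPI device such as the CD drive rather than the hard disk.

```shell
#!/bin/sh
# Sketch: summarise libata error lines from kernel log text on stdin.
# The patterns are assumptions derived from the excerpt in this report.
ata_summary() {
    awk '
        /ata[0-9]+(\.[0-9]+)?: *exception/ { exceptions++ }
        /\(timeout\)/                      { timeouts++ }
        END { printf "exceptions=%d timeouts=%d\n", exceptions, timeouts }
    '
}

# Sample input based on the log lines quoted in this report.
sample='[284.072041] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[284.072124] res 40/00:03:00:00:00/00:00:00:00/a0 Emask 0x4 (timeout)
[284.072300] ata2.00: status: { DRDY }'

printf '%s\n' "$sample" | ata_summary
# prints: exceptions=1 timeouts=1
```

On a live system the same function could be fed with `dmesg | ata_summary`, once from the Live CD and once from the installed kernel, to compare the two.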

Comments, suggestions all welcome.

Changed in linux:
importance: Undecided → High
status: New → Triaged
Shirish Agarwal (shirishag75) wrote :

I also ran a couple of informational commands as well as benchmark tests (to make sure that it's not the HDD which is failing). First come a few informational ones, and then the short offline tests.

The attached file is the output of smartctl -A /dev/sdb

Shirish Agarwal (shirishag75) wrote :

This one is output of smartctl -i /dev/sdb

Shirish Agarwal (shirishag75) wrote :

This is the output of smartctl -c /dev/sdb

Shirish Agarwal (shirishag75) wrote :

This is the output of smartctl -l selftest /dev/sdb

Shirish Agarwal (shirishag75) wrote :

Last but not least:

Output of smartctl -r ioctl -i /dev/sdb
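
For anyone wanting to collect the same SMART reports in one go, here is a small dry-run sketch. It only prints the smartctl invocations used in the comments above, so it can be read over before running anything as root. The function name `smart_report_cmds` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print (not run) the smartctl commands used in this report
# for a given device, one per line.
smart_report_cmds() {
    dev=$1
    for opts in "-i" "-c" "-A" "-l selftest"; do
        echo "smartctl $opts $dev"
    done
}

smart_report_cmds /dev/sdb
```

Once the list looks right, it could be piped to a root shell, for example `smart_report_cmds /dev/sdb | sudo sh > smart-report.txt`.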

Shirish Agarwal (shirishag75) wrote :

Something I should have started with

My HDD is an IDE HDD, not SATA.

Shirish Agarwal (shirishag75) wrote :

This is the output of dmidecode

Shirish Agarwal (shirishag75) wrote :

This is the output of lshal -t

Shirish Agarwal (shirishag75) wrote :

I reformatted and it's happening again, this time on 2.6.27-9-generic as well.

It says

ata 1.01 revalidation failed (error=2)

or sometimes

ata 1.01 soft revalidation failed (error=2)

or words to that effect.

Running fsck -y on /dev/sda1, /dev/sda5 and /dev/sda6, which hold /boot, / and /home respectively, reports all partitions as clean, so I don't know what else needs to be done.
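
One thing worth checking is what fsck actually returned. Its exit status is a bit mask (documented in fsck(8)), and a filesystem reported as "clean" may simply have been skipped because its clean flag was set; e2fsck's `-f` option forces a full check. A minimal sketch for decoding the status follows. The function name `fsck_status` and the short labels are made up for illustration.

```shell
#!/bin/sh
# Sketch: decode an fsck exit status, which is a bit mask (see fsck(8)).
fsck_status() {
    code=$1
    [ "$code" -eq 0 ] && { echo "no errors"; return; }
    msg=""
    [ $((code & 1)) -ne 0 ] && msg="$msg errors-corrected"
    [ $((code & 2)) -ne 0 ] && msg="$msg reboot-needed"
    [ $((code & 4)) -ne 0 ] && msg="$msg errors-left-uncorrected"
    [ $((code & 8)) -ne 0 ] && msg="$msg operational-error"
    [ $((code & 16)) -ne 0 ] && msg="$msg usage-error"
    [ $((code & 32)) -ne 0 ] && msg="$msg cancelled"
    [ $((code & 128)) -ne 0 ] && msg="$msg shared-library-error"
    echo "${msg# }"
}

fsck_status 0
fsck_status 5

# Possible use on an unmounted partition (device name as in the comment above):
#   fsck -y /dev/sda1; fsck_status $?
```

A status of 4 or higher after a forced check would point at damage that fsck could not repair, which is different from the "clean" result seen so far.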

This bug report was marked as Triaged a while ago but has not had any updated comments for quite some time. Please let us know if this issue remains in the current Ubuntu release, http://www.ubuntu.com/getubuntu/download . If the issue remains, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: Triaged → Incomplete
Andres Mujica (andres.mujica) wrote :

Does adding all_generic_ide to your kernel boot line make a difference?

https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/206635/comments/18

It seems to me that your problem is similar to (or even the same as) bug #206635 or bug #153702.
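
To confirm whether the parameter really took effect after a reboot, the running kernel's command line can be checked. The sketch below tests a command-line string for a given parameter. The helper name `has_kernel_param` is made up, and the sample string is the ProcCmdLine from the bug description. How the parameter gets added depends on the boot loader; with GRUB legacy, which is assumed here for Ubuntu 8.10, it can be appended to the kernel line from the boot menu for a single boot.

```shell
#!/bin/sh
# Sketch: test whether a kernel command line contains a given parameter,
# either as a bare flag ("ro") or as a key with a value ("root=...").
has_kernel_param() {
    # $1 = parameter name, $2 = command line text
    case " $2 " in
        *" $1 "*|*" $1="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample: the ProcCmdLine value from the bug description.
cmdline='root=UUID=dcf827c4-c7cb-4d66-9cb9-7baf82a69f3c ro'

if has_kernel_param all_generic_ide "$cmdline"; then
    echo "all_generic_ide is set"
else
    echo "all_generic_ide is not set"
fi
# prints: all_generic_ide is not set
```

On the running system the check would be `has_kernel_param all_generic_ide "$(cat /proc/cmdline)"`.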

tags: added: needs-upstream-testing
Przemysław Kulczycki (azrael) wrote :

Hi Shirish.
Is this bug still an issue for you?
Can you reproduce it on Ubuntu 10.04 beta?

Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired