Comment 40 for bug 430333

Tommy Trussell (tommy-trussell) wrote :

I have finally seen a corrupted block after several hours of activity using badblocks. Unfortunately, the corrupted block wasn't in the place I was TRYING to make one. :-(

A comment: it's blasted hard to make a bootable USB or SD card using the ASUS Eee PC alone. GRUB apparently has a bug where it writes everything to the SSD regardless of where you tell it to install, AND the removable devices sometimes show up under different names in /dev/ depending on which devices are plugged in, or whether the installer image is running, or the phase of the moon, or something. :-P

OK... back to my progress (or lack thereof):

There is a page describing how to make a bootable image on an SD card in Debian 5.0.3 "lenny," so I was able to boot and test two kernels: 2.6.26-2-686, and 2.6.30-bpo.2-686 (2.6.30 backported to Lenny via backports.org). Running badblocks I was able to see the filesystem damage from earlier Ubuntu Karmic installations. HOWEVER, once I "zeroed" out the SSD using dd, the device stayed "clean" through several rounds of badblocks tests. (I did limit myself to five-minute runs of the write tests; in my experience with Karmic installations, the problem should have shown up by then.)

I was not able to make a grub-bootable SD or USB image to test different Ubuntu kernels, but I have the Ubuntu 9.10 "Karmic" NBR live image, and I created a Kubuntu 9.10 "Karmic" netbook image, too, but of course it seemed pretty similar. I don't tend to see the problem when booting from the SD card. The Karmic kernel I used is 2.6.31-14-generic.

I also used the Ubuntu 9.10 "Karmic" Alternate installer to create a bootable partition on the SSD. I booted from it and used badblocks to exercise the other partition I created, several times and in several different ways. I NEVER saw badblocks CAUSE any problems on the test partition.

HOWEVER, after all that, the INSTALLED Karmic OS partition developed a bad block at sector 110655. That block is found by badblocks no matter what OS I boot from, though fsck -f does not see it (so it apparently isn't one of the blocks fsck looks at). It causes kernel error messages (as described in Bug 445852) and parted takes a long time to come up when I try to look at the partition table.

SO I think I can say that JUST writing random patterns using badblocks doesn't make the corruption happen, or at least not quickly. Specifically I used:

# badblocks -sn /dev/sda

After the problem block develops, it's visible using the read-only test:

# badblocks -s /dev/sda
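To re-check just the neighbourhood of a suspect block without a full scan, something like the following read-only probe might do. It is only a sketch in Python, not the method used above: the helper name `probe_blocks` is made up, and the 1024-byte block size is an assumption (it is the badblocks default, but the number reported above may be in other units).

```python
import os

def probe_blocks(path, first, last, block_size=1024):
    """Read blocks first..last (inclusive) and return the numbers that fail.

    Read-only: nothing is written to `path`. Reads go through the page
    cache, so a recently read block may be served from RAM, not the media.
    """
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for block in range(first, last + 1):
            try:
                data = os.pread(fd, block_size, block * block_size)
            except OSError:
                bad.append(block)  # the kernel reported an I/O error here
                continue
            if len(data) < block_size:
                break  # ran off the end of the device or file
    finally:
        os.close(fd)
    return bad
```

Run as root, a call such as `probe_blocks("/dev/sda", 110600, 110700)` would list any blocks in that (assumed) range that the kernel refuses to read. badblocks itself also accepts optional last-block and first-block arguments after the device name, which is the more direct way to limit a scan.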

I haven't let badblocks "churn" away on the SSD all day. I would like to come up with something that elicits the filesystem damage almost immediately, like I was seeing with the beta NBR installers. I'm starting to think some other process has to be running at the same time to trigger the corruption.

Maybe it would be enough to read some other place on the SSD at the same time badblocks is reading and writing its random patterns. Maybe another instance of badblocks running in another VT, or something more clever.
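The "read somewhere else while the write test runs" idea could be sketched roughly as below. This is a Python stand-in for illustration, not badblocks and not something that has been tried here: the function names, the 1024-byte block size, and the block ranges are all assumptions. It mimics a non-destructive write test (save, write a random pattern, read back, restore) in one thread while a second thread keeps reading a different region. The read-back goes through the page cache, so a serious test would need direct I/O or dropped caches to be sure the data really came from the device.

```python
import os
import threading

BLOCK = 1024  # assumed block size, matching the badblocks default

def write_verify_pass(path, first, last, errors):
    """Non-destructive write test over blocks first..last (inclusive)."""
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        for block in range(first, last + 1):
            offset = block * BLOCK
            saved = os.pread(fd, BLOCK, offset)
            if len(saved) < BLOCK:
                break  # end of device or file
            pattern = os.urandom(BLOCK)
            os.pwrite(fd, pattern, offset)
            if os.pread(fd, BLOCK, offset) != pattern:
                errors.append(block)  # read-back did not match the pattern
            os.pwrite(fd, saved, offset)  # restore the original contents
    finally:
        os.close(fd)

def read_elsewhere(path, first, last, stop):
    """Keep reading blocks first..last until `stop` is set."""
    fd = os.open(path, os.O_RDONLY)
    try:
        while not stop.is_set():
            for block in range(first, last + 1):
                os.pread(fd, BLOCK, block * BLOCK)
    finally:
        os.close(fd)

def stress(path, write_range, read_range):
    """Run the write test while a second thread reads another region."""
    errors, stop = [], threading.Event()
    reader = threading.Thread(target=read_elsewhere,
                              args=(path, *read_range, stop))
    reader.start()
    try:
        write_verify_pass(path, *write_range, errors)
    finally:
        stop.set()
        reader.join()
    return errors
```

A call like `stress("/dev/sda", (0, 99999), (200000, 299999))` as root would return the list of blocks whose read-back did not match. Try it on a scratch file or a spare partition first, since an interrupted pass can leave a random pattern in place of the original data.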