ATA exception upon starting smartd, breaking drive until next reboot

Bug #197892 reported by Tristan Schmelcher
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Won't Fix | Undecided | Unassigned |

Bug Description

Binary package hint: linux-source-2.6.24

I have an up-to-date i686 gutsy install on a Dell XPS M1710 laptop (except that I have the hardy kernel and a few other hardy packages). I have two SATA drives: one internal and one external (connected via an eSATA-to-ExpressCard adapter). Today I installed smartmontools to monitor my drives and started it with the default configuration. Upon it attempting to inspect my external drive (which was connected but not, I think, mounted), the kernel ATA driver spat out an exception message and automatically unloaded the drive (i.e., it disappeared from /dev). Re-connecting the drive and turning it on and off (which is possible since it was external) did not get it back (the attempts produced no new output from dmesg at all). However, after a reboot it was back and working normally.

I have attached my syslog from when the problem occurred. The output for it starts at "Problem creating device name scan list".
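For anyone reading a long syslog like this one, printing everything from the marker line onward is one way to get at the relevant part. A minimal sketch, assuming the log lives at /var/log/syslog (the usual Ubuntu location); the helper name `from_marker` is made up for illustration:

```shell
# Print a log file from the first line containing a marker through to the end.
# from_marker is a hypothetical helper name, not part of any package.
from_marker() {
    marker=$1
    logfile=$2
    # index() does a plain substring match, so the marker is not treated as a regex.
    awk -v m="$marker" 'index($0, m) { found = 1 } found' "$logfile"
}

# Example (path assumed):
# from_marker "Problem creating device name scan list" /var/log/syslog
```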

Also, this may not be relevant, but after re-connecting/restarting the drive failed to fix things (but before rebooting), I tried to suspend (twice) for an unrelated reason, but that failed because the ATA driver couldn't shut down the external drive. This means that the driver was at least still aware of that drive. The output for that can also be seen in my syslog.

I'll be the first to admit that this is an obscure bug and probably also upstream, and to make matters worse I have been unable to reproduce it.

Note that smartd does not work with my drives in the default configuration (-d sat is required), but what I am reporting is specifically the fact that the kernel got into a broken state.
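For readers in the same "-d sat is required" situation, the device type can be forced per drive in smartd's configuration. This is only a sketch of the configuration side and does not address the kernel problem reported here. It assumes the external drive shows up as /dev/sdb (a hypothetical node) and that the config file is /etc/smartd.conf; the helper name is illustrative:

```shell
# Emit a smartd.conf entry that forces SCSI-to-ATA translation for one drive.
# sat_conf_line is an illustrative helper; /dev/sdb below is an assumed node.
sat_conf_line() {
    # -d sat : talk to the drive through the SAT layer
    # -a     : smartd's default set of monitoring directives
    printf '%s -d sat -a\n' "$1"
}

sat_conf_line /dev/sdb    # prints: /dev/sdb -d sat -a
```

The same device type can be tried by hand first with `smartctl -d sat -i /dev/sdb`. Per the smartd.conf man page, entries following a DEVICESCAN line are ignored, so an explicit entry like this would replace that line rather than follow it.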

Tags: cft-2.6.27
Leann Ogasawara (leannogasawara) wrote :

Hi Tristan,

It appears you are running the 2.6.24-8 kernel. Care to try the latest 2.6.24-11 kernel? If the issue still exists, per the kernel team's bug policy, can you please attach the following information in addition to what you've already provided? Please be sure to attach each file as a separate attachment.

* cat /proc/version_signature > version.log
* dmesg > dmesg.log
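The two commands above can be wrapped so that both files land in one directory, ready to attach separately. A rough sketch; the function name `collect_kernel_logs` and the fallback text are assumptions, not part of the kernel team's instructions:

```shell
# Gather the two files the kernel team asks for into one directory.
# collect_kernel_logs is a hypothetical helper name.
collect_kernel_logs() {
    outdir=${1:-.}
    mkdir -p "$outdir" || return 1
    # /proc/version_signature is Ubuntu-specific; tolerate its absence elsewhere.
    if [ -r /proc/version_signature ]; then
        cat /proc/version_signature > "$outdir/version.log"
    else
        echo "version_signature not available" > "$outdir/version.log"
    fi
    # dmesg may need root on systems that restrict access to the kernel log.
    dmesg > "$outdir/dmesg.log" 2>&1 || echo "dmesg failed; try with sudo" >&2
}

# Example: collect_kernel_logs ./bug-197892
```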

For more information regarding the kernel team bug policy, please refer to https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks again and we appreciate your help and feedback.

Changed in linux:
status: New → Incomplete
Tristan Schmelcher (tschmelcher) wrote :

Thanks, I will update to 2.6.24-11. However, since I haven't seen the bug a second time on even 2.6.24-8, I wouldn't hold your breath.

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Ketil Malde (ketil-ii) wrote :

I'm just going to chime in, since it sounds similar to what I'm experiencing.

Occasionally - perhaps every two to four days, but irregularly - the ATA driver seems to lock up. The system stays up and running, but any command that has to be loaded from disk will give an error message (I/O error), dmesg reports tons of disk errors, and my temperature monitor fails to report the disk temperature.

Regarding the last item, it's a script that runs 'hddtemp' regularly. Come to think of it, these problems started to occur when I started to track disk temperature.
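The actual script is not shown in this report. As a rough illustration of the kind of polling described, the sketch below reads the temperature once and logs a failure instead of retrying. The function name, the `HDDTEMP` override variable, the device and the log path are all assumptions:

```shell
# Poll a drive's temperature once and append the result to a log.
# log_disk_temp and the HDDTEMP override are illustrative, not from the report.
HDDTEMP=${HDDTEMP:-hddtemp}

log_disk_temp() {
    dev=$1
    logfile=$2
    # -n asks hddtemp to print the bare number only.
    if temp=$("$HDDTEMP" -n "$dev" 2>/dev/null) && [ -n "$temp" ]; then
        printf '%s %s %s\n' "$(date +%s)" "$dev" "$temp" >> "$logfile"
    else
        # Record the failed read and give up, rather than hammering a wedged drive.
        printf '%s %s read-failed\n' "$(date +%s)" "$dev" >> "$logfile"
        return 1
    fi
}

# Example (device and path assumed):
# log_disk_temp /dev/sda /var/log/disktemp.log
```

Run from cron, a wrapper like this would at least leave a timestamped record of when reads started failing, which could then be matched against the ATA errors in dmesg.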

Hardy with 2.6.24-19, happened with at least the previous kernel, too. Hardware is a Dell D620.

Ketil Malde (ketil-ii) wrote :

Sorry, that should be 8.04 Gutsy, of course.

Ketil Malde (ketil-ii) wrote :

Just noting that since I disabled the hddtemp command one week ago, my system has worked flawlessly.

Leann Ogasawara (leannogasawara) wrote :

*This is an automated response*

This bug report is being closed because we received no response to the previous request for information. Please reopen this if it is still an issue in the actively developed pre-release of Jaunty Jackalope 9.04 - http://cdimage.ubuntu.com/releases/jaunty . To reopen the bug report simply change the Status of the "linux" task back to "New".

Changed in linux:
status: Incomplete → Won't Fix