smartmontools should be installed on a default system

Bug #391576 reported by Jarl on 2009-06-24
This bug affects 5 people
Affects                 Status         Importance   Assigned to   Milestone
openSUSE                Fix Released   Wishlist
ubuntu-meta (Ubuntu)    Confirmed      Wishlist     Unassigned
Nominated for Lucid by Jarl

Bug Description

This is a wish:

On a competing proprietary OS, a default installation uses the S.M.A.R.T. capabilities of disks to ensure stable disk usage. I think that to fix bug 1, smartmontools needs to be installed and activated on default installations (both desktop/laptop and server variants).

Before I switched to Ubuntu, I succeeded in convincing the SuSE people to have that package installed and activated on a default system; see https://bugzilla.novell.com:443/show_bug.cgi?id=201715. See also bug 103681 regarding default activation.

Stanislav, what do you think about this?

Yes, it would be nice. Running it by default (and configuring it to run regular long surface checks by default) could prevent data loss on discs that are still mostly healthy. We "lost" several discs this way in the last few weeks: the disc has enough spare sectors and S.M.A.R.T. could remap the failing sectors, but it is not able to, because they are discovered too late and the weak data can no longer be read before remapping.

The current SuSE implementation does not turn it on by default and shows a YaST install notification message:

---
To prevent system hangs from buggy devices, smartd is turned off by default.
Please test smartd manually first and then turn it on via
the Runlevel Editor or by /sbin/chkconfig --add smartd.
---

Most people don't see this message.

I have personally never seen such a device, but the smartmontools developers say that such devices exist. We also need a way to detect that the current hardware is not broken, e.g. write /foo/broken before running smartd, delete it after a successful boot, and don't run smartd if /foo/broken is present (and not old).
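The marker-file idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the one-week staleness limit are hypothetical, and the comment names only a placeholder path, so the path is left as a parameter.

```python
import os
import time

# Hypothetical value: the comment does not say how old a marker must be
# before it is treated as stale.
MAX_MARKER_AGE_SECONDS = 7 * 24 * 3600


def may_start_smartd(marker_path, now=None, max_age=MAX_MARKER_AGE_SECONDS):
    """Return True unless a recent marker shows that the last attempt hung."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(marker_path)
    except FileNotFoundError:
        return True  # no marker: the previous boot completed normally
    return age > max_age  # an old marker is ignored, a fresh one blocks smartd


def mark_attempt(marker_path):
    """Create the marker just before smartd is started."""
    with open(marker_path, "w") as handle:
        handle.write("smartd start attempted\n")


def mark_success(marker_path):
    """Remove the marker once the boot has completed successfully."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass
```

An init script would call may_start_smartd() first, then mark_attempt(), start smartd, and call mark_success() at the end of a successful boot. If the machine hangs in between, the marker survives and smartd is skipped on the next boot.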

For desktops, we have to use a desktop-neutral user notification (a permanent or regularly displayed message, e.g. "Your disc is going to crash soon!"); otherwise fatal messages will be lost in the syslog.

Any positive experiences with SATA disks?

Yes. It has worked with SATA for a long time; see README.SATA.
The next version (not yet in 10.1) will be able to auto-detect SATA without any arguments.

Reply from Bruce Allen:

> We are thinking about turning smartd on by default for SuSE Linux 10.2.
> In one of older discussions you wrote that there are broken devices,
> which hang on any attempt to access S.M.A.R.T. Do you have more info
> about it? How common such devices are. Is there any list of them? It
> seems that all problems in WARNINGS are kernel-related, not hardware
> related.

I think that the issues were primarily with USB devices. I think that
they have been fixed some time ago in the kernel.

---

Note that the current version still does not correctly auto-detect SATA, so the installation process needs some hacking to be able to pre-configure it.

Current summary:

- Enabling smartd by default will probably not cause any hangs or crashes with the current kernel.

- Enabling smartd with the default configuration will not work with SATA discs. Either we have to write some installation magic or improve smartd to properly detect SATA over libata.

We should get it working with SATA disks. Can we write some installation magic - perhaps in YaST?

Yes, we can write installation magic. But it might be better to hack the smartd code to correctly detect SATA during DEVICESCAN. This is planned by the author anyway.

Bruce Allen wrote:

And you are right that DEVICESCAN is broken. If you use

DEVICESCAN

then libata/SATA devices are not registered/found. And if you use

DEVICESCAN -d ata

then SCSI devices (including libata/SATA) are not registered/found.

The only short term solution I can think of is to use DEVICESCAN -d sat
which should work. But this requires using a smartmontools CVS snapshot:
it won't work with the 5.36 release. And it will fail to find normal SCSI
devices.

(Note: In the longer term, this will get fixed as part of Christian's new
C++ restructuring.)
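As an illustration of the "installation magic" alternative to DEVICESCAN, an installer could write one explicit line per device, in the `/dev/sda -a -d sat` form that smartd itself suggests in its log output. This is a minimal sketch, not what YaST does: the function name is hypothetical, and how the installer would learn each device's type is assumed to be solved elsewhere.

```python
def smartd_conf_lines(devices):
    """Build explicit smartd.conf entries, one per device.

    `devices` maps a device path to the value passed to smartd's -d
    option (for example "ata", "sat" or "scsi"). The mapping itself is
    an assumption: detecting the right type is not shown here.
    """
    lines = []
    for path, device_type in sorted(devices.items()):
        # -a enables the default set of monitoring checks for the device
        lines.append(f"{path} -a -d {device_type}")
    return lines


if __name__ == "__main__":
    for line in smartd_conf_lines({"/dev/sda": "sat", "/dev/hda": "ata"}):
        print(line)
```

Listing devices explicitly sidesteps the problem described above, where a single DEVICESCAN line with one -d type misses every device of another type.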

Today's CVS snapshot finally supports DEVICESCAN on SATA devices with the default config file. I am going to update and turn smartd on.

Is it OK for you?

Done. I implemented the following logic:

If SuSE version < 10.1, don't start it by default.
If it is an update from a "don't start by default" version, turn it on.
If it is an update from a "start by default" version, respect previous settings.
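The three rules can be read as a small decision function. The sketch below is an interpretation, not the actual package script: the function and parameter names are hypothetical, and the fresh-install case (no previous package) is an assumption, since the comment does not spell it out.

```python
from typing import Optional, Tuple


def smartd_enabled_after_install(
    suse_version: Tuple[int, int],
    previous_started_by_default: Optional[bool],
    previously_enabled: Optional[bool],
) -> bool:
    """Decide whether smartd should be enabled after install or update.

    previous_started_by_default is None on a fresh install, otherwise it
    says whether the package being replaced started smartd by default.
    """
    if suse_version < (10, 1):
        return False  # rule 1: older releases keep smartd off
    if previous_started_by_default is None:
        return True  # assumption: a fresh install turns smartd on
    if not previous_started_by_default:
        return True  # rule 2: update from an "off by default" package
    return bool(previously_enabled)  # rule 3: respect the admin's setting
```

For example, smartd_enabled_after_install((10, 2), True, False) returns False, because an administrator who switched smartd off under a "start by default" package keeps that choice.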

The logic does not work properly and it's not turned on by default in 10.2. It's too late to fix it now. Retargeting for 10.3.

(In reply to comment #13 from Stanislav Brabec)
> Logic does not work properly and it's not turned on by default in 10.2. It's
> too late to fix it now. Retargeting for 10.3.

OK, so what is the status of this one in 10.3, is it turned on by default in 10.3?

Fixed for 10.3. I hope that it will now start for real. Please test in the next snapshot (version 5.37) and reopen, if it is still not started by default.

(In reply to comment #15 from Stanislav Brabec)
> Fixed for 10.3. I hope that it will now start for real. Please test in the next
> snapshot (version 5.37) and reopen, if it is still not started by default.
>

I will try to find time for this soon. BTW: What software does the version number 5.37 belong to?

smartmontools-5.37. I fixed this problem together with the version update.

Also available in the OBS: http://download.opensuse.org/repositories/home:/sbrabec/

I have now installed the official OpenSuSE 10.3 and made a default installation.

However, with the default configuration file, the log reveals the following issues on my system:
Oct 4 23:05:00 hermes smartd[10533]: Problem creating device name scan list
Oct 4 23:05:00 hermes smartd[10533]: Device: /dev/sda, opened
Oct 4 23:05:00 hermes smartd[10533]: Device /dev/sda: ATA disk detected behind SAT layer
Oct 4 23:05:00 hermes smartd[10533]: Try adding '-d sat' to the device line in the smartd.conf file.
Oct 4 23:05:00 hermes smartd[10533]: For example: '/dev/sda -a -d sat'
Oct 4 23:05:00 hermes smartd[10533]: Device: /dev/sda, opened
Oct 4 23:05:00 hermes smartd[10533]: Device: /dev/sda, not found in smartd data base.
Oct 4 23:05:01 hermes smartd[10533]: Device: /dev/sda, is SMART capable. Adding to "monitor" list.
Oct 4 23:05:01 hermes smartd[10533]: Monitoring 1 ATA and 0 SCSI devices
Oct 4 23:05:01 hermes smartd[10598]: smartd has fork()ed into background mode. New PID=10598.

I don't know if that should be considered a problem at all; it seems that smartd is monitoring the disk anyway. At least it reports Usage Attribute changes from time to time.
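One way to check from the log whether smartd really ended up monitoring something is to look for its "Monitoring ..." summary line. The sketch below assumes the message wording seen in the log above (smartmontools 5.37); other versions may word it differently, and the function name is hypothetical.

```python
import re

# Wording taken from the log excerpt above; an assumption for other versions.
MONITORING_RE = re.compile(r"Monitoring (\d+) ATA and (\d+) SCSI devices")


def monitored_device_counts(log_lines):
    """Return (ata, scsi) counts from the first summary line, or None."""
    for line in log_lines:
        match = MONITORING_RE.search(line)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None  # smartd never reported a monitoring summary
```

A result of None, or a pair of zeros, would indicate that the default configuration did not pick up any disk.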

The disk, by the way, is not a SATA disk; it is a WD2500BEVE with an EIDE, ATA-100 interface.

Jarl

It seems like the information mentioned in comment #18 is not critical. I consider this bug as fixed, hence verified.

summary: - smartmontools should be installed on a default system
+ [Wishlist] smartmontools should be installed on a default system
summary: - [Wishlist] smartmontools should be installed on a default system
+ smartmontools should be installed on a default system
affects: ubuntu → ubuntu-meta (Ubuntu)
Changed in ubuntu-meta (Ubuntu):
importance: Undecided → Wishlist
Changed in opensuse:
status: Unknown → Fix Released
Changed in opensuse:
importance: Unknown → Wishlist
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu-meta (Ubuntu):
status: New → Confirmed