Samsung NC10 fails suspend/Resume tests

Bug #340014 reported by Manoj Iyer on 2009-03-09
This bug affects 16 people
Affects Status Importance Assigned to Milestone
Linux
Won't Fix
Medium
acpi-support (Ubuntu)
Undecided
Unassigned
Nominated for Lucid by Gerry C.
Jaunty
Undecided
Unassigned
Karmic
Undecided
Unassigned
linux (Ubuntu)
High
Manoj Iyer
Nominated for Lucid by Gerry C.
Jaunty
Medium
Manoj Iyer
Karmic
High
Manoj Iyer
pm-utils (Ubuntu)
High
Steve Langasek
Nominated for Lucid by Gerry C.
Jaunty
Undecided
Unassigned
Karmic
High
Steve Langasek

Bug Description

Suspend/resume tests on the Samsung NC10 fail. The tests that involve attaching and removing the power cord cause the failure. dmesg reports "ata1.0: link is slow to respond, please be patient" and "ata1: device is not ready", and applications that are run after the failed suspend/resume spit out I/O errors. I have attached photos of the screen and the log files.

Manoj Iyer (manjo) wrote :

Log files.

Changed in linux (Ubuntu Jaunty):
importance: Undecided → Medium
status: New → Triaged
tags: added: jaunty resume suspend
Matt Zimmerman (mdz) wrote :

Assigning to Manoj per pgraner

Changed in linux (Ubuntu Jaunty):
assignee: nobody → manjo
Teemu Kiviniemi (teemuki) wrote :

This seems to be the same problem as described in bug #344183.

Manoj Iyer (manjo) wrote :

attaching dmesg output & debugging problem, for reference.

Manoj Iyer (manjo) wrote :

I am seeing these errors on resume when there is a transition of power state from "AC unplugged" to "AC plugged in" or vice versa. ata_check_ready() reads the status register, which contains the value 0xd0 when the ATA reset error occurs. 0xd0 is not a valid value, so SRST fails and the error handling routine eventually gives up trying to reset the device. There are some upstream patches to libata, but those don't fix this problem.
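For readers who want to see what a status byte such as 0xd0 contains, here is a minimal sketch that names the bits that are set. The function name decode_ata_status is made up for illustration; the bit masks follow the usual ATA status register layout (BSY 0x80, DRDY 0x40, DF 0x20, DSC 0x10, DRQ 0x08, ERR 0x01). It only decodes a number and does not read anything from the drive.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the ATA status bits set in $1.
# Accepts decimal or 0x-prefixed hex, e.g. "decode_ata_status 0xd0".
decode_ata_status() {
    status=$(( $1 ))
    out=""
    for pair in 0x80:BSY 0x40:DRDY 0x20:DF 0x10:DSC 0x08:DRQ 0x01:ERR; do
        mask=${pair%%:*}    # part before the colon: the bit mask
        name=${pair#*:}     # part after the colon: the bit name
        if [ $(( status & $mask )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # Strip the leading space left by the first append.
    echo "${out# }"
}
```

Called as "decode_ata_status 0xd0", this shows BSY set together with DRDY and DSC, so the device is still reporting busy at the point where the reset code checks it.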

Similar problems have been reported on other laptops, and the suggestion that frequently worked is to change the BIOS setting for SATA from extended mode to compatible mode. But the current BIOS version does not provide any options to change SATA settings.

The BIOS does have some hidden settings, one of which enables setting the SATA mode. The problem is that the default (at least in BIOS version 04CA) already is "Compatible". To show the undocumented settings, enter the CMOS setup (by hitting F2 during boot). Then, press Fn+F11 followed by Fn+F12. Use a cursor key to force a screen redraw. An "Intel" menu entry will appear. The SATA settings are located at "Intel" -> "ICH Control Sub-Menu" -> "Integrated Device Control Sub-Menu" -> "SATA - Device 31, Function 2".

Manoj Iyer (manjo) wrote :

Similar discussion on lkml last year, but the patch mentioned in this thread is already present in this kernel. Similar related problem ?

http://kerneltrap.org/mailarchive/linux-kernel/2008/5/7/1754774

I think that the fact that this occurs during the suspend/resume tests is a mere coincidence. I am also seeing these ATA timeouts during regular system boots (with a similar probability).

Ok, after some further testing I set the SATA mode on my NC10 to "Enhanced" using the hidden BIOS menu described above. The error seems to be gone, i.e., I have neither been able to reproduce it during suspend/resume or boot nor using the instructions from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/344183/comments/11.

Sounds like a BIOS bug that probably needs to be worked around...

Manoj Iyer (manjo) wrote :

I had email communication with Tejun Heo, and here is the chain of thoughts:

>> I am looking at a launchpad bug
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/340014
>>
>> What I am doing is:
>>
>> 1. suspend/resume with AC plugged in
>> 2. suspend/resume with AC unplugged
>> 3. suspend/resume with AC plugged in
>> 4. suspend plugged in, remove AC power, resume
>> 5. suspend with AC removed, plug in AC power, resume
>>
>> Usually after 3 (or 4) I get:
>>
>> ata1: link is slow to respond, please be patient (ready=0)
>> ata1: SRST failed (errno=-16)
>> ata1: soft resetting link
>>
>> ata1: reset failed, giving up
>> ata1.00: disabled
>> ata1: EH complete
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sda, sector 152855535
>> Aborting journal on device sda1.
>>
>> After some long hours of debugging I find that when the above happens
>> the ata taskfile status register ap->ioaddr.status_addr is set to
>> 0xd0. return...

yurukov (yurukov) wrote :

I'm getting the same errors with Easy Peasy 1.0 which is based on 8.10, so that bug is nothing new. Sometimes the wireless works and sometimes it doesn't after resume. I need to restart the system to get it back on. Very annoying!

Nico Isenbeck (nico-isenbeck) wrote :

The mentioned BIOS settings do not exist any longer in BIOS version 07CA

@Nico: It won't help anyway. That it initially did for me was most likely a coincidence.

How do we proceed? Has this issue been reported upstream already? If not, should I do so?

Nico Isenbeck (nico-isenbeck) wrote :

Please do so, this issue is really annoying. Would be great to get it solved...

Done: http://bugzilla.kernel.org/show_bug.cgi?id=13416

Unfortunately, I can't seem to figure out how to add a bug watch for this URL. Maybe somebody who knows how to do this could take care of it. Thanks!

Added the bug watch :)

Changed in linux:
status: Unknown → In Progress
Luke Whitmore (lwhitmore) wrote :

Hi

I'm still getting this error on a regular basis, especially when I plug or unplug the power (and the power states change). It generally leads to a system freeze and I lose work.

I'm currently running Jaunty with 2.6.30-020630-generic.

Is there anything I could do to prevent this error from occurring while a fix is being developed?

Lex Berger (lexberger) wrote :

I have not experienced the issue since I use the NC10-optimized kernel provided here:

http://www.voria.org/forum/showthread.php?tid=41

Fortunato Ventre (voria) wrote :

The kernel on my NC10 repository is not optimized in any way, it's just a backport to jaunty of the latest kernel available for karmic.

Kev (ukev) wrote :

Hi, just found this bug report after having these issues for half a year. It's a really annoying bug.

(nc10, jaunty 2.6.28-15-generic #49)

Is there any progress on this? Last comment is one month ago.

If you are interested in the current progress, feel free to follow the Kernel bug report, which is referenced from this one.

After a lengthy analysis via the referenced Linux kernel bug report (thanks, Tejun!) it turned out that the HDD most commonly built into the NC10, a SAMSUNG HM160HI, is rather picky when handling the APM feature of the ATA SET FEATURES command (typically invoked by hdparm -B). It most likely gets stuck if it receives the command while some load is placed on the system, or if it receives the same command multiple times in a row.

There are two candidates, which do this after resuming and/or changing the battery state: laptop-mode and acpi-support.

While laptop-mode offers a switch (CONTROL_HD_POWERMGMT) in /etc/laptop-mode/laptop-mode.conf to turn off this behavior, acpi-support offers no way of turning off this behavior (apart from turning on CONTROL_HD_POWERMGMT for laptop-mode). If you'd like to debug which commands are sent to the HDD, there is a debugging patch available, which allows printing out all non-data related commands sent to the HDD: http://bugzilla.kernel.org/show_bug.cgi?id=13416#c51

Consequently, I am switching over this report to the acpi-support package.

As a quick workaround, get rid of all 90-hdparm.sh files below /etc/acpi/ and make sure that CONTROL_HD_POWERMGMT is set to 0 in /etc/laptop-mode/laptop-mode.conf.
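The workaround above could be scripted roughly as follows. This is only a sketch: the function name apply_hdparm_workaround and its optional root-prefix argument are made up here so the steps can be rehearsed on a copy of /etc first, and it assumes GNU find and sed as shipped with Ubuntu. It deletes files, so keep a backup.

```shell
#!/bin/sh
# Hypothetical helper for the workaround described above.
# $1 is an optional root prefix (e.g. a scratch copy of the filesystem);
# leave it empty and run as root to act on the live system.
apply_hdparm_workaround() {
    root=${1:-}
    # Remove every 90-hdparm.sh below /etc/acpi, printing each path first.
    find "$root/etc/acpi" -type f -name '90-hdparm.sh' -print -exec rm -f {} \;
    # Set CONTROL_HD_POWERMGMT=0, whatever value it had before.
    sed -i 's/^CONTROL_HD_POWERMGMT=.*/CONTROL_HD_POWERMGMT=0/' \
        "$root/etc/laptop-mode/laptop-mode.conf"
}
```

For a dry run, copy /etc into a temporary directory and pass that directory as the argument; on the real system, call it without arguments from a root shell.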

Changed in linux:
status: In Progress → Invalid
Matt Zimmerman (mdz) on 2009-10-05
Changed in linux (Ubuntu Karmic):
importance: Medium → High
milestone: none → ubuntu-9.10
Manoj Iyer (manjo) wrote :

dmidecode & lspci information.

Manoj Iyer (manjo) wrote :
Steve Langasek (vorlon) wrote :

acpi-support has been fixed in karmic to not call hdparm -B further; the remaining calls to hdparm (on power event and resume) are in pm-utils.

affects: acpi-support (Ubuntu Karmic) → pm-utils (Ubuntu Karmic)
Manoj Iyer (manjo) wrote :

Steve, thanks a ton. Is there a PPA that I can install, or is it in the daily build already? I can test this if you point me in the right direction.

Steve Langasek (vorlon) wrote :

The changes to acpi-support are already in the archive, but are not actually related to the resume hangs at all - I was merely saying that pm-utils is the package that's calling hdparm on resume, so that's what needs to be fixed.

As I mentioned on IRC, there is a fix to a /related/ bug in the archive now for pm-utils: we were unnecessarily calling hdparm -B twice on resume, we've now cut this down to one call. So this should improve resume times in general for folks, and *may* be enough to let the NC10 avoid hanging; but if not, we can cook up a quirk mechanism based on the above DMI information.

Changed in pm-utils (Ubuntu Karmic):
assignee: nobody → Steve Langasek (vorlon)
importance: Undecided → High
milestone: none → ubuntu-9.10
status: New → Triaged
Manoj Iyer (manjo) wrote :

Latest update of acpi-support package from the archive + "rm /etc/pm/sleep.d/99laptop-mode"

With the acpi workaround I was able to suspend and resume, removing the power cord before/after suspend/resume, over 30 times with no errors. I believe this is an acceptable threshold. Marking this bug as fixed for Karmic.

Changed in linux (Ubuntu Karmic):
status: Triaged → Fix Released
Changed in pm-utils (Ubuntu Karmic):
status: Triaged → Fix Released
Manoj Iyer (manjo) wrote :

changes are actually in pm-utils, linux part is invalid, so changing this to invalid from fix-released.

Changed in linux (Ubuntu Karmic):
status: Fix Released → Invalid
Steve Langasek (vorlon) on 2009-10-07
Changed in linux (Ubuntu Jaunty):
status: Triaged → Invalid
Steve Langasek (vorlon) wrote :

Note that the relevant fixes were to the pm-utils and laptop-mode-tools packages, not acpi-support; acpi-support had a gratuitous call to hdparm but only in a path that was unrelated to this bug.

Fixing pm-utils and laptop-mode-tools cut the number of back-to-back 'hdparm -B' calls on resume from 3 to 1. The power transition event from unplugging the power while suspended would add an extra 1 call (via gnome-power-manager+devkit-power), so this is a max 2 'hdparm -B' calls vs. 4 before.

Since the problem was only being reported in the case where hdparm -B was being called 4 times back-to-back, we are probably now safely below the threshold where we'll trigger this; but a quirk blacklist *can* be implemented, if this problem recurs.

Unfortunately, the issue still pops up after upgrading to Karmic, so the fix did not solve the problem. Could we try the blacklisting approach?

Changed in pm-utils (Ubuntu Karmic):
status: Fix Released → New
Steve Langasek (vorlon) wrote :

Thilo-Alexander,

Do you have laptop-mode enabled on this system, or anything else that could generate additional calls to hdparm? In Manoj's tests, suspend/resume worked reliably after the changes that have been made.

Nico Isenbeck (nico-isenbeck) wrote :

Hi

For me the bug seems to be fixed after upgrading to karmic. I tried to suspend/resume about 15 times without any errors.

The above mentioned file /etc/pm/sleep.d/99laptop-mode doesn't exist on my system.
Relevant settings in /etc/laptop-mode/laptop-mode.conf:
CONTROL_HD_POWERMGMT=1
ENABLE_LAPTOP_MODE_ON_BATTERY=1

I did those tests on battery... Steve, would you recommend to disable laptop-mode completely to avoid problems?

Changed in pm-utils (Ubuntu Karmic):
status: New → In Progress

I had CONTROL_HD_POWERMGMT set to 0, so my assumption was that laptop-mode would not issue any APM calls to my HDD. I will patch my current kernel again so that APM calls are traced again, so I can better quantify the calls that are made.

(BTW, sorry for the duplicate status change, my Internet connection was kind of unstable when I did the changes.)

Steve Langasek (vorlon) wrote :

In addition to CONTROL_HD_POWERMGMT, there are two other hdparm settings that may be set: CONTROL_HD_IDLE_TIMEOUT and CONTROL_HD_WRITECACHE, and of these, CONTROL_HD_IDLE_TIMEOUT is also set by default. I don't think this problem is limited only to APM calls, but that any hdparm calls contribute to triggering it.

You shouldn't need to patch your kernel to get a trace of apm calls, though; a simple wrapper script should do the job:

$ sudo dpkg-divert --add --local --rename /sbin/hdparm
$ cat | sudo tee /sbin/hdparm
#!/bin/sh

echo "called with args: $@" >> /var/log/hdparm.log
exec /sbin/hdparm.distrib "$@"
^D
$ sudo chmod a+x /sbin/hdparm
$

This should give you a log in /var/log/hdparm.log showing every invocation of hdparm on the system.
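To put a number on the APM calls, the log written by the wrapper can be counted with a one-line grep. A small sketch, assuming the log lines have exactly the "called with args: ..." form produced by the wrapper above; the function name count_apm_calls is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count logged hdparm invocations that used -B.
# $1 is the log file; defaults to the path used by the wrapper script.
count_apm_calls() {
    # Matches "-B" as a separate argument anywhere in the logged command line.
    grep -c -E '^called with args: (.* )?-B( |$)' "${1:-/var/log/hdparm.log}"
}
```

Running it before and after a suspend/resume cycle and comparing the two numbers shows how many hdparm -B calls that cycle generated.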

Steve Langasek (vorlon) wrote :

Nico,

I don't think you need to disable laptop-mode completely, but I think you probably want to set CONTROL_HD_POWERMGMT=0 to let pm-utils take care of this.

Unfortunately, even single hdparm invocations seem to trigger the issue:

called with args: -B 254 /dev/sda
init-+-NetworkManager-+-dhclient
[...]
     |-hald---hald-runner-+-hal-system-powe---hal-system-powe---pm-powersave---95hdparm-apm---hdparm---pstree

[ 1625.989172] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 1625.989213] ata1.00: cmd c8/00:40:67:f4:58/00:00:00:00:00/e2 tag 0 dma 32768 in
[ 1625.989219] res 40/00:fe:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 1625.989234] ata1.00: status: { DRDY }
[ 1631.028084] ata1: link is slow to respond, please be patient (ready=0)
[ 1636.012086] ata1: device not ready (errno=-16), forcing hardreset
[ 1636.012109] ata1: soft resetting link
[ 1636.194513] ata1.00: configured for UDMA/133
[ 1636.194536] ata1.00: device reported invalid CHS sector 0
[ 1636.194569] ata1: EH complete

The above error was triggered by re-connecting the NC10's power adapter.
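When repeating such tests many times, it can help to count the ATA error lines rather than read dmesg by eye. The sketch below greps for message fragments quoted in this report; the function name ata_reset_failures is made up, and the pattern list is only what appears in this thread, not a complete list of libata errors.

```shell
#!/bin/sh
# Hypothetical helper: count kernel log lines that contain the ATA reset
# messages quoted in this bug report. Reads the given file, or stdin if
# no file is named.
ata_reset_failures() {
    grep -c -E 'link is slow to respond|SRST failed|reset failed, giving up|device not ready' "$@"
}
```

Typical use would be "dmesg | ata_reset_failures" after each power adapter change or resume; a count that stays at 0 means none of these messages were logged.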

BTW, the BIOS is already the most recent version (11CA; as somebody suggested a BIOS update via mail).

Steve Langasek (vorlon) wrote :

Just to confirm, could you please add the option 'nohdparm' to your kernel boot line and see whether this problem is still reproducible then?

I can confirm that I see no more hdparm invocations after specifying "nohdparm". Proving that the bug now no longer appears under any circumstances is kind of tricky for a sporadic problem, but I tried dozens of power adapter state changes and a couple of suspend/resume cycles and have not seen the issue surfacing again.

even with nohdparm in grub the bug STILL remains.
Linux 2.6.31-9-rt #152-Ubuntu SMP PREEMPT RT Thu Oct 15 05:01:14 UTC 2009 i686 GNU/Linux.
ASUS A8J with ST9100824A Ultra ATA/100 100GB (seagate).

On Tue, Dec 29, 2009 at 09:15:52PM -0000, giorgio_fornara wrote:
> even with nohdparm in grub the bug STILL remains.

Then you must have a different bug.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

papukaija (papukaija) on 2010-01-02
tags: added: ubuntu-unr
papukaija (papukaija) on 2010-01-09
tags: added: karmic
Changed in pm-utils (Ubuntu Jaunty):
assignee: nobody → Steve Langasek (vorlon)
Steve Langasek (vorlon) wrote :

don't assign bugs to other people.

Changed in pm-utils (Ubuntu Jaunty):
assignee: Steve Langasek (vorlon) → nobody

Has this been fixed? I'm running a relatively fresh install of 10.04 with the latest kernel on my NC10 (with most recent BIOS) and am still having trouble suspending and resuming.

The only setup that works for me is when I suspend from the system menu with my AC cord plugged in. When I select this option, the screen saver comes on for one or two seconds and then the computer goes to sleep. When it wakes back up, the screen saver is on for a moment and then I get the "screen unlock" screen where I have to enter my password.

I've also tried without the AC cord, by pressing the "Sleep key" (Fn + F1) and by closing the lid, none of which work. When I awaken the computer (using the power button) I get only a blank screen and cannot use any controls. I'm forced to shutdown by holding down the power key. Interestingly, suspending using these settings (no AC cord, or by pressing the sleep key, or by closing the lid) don't turn on the screen saver for a split second either. It seems that when the screen saver turns on during the suspend process, then I'll be able to wake back up.

Does anyone have any ideas where I should go from here? I'm getting the idea that everyone else has worked out a solution to this problem.

Steve Langasek (vorlon) wrote :

On Tue, May 25, 2010 at 07:28:25PM -0000, Alexander Johnson wrote:
> Does anyone have any ideas where I should go from here? I'm getting the
> idea that everyone else has worked out a solution to this problem.

I recently learned that upower is calling pm-powersave, *in addition* to
acpi-support calling it. This would appear to explain the reappearance of
the problem, after it was thought to be fixed in karmic before release.

Could you try commenting out the 'pm-powersave' line in /etc/acpi/power.sh,
and see if this fixes your suspend/resume problems?

If it doesn't, you're probably experiencing a different bug.
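The edit suggested above can be sketched as a small shell function. It assumes the call sits on a line of its own that begins with pm-powersave (possibly indented); the exact contents of /etc/acpi/power.sh are not shown in this report, so check the result by hand. The function name disable_pm_powersave is made up, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helper: comment out the pm-powersave call.
# $1 is the script to edit; defaults to the path named above.
disable_pm_powersave() {
    file=${1:-/etc/acpi/power.sh}
    # Keep a copy so the change is easy to undo.
    cp -p "$file" "$file.bak"
    # Prefix every line that starts with pm-powersave with '#'.
    sed -i '/^[[:space:]]*pm-powersave/ s/^/#/' "$file"
}
```

It has to run as root on the real file. To undo the change, copy the .bak file back over the original.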

Changed in pm-utils (Ubuntu Jaunty):
status: New → Won't Fix
Changed in pm-utils (Ubuntu Karmic):
status: In Progress → Fix Released
Steve Langasek (vorlon) on 2010-05-25
Changed in acpi-support (Ubuntu Karmic):
status: New → Invalid
Changed in acpi-support (Ubuntu Jaunty):
status: New → Invalid

I've tried to edit this file, but I cannot. It's read-only. I changed to root via terminal (sudo bash) but still can't edit the file. Once I can, I comment out with #, correct?

Steve Langasek (vorlon) wrote :

Well, the acpi-support issue is already tracked as bug #582471, so closing the acpi-support tasks here that I just opened.

Thilo-Alexander, can you test whether this problem is still present for you in Lucid when booting without nohdparm, and if so, can you try disabling /etc/acpi/power.sh as described to see if that resolves the problem for you?

Steve Langasek (vorlon) wrote :

On Tue, May 25, 2010 at 10:15:04PM -0000, Alexander Johnson wrote:
> I've tried to edit this file, but I cannot. It's read-only. I changed to
> root via terminal (sudo bash) but still can't edit the file.

With sudo bash, you should be able to edit it.

> Once I can, I comment out with #, correct?

Yes.

OK, thanks for your help, Steve. I've commented it out, rebooted, and am still experiencing the same problems.

Could you describe how to "boot without nohdparm" and how to disable /etc/acpi/power.sh?

Steve Langasek (vorlon) wrote :

On Tue, May 25, 2010 at 10:36:42PM -0000, Alexander Johnson wrote:
> OK, thanks for your help, Steve. I've commented it out, rebooted, and am
> still experiencing the same problems.

> Could you describe how to "boot without nohdparm"

If you're using Grub2 (which is the default when installing Ubuntu 9.10 or
later), information about changing the default boot options can be found
here: https://help.ubuntu.com/community/Grub2#Configuring%20GRUB%202

> and how to disable /etc/acpi/power.sh?

If you've commented out the pm-powersave call, then it *is* disabled, as
that's the only thing the script does.

Sorry to be such a bother about this, but I'm still having trouble. I've read through that guide, but couldn't find anything related to nohdparm. I found a bunch of files on my computer which included the text "hdparm," but I still have no idea what it does or how to boot without it. Is there any chance you can explain in brief how to do this? Or point me to someone who could? And just to be clear, I'm trying to boot *without* *nohdparm*, correct? That double negative (without...no...) is mixing me up. Thanks in advance if you have the time/energy to continue helping me out with this.

Steve Langasek (vorlon) wrote :

The aim is for you to boot *with* the "nohdparm" option. The guide provides information about how to add extra options when booting, in general; 'nohdparm' is the specific option you need to add to your boot commandline.
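For GRUB 2, one way to make the option permanent is to append it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The sketch below assumes that variable is present as a double-quoted value on a single line, as in a default Ubuntu install; the function name add_nohdparm is made up for illustration, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helper: append "nohdparm" to the default kernel command line.
# $1 is the GRUB defaults file; defaults to /etc/default/grub.
add_nohdparm() {
    file=${1:-/etc/default/grub}
    # Nothing to do if the option is already present.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*nohdparm' "$file"; then
        return 0
    fi
    # Insert " nohdparm" just before the closing quote of the value.
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 nohdparm"/' "$file"
}
```

After running it as root, regenerate the boot menu with "sudo update-grub" and reboot; "cat /proc/cmdline" should then list nohdparm.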

I've tried all combinations of commenting out pm-powersave and using nohdparm. The computer is still unable to suspend and resume. Is there anything else to try from here? Do I need to provide more background information?

Steve Langasek (vorlon) wrote :

On Fri, May 28, 2010 at 03:16:28AM -0000, Alexander Johnson wrote:
> I've tried all combinations of commenting out pm-powersave and using
> nohdparm. The computer is still unable to suspend and resume.

In that case, you have an unrelated bug. Please file a separate bug report
on the linux package using the command 'ubuntu-bug linux'.

Changed in linux:
status: Invalid → Won't Fix
Changed in linux:
importance: Unknown → Medium
Changed in acpi-support (Ubuntu):
status: New → Confirmed
Marek Bardoński (bdfhjk) wrote :

Still present in Oneiric.

Please fix, and ask if you need more info. Unfortunately, I've found nothing in kern.log regarding system resume.

Thanks in advance!

papukaija (papukaija) wrote :

This bug was fixed in pm-utils. Please open a new bug (for the linux package) if you still have issues with suspend/resume.

Changed in acpi-support (Ubuntu):
status: Confirmed → Invalid