Regression: Intrepid - 2.6.27 hangs on resume

Bug #273323 reported by Christian Schürer-Waldheim
This bug affects 1 person
Affects                     Status        Importance  Assigned to  Milestone
kdebase-workspace (Ubuntu)  Fix Released  Undecided   Unassigned
linux (Ubuntu)              Invalid       Medium      Unassigned
linux-meta (Ubuntu)         Invalid       Undecided   Unassigned

Each task was nominated for Intrepid by Christian Schürer-Waldheim.

Bug Description

I'm opening a new bug report as advised in Bug #272624. I have a Dell Latitude D620 which I could suspend and resume without any problems with kernel versions 2.6.27-2 and earlier. Since 2.6.27-3, resume no longer works, not even with the just-published 2.6.27-4. I've tried to debug this problem as described at https://wiki.ubuntu.com/DebuggingKernelSuspend.

The debug-output shows the following.

First run:
[ 2.138531] Magic number: 8:539:138
[ 2.138536] block ram3: hash matches
[ 2.138696] rtc_cmos 00:06: setting system clock to 2008-09-22 17:06:11 UTC (1222103171)

Another run:
[ 2.130543] Magic number: 0:506:649
[ 2.130587] tty ptyde: hash matches
[ 2.130708] rtc_cmos 00:06: setting system clock to 2044-02-11 20:36:30 UTC (2338835790)

Another run:
[ 2.129585] Magic number: 0:953:872
[ 2.129588] hash matches /build/buildd/linux-2.6.27/drivers/base/power/main.c:390
[ 2.129659] tty ptytd: hash matches
[ 2.129754] rtc_cmos 00:06: setting system clock to 1992-06-13 06:51:46 UTC (708418306)
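
The "hash matches" lines are the part of this output that points at a suspect: with pm_trace enabled, the kernel records how far it got before the hang and, on the next boot, reports the device names or source locations whose hash matches. A minimal sketch for pulling those lines out of a log, assuming the log text arrives on standard input (the function name pm_trace_matches is made up for illustration):

```shell
#!/bin/sh
# Sketch only: filter kernel log text for pm_trace results.
# pm_trace_matches is a hypothetical helper name, not an existing tool.
pm_trace_matches() {
    # Keep only the "hash matches" lines, then strip everything up to and
    # including the first "]" (the "[ 2.138536]" timestamp prefix).
    grep 'hash matches' | sed 's/^[^]]*] *//'
}
```

After the reboot it could be used as `dmesg | pm_trace_matches`. As far as I understand, the odd dates in the rtc_cmos lines are a side effect of pm_trace itself, which uses the real-time clock to store its data.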

I'm not sure what is causing this problem now: the hardware didn't change, and resume doesn't work even with everything detached.

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Unfortunately we can't fix it, because your description didn't include enough information.

Please include the information as separate attachments:
 * Output of uname -a
    * uname -a > uname.txt
 * Output of sudo lspci -vvnn
    * sudo lspci -vvnn > lspci.txt
 * Output of sudo dmidecode
    * sudo dmidecode > dmidecode.txt
 * Try to suspend/hibernate and then restart the system and attach /var/log/kern.log.0 and /var/log/kern.log
 * Tarball of /proc/acpi directory. You can't just tar all files because their content sometimes changes etc.
    * cp -r /proc/acpi /tmp
    * tar -cvjf ~/acpi.tar.bz /tmp/acpi
    * attach acpi.tar.bz from your home directory
 * Please attach your /var/log/kern.log after following the debugging steps at https://wiki.ubuntu.com/DebuggingKernelSuspend (which you're already familiar with).
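
The collection steps above could be wrapped in one small script. This is only a sketch under a few assumptions: it is meant to be run as root (for example through sudo) instead of calling sudo for each command, the function name and the output directory are made up, and any tool that is missing or fails is skipped with a warning.

```shell
#!/bin/sh
# Sketch only: gather the requested debug files into one directory.
# collect_suspend_debug_info and the default directory name are hypothetical.
collect_suspend_debug_info() {
    out_dir=${1:-./suspend-debug}
    mkdir -p "$out_dir" || return 1

    uname -a > "$out_dir/uname.txt" || echo "uname failed" >&2
    lspci -vvnn > "$out_dir/lspci.txt" 2>/dev/null \
        || echo "lspci failed or is not installed" >&2
    dmidecode > "$out_dir/dmidecode.txt" 2>/dev/null \
        || echo "dmidecode failed or is not installed" >&2

    # Copy /proc/acpi first, because its files can change while tar reads them.
    if [ -d /proc/acpi ]; then
        rm -rf "$out_dir/acpi"
        cp -r /proc/acpi "$out_dir/acpi" 2>/dev/null \
            || echo "some /proc/acpi files could not be copied" >&2
        tar -cjf "$out_dir/acpi.tar.bz" -C "$out_dir" acpi 2>/dev/null \
            || echo "could not create acpi.tar.bz" >&2
    fi

    # Kernel logs, when they exist and are readable.
    for log in /var/log/kern.log /var/log/kern.log.0; do
        [ -r "$log" ] && cp "$log" "$out_dir/"
    done
    return 0
}
```

Calling `collect_suspend_debug_info /tmp/bug-273323` would leave uname.txt, lspci.txt, dmidecode.txt and, where available, acpi.tar.bz and the kernel logs in that directory, ready to be attached one by one.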

Thanks

Changed in linux:
assignee: nobody → chrisccoulson
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Hi Chris,
thank you for your work investigating this problem.

I'll attach some of the files you asked for; maybe they are already of some help to you. The log files (especially the ones after a resume) will follow later.

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Hi,

yesterday I had a crash after resume again; please find the kern.log (with comments) attached. I hope it is of some help to you, Chris.

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

I notice you see this on resume from suspend:

Sep 25 00:13:20 quincunx kernel: [ 5942.889502] PM: resume devices took 5.468 seconds
Sep 25 00:13:20 quincunx kernel: [ 5942.889507] ------------[ cut here ]------------
Sep 25 00:13:20 quincunx kernel: [ 5942.889509] WARNING: at /build/buildd/linux-2.6.27/kernel/power/main.c:176 suspend_test_finish+0x75/0x80()
Sep 25 00:13:20 quincunx kernel: [ 5942.889510] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat usb_storage libusual uvcvideo compat_ioctl32 videodev v4l1_compat af_packet binfmt_misc rfcomm l2cap bluetooth vboxdrv ppdev ipv6 acpi_cpufreq cpufreq_powersave cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_stats freq_table container pci_slot sbs sbshc iptable_filter ip_tables x_tables parport_pc lp parport loop joydev arc4 ecb snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy pcmcia iTCO_wdt serio_raw iTCO_vendor_support snd_seq_oss psmouse yenta_socket rsrc_nonstatic snd_seq_midi pcmcia_core nvidia(P) snd_rawmidi i2c_core snd_seq_midi_event video output snd_seq snd_timer snd_seq_device snd soundcore wmi button battery ac shpchp pci_hotplug snd_page_alloc intel_agp dcdbas evdev ext3 jbd mbcache sha256_generic aes_x86_64 aes_generic cbc usbhid hid sg sd_mod crc_t10dif sr_mod cdrom pata_acpi ata_piix ata_generic libata scsi_mod ehci_hcd uhci_hcd usbcore dock dm_crypt crypto_blkcipher dm
Sep 25 00:13:20 quincunx kernel: mirror dm_log dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor uvesafb cn fuse [last unloaded: cfg80211]
Sep 25 00:13:20 quincunx kernel: [ 5942.889557] Pid: 18348, comm: sleep.sh Tainted: P 2.6.27-4-generic #1
Sep 25 00:13:20 quincunx kernel: [ 5942.889558]
Sep 25 00:13:20 quincunx kernel: [ 5942.889559] Call Trace:
Sep 25 00:13:20 quincunx kernel: [ 5942.889563] [<ffffffff8024e8d4>] warn_on_slowpath+0x64/0x90
Sep 25 00:13:20 quincunx kernel: [ 5942.889568] [<ffffffff80518dd6>] ? printk+0x6c/0x6e
Sep 25 00:13:20 quincunx kernel: [ 5942.889573] [<ffffffff802126a4>] ? mcount_call+0x5/0x31
Sep 25 00:13:20 quincunx kernel: [ 5942.889575] [<ffffffff8027de35>] suspend_test_finish+0x75/0x80
Sep 25 00:13:20 quincunx kernel: [ 5942.889578] [<ffffffff8027df44>] suspend_devices_and_enter+0x104/0x1b0
Sep 25 00:13:20 quincunx kernel: [ 5942.889580] [<ffffffff8027e201>] enter_state+0xe1/0x110
Sep 25 00:13:20 quincunx kernel: [ 5942.889583] [<ffffffff8027e2ea>] state_store+0xba/0x100
Sep 25 00:13:20 quincunx kernel: [ 5942.889587] [<ffffffff803a55e7>] kobj_attr_store+0x17/0x20
Sep 25 00:13:20 quincunx kernel: [ 5942.889590] [<ffffffff8034ae0a>] sysfs_write_file+0xca/0x140
Sep 25 00:13:20 quincunx kernel: [ 5942.889593] [<ffffffff802eb81b>] vfs_write+0xcb/0x130
Sep 25 00:13:20 quincunx kernel: [ 5942.889596] [<ffffffff802eb975>] sys_write+0x55/0x90
Sep 25 00:13:20 quincunx kernel: [ 5942.889599] [<ffffffff8021288a>] system_call_fastpath+0x16/0x1b
Sep 25 00:13:20 quincunx kernel: [ 5942.889600]
Sep 25 00:13:20 quincunx kernel: [ 5942.889601] ---[ end trace 2b44916d971937bc ]---
Sep 25 00:13:20 quincunx kernel: [ 5942.889728] PM: Finishing wakeup.

I'm not sure whether this would be responsible for a crash thou...


Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Hi Chris,

I've noticed this error message before and read the info in the sources too, but it doesn't mean anything to me. My computer is certainly not too slow: it has a dual-core CPU and 2 GB of RAM. And it worked in the past without any problems. Hardy was the first release with which I could suspend and resume countless times without any worries.

Yesterday I did a suspend with pm-suspend from the console. I could resume after that without any problem. Doing a 2nd suspend, the kernel hard crashed (flashing lights) after resume. This time I used the hot keys on my laptop computer, but obviously pm-suspend is called this way too (according to the suspend logs). I will attach the new kern.log.

Did something change in the last kernel updates which could cause such timing problems?

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

BTW, can't the status be changed to Confirmed? I think there is enough evidence that I'm seriously experiencing this problem, isn't there?

Changed in linux:
status: Incomplete → Confirmed
assignee: chrisccoulson → nobody
Changed in linux:
assignee: nobody → ubuntu-kernel-team
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

I've noticed that this problem is probably not directly caused by the kernel, thus I will add some other packages.

I just managed to suspend/resume several times with "sudo pm-suspend" from the terminal, without any problem. But when I use the hotkeys of my notebook computer for standby, it suspends but crashes on resume. I can see from the log files that pm-suspend is called this way too, but there seem to be more programs involved in the process. What else tries to handle suspending? I have powerdevil for KDE installed, but before that I had kde-guidance, and it crashed with either of them. In the past (in Hardy times), I had problems at resume with kde-guidance as well, but I could use kpowersave, which worked without any problem. So what does powerdevil, for example, do (apart from locking the KDE session at resume) that makes the kernel crash?

Revision history for this message
Peng Deng (d6g) wrote :

I'm also having this fail-to-resume regression on my laptop (Thinkpad T43 2668, Kernel 2.6.27-4-generic)

It worked fine with Hardy, but after I did a fresh install of the latest Intrepid yesterday, resume stopped working.

Attached is a tar ball including all the debug files Chris asked for in the first reply. The file dmesg.txt and kern-after-debug.log are obtained after following the steps introduced at https://wiki.ubuntu.com/DebuggingKernelSuspend

I am not sure whether I should open a new bug report, since Bug #272624 implied the problem could be hardware specific.

Revision history for this message
Peng Deng (d6g) wrote :

It seems my USB TV Card (WinTV HVR900) is affecting the suspend/resume.

I've managed to suspend and resume successfully several times without the card plugged in, but once it is plugged in, either suspend or resume stops working. The interesting thing is that when it hangs during resume/suspend and I unplug the card, the resume/suspend eventually succeeds.

Attached is a recent version of /var/log/kern.log, which shows two attempts at unplugging the card during resume; the resume took rather a long time (~60 s) to finish.

It could be an issue with the TV card driver, but I don't know how it affects the suspend/resume. Should I file another bug report regarding the driver?

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Can you blacklist the driver for the USB device?

Does it make a difference if you use "sudo pm-suspend" to suspend?

Revision history for this message
Peng Deng (d6g) wrote : Re: [Bug 273323] Re: Regression: Intrepid - 2.6.27 hangs on resume

Christian, I guess I wouldn't blacklist the driver if I want to use
the device, plus I am not sure actually which driver to block (the
device will load several modules). However, I would try to replace it
with another em28xx-new driver, which seems more reliable, I just need
time to compile the latter.

No matter whether the device is plugged in or not, sudo pm-suspend behaves
the same as the menu item here.

Cheers,
P.D.


Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Christian - That's very odd that running pm-suspend manually works ok, but it fails if you press the sleep button. I'll have to have a think about that. When you say that kpowersave has worked in the past, it made me think slightly. Would you mind trying to run the following in a terminal, and then tell me the output (if there is any):

. /usr/share/acpi-support/policy-funcs
CheckPolicy

Changed in linux:
assignee: ubuntu-kernel-team → chrisccoulson
status: Confirmed → Incomplete
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Hello P.D.

On Thursday 02 October 2008, Peng Deng wrote:
> Christian, I guess I wouldn't blacklist the driver if I want to use
> the device, plus I am not sure actually which driver to block (the
> device will load several modules). However, I would try to replace it
> with another em28xx-new driver, which seems more reliable, I just need
> time to compile the latter.
>
> No matter whether the device is plugged in or not, sudo pm-suspend behaves
> the same as the menu item here.

I meant that you blacklist it from suspend[1], which means that the driver is
unloaded prior to suspend and loaded again afterwards. If you don't blacklist
it, the system tries to suspend the device, which can cause problems if the
device and/or the driver do not properly support it. Unloading the driver
before suspend and loading it again afterwards can therefore behave differently.

[1] e.g. in /etc/pm/config.d/config:
SUSPEND_MODULES="$SUSPEND_MODULES snd_hda_intel"
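
A minimal sketch of adding such a line from the shell, assuming the config path given above and write access to it (root for files under /etc). The helper name add_suspend_module is made up, and em28xx in the usage note is only an example module name:

```shell
#!/bin/sh
# Sketch only: append a module to SUSPEND_MODULES in a pm-utils config file.
# add_suspend_module is a hypothetical helper, not part of pm-utils.
add_suspend_module() {
    module=$1
    config=${2:-/etc/pm/config.d/config}
    line="SUSPEND_MODULES=\"\$SUSPEND_MODULES $module\""

    # Do nothing if exactly this line is already in the file.
    if [ -f "$config" ] && grep -qxF -- "$line" "$config"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$config"
}
```

For example, `add_suspend_module em28xx` would append `SUSPEND_MODULES="$SUSPEND_MODULES em28xx"` to /etc/pm/config.d/config. Adding one module at a time and testing a suspend/resume cycle after each is one way to narrow down which driver is responsible.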

Kind regards,
Christian

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Hello Chris,

On Thursday 02 October 2008, Chris Coulson wrote:
> Christian - That's very odd that running pm-suspend manually works ok,
> but it fails if you press the sleep button. I'll have to have a think
> about that. When you say that kpowersave has worked in the past, it made
> me think slightly. Would you mind trying to run the following in a
> terminal, and then tell me the output (if there is any):
>
> . /usr/share/acpi-support/policy-funcs
> CheckPolicy

The output is a simple "1". What does that mean?

Kind regards,
Christian

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

That's interesting. If you open up a terminal and run the following:

"acpi_listen"

Do you see any output when you press your sleep key (obviously you'll only see it briefly before the laptop goes to sleep)?

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Actually, of course you will see output otherwise your laptop wouldn't go to sleep ;)

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :
  • kern1.log.zip (84.7 KiB, application/zip)

The output of acpi_listen after I've pressed the sleep button is

button/sleep SBTN 00000080 00000001

The computer suspends then. When I resume, the screen flickers and two white,
horizontal, flickering lines stay for some seconds until the computer powers off
completely. When I power it on again, it runs a normal boot (repairing the file
system obviously). I will enclose a current kern.log. You can see that after
the resume pm says that it will suspend the system again. This system was
installed with hardy and updated to intrepid some time ago. Do you think there
is some old configuration mixed with new one?

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

Is it okay that acpi and hal are reacting to the sleep button event at the same time?

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

I'm not sure. I'm fairly sure that there is a problem with acpi-support for Kubuntu users here, but I don't know if it is causing your problem. When you press the sleep button, acpid emits an event which HAL sees, and acpid also runs /etc/acpi/sleep.sh. This script should exit without doing anything if some other process in the user's session exists to handle the sleep event. For Gnome users, this is gnome-power-manager (which the script checks for). I'm not familiar with what KDE uses, but the script is looking for kpowersave or klaptopdaemon, neither of which I think exists on your system. So, in your case, I think /etc/acpi/sleep.sh goes ahead and tries to suspend the machine. In the meantime, something else in your session is using pm-utils to suspend the machine too (powerdevil?).

I could be talking rubbish though - I'm really not familiar with KDE or power management in Kubuntu, but this is how I understand it to work.

It might be good if you could edit your /etc/acpi/sleep.sh and add an 'exit' just below 'DeviceConfig', and then try suspending using the standby button (remember to back the file up so you can restore it afterwards).
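
A minimal sketch of that edit, assuming the script contains a line that starts with DeviceConfig. The helper name is made up; it keeps a copy of the original next to the script with an .orig suffix and needs root when used on files under /etc:

```shell
#!/bin/sh
# Sketch only: back up a script and insert "exit" after the first line that
# starts with DeviceConfig. add_exit_after_deviceconfig is a hypothetical name.
add_exit_after_deviceconfig() {
    script=$1
    backup="$script.orig"

    # Keep the original (with its permissions) so it can be restored later.
    cp -p "$script" "$backup" || return 1

    # Print every line; after the first DeviceConfig line, also print "exit".
    awk '
        { print }
        /^[ \t]*DeviceConfig/ && !inserted { print "exit"; inserted = 1 }
    ' "$backup" > "$script"
}
```

It would be called as `add_exit_after_deviceconfig /etc/acpi/sleep.sh`, and the original can be restored afterwards with `cp -p /etc/acpi/sleep.sh.orig /etc/acpi/sleep.sh`.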

Thanks :)

Revision history for this message
Chris Coulson (chrisccoulson) wrote :

That's why I asked you to run the CheckPolicy function earlier as well. It exited with '1' because it didn't find anything to handle the power management. On Ubuntu (Gnome), it exits with '0' because gnome-power-manager exists. This result is used by /etc/acpi/sleep.sh (and some other scripts) to determine whether to proceed or not.
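
The kind of check described here can be sketched roughly as follows. This is not the actual acpi-support code: the function name is made up, the 0/1 convention is taken from the description in this thread, and the list of handlers is supplied by the caller.

```shell
#!/bin/sh
# Sketch only, NOT the real CheckPolicy from acpi-support.
# Prints 0 if any of the named programs is running (something else will
# handle the sleep event), otherwise prints 1.
check_policy_sketch() {
    for handler in "$@"; do
        # pidof -x also finds shells that are running a script of that name.
        if pidof -x "$handler" >/dev/null 2>&1; then
            echo 0
            return 0
        fi
    done
    echo 1
    return 1
}
```

With this sketch, `check_policy_sketch gnome-power-manager kpowersave klaptopdaemon` would print 1 in a KDE session where none of the three is running, which matches the "1" reported above. A caller in the style of sleep.sh would then go ahead and suspend the machine itself.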

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

After adding the exit command as you suggested, nothing happened when I pressed the sleep button, because powerdevil (which didn't come with default profiles and didn't have the option to react to sleep button events until the latest version) wasn't configured to take any action. When I set up powerdevil to suspend my computer when the sleep button is pressed, the computer suspended, but resume failed as before. I don't know what powerdevil does in the event of a sleep request; maybe it does more than just call pm-suspend. Currently, running "sudo pm-suspend" from the terminal is the best way to have a working resume. I can live with that.

BTW, after updating to the newest kernel version (2.6.27-5-generic), resume seems faster than before, but maybe this is only because my computer is currently not docked in the docking station. I will test that again in the evening.

Revision history for this message
Peng Deng (d6g) wrote :

Hi Christian,

I've installed a new set of kernel modules (em28xx-new) because the old
ones not only caused the resume problem but also were not very stable
with my device. That was done before I could read your last message, so
I haven't tried your method of blacklisting the driver(s). However,
resume/suspend hasn't had any problems since I installed the new driver.

I will try your method after reverting the kernel modules. But since
the device loads several kernel modules, should I blacklist them all?
Is there an efficient way of detecting which one of the modules
actually causes the resume problem?

Cheers,
P.D.


Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Christian - Could you restore the /etc/acpi/sleep.sh file to its previous state, and then run:

sudo su
sync; echo 1 > /sys/power/pm_trace; /etc/acpi/sleep.sh

Does resume fail (I expect that it will)? Can you attach your /var/log/kern.log after this attempt?

It might also be good for you to attach your /var/log/kern.log after a successful suspend/resume cycle with 'sudo pm-suspend'

Peng - Would you mind opening a separate bug report for your problem.

Thanks

Revision history for this message
Peng Deng (d6g) wrote :

Chris - As requested, a new bug report has been filed at
https://launchpad.net/bugs/279143

Christian - I've tried blacklisting several modules, but it doesn't seem
to work. Thanks for your suggestion anyway. I now think the problem may
be related to a "missing" firmware.

Cheers,
P.D.


Revision history for this message
Scott Kitterman (kitterman) wrote :

powerdevil -> kdebase-workspace since powerdevil is part of core KDE now.

Changed in linux:
assignee: chrisccoulson → nobody
Revision history for this message
Chris Coulson (chrisccoulson) wrote :

I'm going to close the kernel task on this, based on the earlier comments stating that suspend only fails when called from powerdevil.

Changed in linux:
status: Incomplete → Invalid
Changed in kdebase-workspace:
status: New → Incomplete
Changed in linux-meta:
status: New → Invalid
Revision history for this message
Steve Beattie (sbeattie) wrote :

This bug was reported in the Intrepid development cycle; removing regression-potential and marking as regression-release.

Revision history for this message
Jonathan Thomas (echidnaman) wrote :

Is this still an issue for you in Kubuntu 9.04?

Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

No, it works fine in 9.04 and 9.10 (which I'm currently testing).

Thank you for all your work!

Changed in kdebase-workspace (Ubuntu):
status: Incomplete → Invalid
tags: removed: regression-release
Revision history for this message
Christian Schürer-Waldheim (quincunx) wrote :

I'd like to remove the nominations if it's possible somehow?

Revision history for this message
Jonathan Thomas (echidnaman) wrote :

Fix released would be more appropriate in this case, I think.

And about the nominations, I think that requires a core-dev to reject the nomination, but in reality it's not that big of a deal. ;-)

Changed in kdebase-workspace (Ubuntu):
status: Invalid → Fix Released