[MSI MS-7374] suspend/resume failure [non-free: nvidia]

Bug #334644 reported by Victor B. Gonzalez
This bug affects 4 people
Affects         Status        Importance  Assigned to  Milestone
Linux           Fix Released  Medium
linux (Ubuntu)  Won't Fix     Low         Unassigned

Bug Description

I cannot suspend to RAM or disk on Jaunty. I had never tried it before, so I don't know whether it ever worked. I tried changing the BIOS setting from S1 to S3 and changing ACPI_SLEEP_MODE in /etc/default/acpi-support from mem to standby. I tried variations and mixes of these, and suspend just won't work. I also tried changing the wakeup mode in the BIOS from BIOS to OS and back, with more variations. Another thing: the bug reporting tool that pops up automatically sometimes won't pop up until about a day later, and sometimes it pops up by the dozen. I've tried many things and have no idea what the issue is. Does anyone have a clue? How do I get started figuring this out? I am exhausted :(

ProblemType: KernelOops
Annotation: This occurred during a previous suspend and prevented it from resuming properly.
Architecture: amd64
DistroRelease: Ubuntu 9.04
ExecutablePath: /usr/share/apport/apportcheckresume
Failure: suspend/resume
InterpreterPath: /usr/bin/python2.5
MachineType: MSI MS-7374
NonfreeKernelModules: nvidia
Package: linux-image-2.6.28-8-generic 2.6.28-8.26
ProcAttrCurrent: unconfined
ProcCmdLine: root=UUID=dfe5ba74-a7f8-4841-b55a-6412cf720f0a ro vga=795
ProcCmdline: /usr/bin/python /usr/share/apport/apportcheckresume
ProcEnviron: PATH=(custom, no user)
ProcVersionSignature: Ubuntu 2.6.28-8.26-generic
SourcePackage: linux
StressLog: Error: [Errno 2] No such file or directory: '/var/lib/pm-utils/stress.log'
Tags: resume suspend
Title: [MSI MS-7374] suspend/resume failure [non-free: nvidia]
UserGroups:

TJ (tj) wrote :

Please do a shutdown/restart, suspend/resume test. Assuming the resume fails restart afresh and then attach /var/log/kern.log which should capture more information than the dmesg logs from the original report.

Changed in linux:
assignee: nobody → intuitivenipple
status: New → Incomplete
Victor B. Gonzalez (vbgunz) wrote :

I rebooted, logged in, and suspended to RAM. The shutdown was pretty quick. On start-up I spent about 1 minute staring at a black screen until the desktop appeared. Everything looked great, but the drive was dead (read-only), and I needed to cold reboot the system. I have uploaded kern.log.

TJ (tj) wrote :

The attached kern.log doesn't reveal any obvious suspend-to-RAM attempts.

What sleep module is in use?

 grep SLEEP_MODULE /etc/pm/config.d/00sleep_module

The log does however contain several hibernation cycles (suspend-to-disk), such as:

Feb 28 10:46:24 box kernel: [ 229.128539] PM: Creating hibernation image:
Feb 28 10:46:24 box kernel: [ 229.132001] PM: Need to copy 123162 pages
Feb 28 10:46:24 box kernel: [ 229.132001] PM: Normal pages needed: 123162 + 1024 + 120, available pages: 1433089
Feb 28 10:46:24 box kernel: [ 229.132001] PM: Hibernation image created (123162 pages copied)

Feb 28 10:46:24 box kernel: [ 230.512278] PM: writing image.
Feb 28 10:46:24 box kernel: [ 230.512288] PM: Free swap pages: 3074705
Feb 28 10:46:24 box kernel: [ 230.606784] PM: Saving image data pages (123403 pages) ...

Feb 28 10:46:24 box kernel: [ 240.219198] PM: Wrote 493612 kbytes in 9.61 seconds (51.36 MB/s)
Feb 28 10:46:24 box kernel: [ 240.219851] PM: S<3>PM: Swap header not found!
Feb 28 10:46:24 box kernel: [ 240.263198] |
Feb 28 10:46:24 box kernel: [ 240.314739] Restarting tasks ... done.
Feb 28 10:46:24 box kernel: [ 240.316316] PM: Basic memory bitmaps freed

Feb 28 10:55:54 box kernel: [ 391.832892] PM: Basic memory bitmaps created
Feb 28 10:56:19 box kernel: [ 391.832893] PM: Syncing filesystems ... done.
Feb 28 10:56:21 box kernel: [ 391.847976] Freezing user space processes ... (elapsed 0.00 seconds) done.
Feb 28 10:56:21 box kernel: [ 391.848755] Freezing remaining freezable tasks ... (elapsed 0.02 seconds) done.
Feb 28 10:56:21 box kernel: [ 391.878673] PM: Shrinking memory... done (174075 pages freed)
Feb 28 10:56:21 box kernel: [ 400.366355] PM: Freed 696300 kbytes in 8.48 seconds (82.11 MB/s)

Feb 28 10:56:21 box kernel: [ 402.312531] PM: Creating hibernation image:
Feb 28 10:56:21 box kernel: [ 402.316002] PM: Need to copy 121382 pages
Feb 28 10:56:21 box kernel: [ 402.316002] PM: Normal pages needed: 121382 + 1024 + 120, available pages: 1434878
Feb 28 10:56:21 box kernel: [ 402.316002] PM: Hibernation image created (121382 pages copied)

Feb 28 10:56:21 box kernel: [ 404.295111] PM: writing image.
Feb 28 10:56:21 box kernel: [ 404.295121] PM: Free swap pages: 3052264
Feb 28 10:56:21 box kernel: [ 404.295248] PM: Saving image data pages (121620 pages) ...

Feb 28 10:56:21 box kernel: [ 413.654357] PM: Wrote 486480 kbytes in 9.35 seconds (52.02 MB/s)
Feb 28 10:56:21 box kernel: [ 413.654502] PM: S<3>PM: Swap header not found!
Feb 28 10:56:21 box kernel: [ 413.687152] |
Feb 28 10:56:21 box kernel: [ 413.738375] Restarting tasks ... done.
Feb 28 10:56:21 box kernel: [ 413.740068] PM: Basic memory bitmaps freed
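
The figures in the hibernation log above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original report: the function name `check_hibernation_write` is made up for illustration, and it assumes 4 KiB pages (the usual page size on amd64). The throughput it computes is approximate, and the kernel's own rounding may differ in the last digit.

```python
import re

# Three lines copied from the first hibernation cycle quoted above.
SAMPLE = """\
Feb 28 10:46:24 box kernel: [ 230.606784] PM: Saving image data pages (123403 pages) ...
Feb 28 10:46:24 box kernel: [ 240.219198] PM: Wrote 493612 kbytes in 9.61 seconds (51.36 MB/s)
Feb 28 10:46:24 box kernel: [ 240.219851] PM: S<3>PM: Swap header not found!
"""

PAGE_KB = 4  # assumption: 4 KiB pages, as on a typical amd64 kernel


def check_hibernation_write(log_text):
    """Compare the page count with the kbytes written and flag the swap error."""
    pages = int(re.search(r"Saving image data pages \((\d+) pages\)", log_text).group(1))
    wrote = re.search(r"Wrote (\d+) kbytes in ([\d.]+) seconds", log_text)
    kbytes, seconds = int(wrote.group(1)), float(wrote.group(2))
    return {
        "pages": pages,
        "kbytes": kbytes,
        # Does pages * page size equal the amount the kernel says it wrote?
        "size_matches": pages * PAGE_KB == kbytes,
        # kbytes per second divided by 1000, rounded to two decimals.
        "approx_mb_per_s": round(kbytes / seconds / 1000, 2),
        "swap_header_missing": "Swap header not found" in log_text,
    }


if __name__ == "__main__":
    print(check_hibernation_write(SAMPLE))
```

For the first cycle, 123403 pages at 4 kB each gives exactly the 493612 kbytes reported, so the page count and the write size agree; the "Swap header not found!" message appears only after the write has finished.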

TJ (tj) wrote :

This is what a true suspend-to-RAM looks like:

Suspend...

[12964.851509] PM: Preparing system for mem sleep
[12964.851512] Freezing user space processes ... (elapsed 0.00 seconds) done.
[12964.857033] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[12964.867857] PM: Entering mem sleep
[12964.867872] Suspending console(s) (use no_console_suspend to debug)

[12968.600209] PM: suspend devices took 3.736 seconds

Resume...

[12968.840413] Back to C!
[12968.840413] Enabling non-boot CPUs ...

[12973.199626] PM: resume devices took 4.108 seconds
[12973.199717] PM: Finishing wakeup.
[12973.199718] Restarting tasks ... done.
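
To tell the two kinds of sleep cycle apart in a long kern.log, the marker lines quoted in this thread can be searched for directly. This is a rough Python sketch under stated assumptions: `classify_sleep_events` is a hypothetical helper name, and the marker strings are taken from the 2.6.28-era logs shown here, so other kernel versions may word them differently.

```python
# Marker strings as they appear in the logs quoted in this bug report.
MEM_SLEEP_MARKER = "PM: Entering mem sleep"
HIBERNATE_MARKER = "PM: Creating hibernation image"


def classify_sleep_events(log_text):
    """Return 'suspend-to-RAM' / 'hibernation' labels in the order they occur."""
    events = []
    for line in log_text.splitlines():
        if MEM_SLEEP_MARKER in line:
            events.append("suspend-to-RAM")
        elif HIBERNATE_MARKER in line:
            events.append("hibernation")
    return events


if __name__ == "__main__":
    sample = (
        "Feb 28 10:46:24 box kernel: [ 229.128539] PM: Creating hibernation image:\n"
        "[12964.867857] PM: Entering mem sleep\n"
        "[12973.199718] Restarting tasks ... done.\n"
    )
    print(classify_sleep_events(sample))
```

Run against the attached kern.log, a result with no "suspend-to-RAM" entries would match the observation that the log only contains hibernation cycles.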

Victor B. Gonzalez (vbgunz) wrote :

I managed to save this dmesg *after* a failed resume. In it (around line 1283, I believe) you'll see my sda disk failure. I am attaching the dmesg output to this bug.

Victor B. Gonzalez (vbgunz) wrote :

Attaching the result of $ ls -l /dev/disk/by-path/*

TJ (tj) wrote :

The problem appears to be that the interrupt for the Nvidia MCP78S (PCI 00:09.0) isn't being disabled during suspend, nor re-enabled on resume (other AHCI-managed controllers are; see 07:00.0).

Start-up:

[ 1.364523] ahci 0000:00:09.0: version 3.0

[ 1.364948] ahci 0000:00:09.0: PCI INT A -> Link[LSA0] -> GSI 23 (level, low) -> IRQ 23
[ 1.364978] ahci 0000:00:09.0: irq 2299 for MSI/MSI-X
[ 1.365029] ahci 0000:00:09.0: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl IDE mode
[ 1.365034] ahci 0000:00:09.0: flags: 64bit ncq sntf led clo pmp pio
[ 1.365037] ahci 0000:00:09.0: setting latency timer to 64
[ 1.365410] scsi0 : ahci

[ 1.366024] ata1: SATA max UDMA/133 abar m8192@0xf8c76000 port 0xf8c76100 irq 2299

[ 1.848080] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1.892134] ata1.00: ATA-7: ST3320620AS, 3.AAE, max UDMA/133
[ 1.892139] ata1.00: 625142448 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 1.950480] ata1.00: configured for UDMA/133

[ 3.548102] scsi 0:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
[ 3.548237] sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors: (320 GB/298 GiB)

[ 3.565485] sd 0:0:0:0: [sda] Attached SCSI disk

Suspend...

[ 336.957742] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 336.957774] sd 0:0:0:0: [sda] Stopping disk

Resume...

[ 339.324028] ahci 0000:00:09.0: restoring config space at offset 0x1 (was 0xb00107, writing 0xb00507)
[ 339.324043] ahci 0000:00:09.0: setting latency timer to 64

[ 340.081785] sd 0:0:0:0: [sda] Starting disk

[ 348.512019] ata1.00: qc timeout (cmd 0xec)
[ 348.512021] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
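
The two things this analysis looks for, devices that were given an MSI interrupt at start-up and ATA devices that fail IDENTIFY after resume, can be pulled out of a saved dmesg mechanically. The Python sketch below is only an illustration: the helper names `msi_devices` and `identify_failures` are invented here, and the regular expressions are modelled on the exact line formats quoted above, so they may need adjusting for other kernels.

```python
import re

# Matches e.g. "ahci 0000:00:09.0: irq 2299 for MSI/MSI-X"
MSI_LINE = re.compile(
    r"(\S+) ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): irq (\d+) for MSI/MSI-X"
)
# Matches e.g. "ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)"
IDENTIFY_LINE = re.compile(r"(ata\d+\.\d+): failed to IDENTIFY")


def msi_devices(log_text):
    """Map PCI address -> (driver, irq) for devices using MSI/MSI-X."""
    return {m.group(2): (m.group(1), int(m.group(3))) for m in MSI_LINE.finditer(log_text)}


def identify_failures(log_text):
    """List ATA device names that logged a failed IDENTIFY."""
    return IDENTIFY_LINE.findall(log_text)


if __name__ == "__main__":
    sample = (
        "[ 1.364978] ahci 0000:00:09.0: irq 2299 for MSI/MSI-X\n"
        "[ 348.512019] ata1.00: qc timeout (cmd 0xec)\n"
        "[ 348.512021] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)\n"
    )
    print(msi_devices(sample))
    print(identify_failures(sample))
```

A controller that shows up in the first result and whose disks show up in the second is a candidate for the MSI-related resume failure described in this comment; the sketch does not itself prove the link.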

Victor B. Gonzalez (vbgunz) wrote :

I think I may have solved my suspend/resume woes with a short-term solution. I say short-term because, although this works for me (suspending to RAM and resuming from RAM), I have to add a kernel parameter to the GRUB default options, and I've only tested it twice (both resumes worked 100%). I simply added "pci=nomsi" to the kernel launch options. I can live with it, *but* I think this is the kind of fix that should have been automated at some point?
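
To confirm that the workaround is actually in effect on a running system, the kernel command line can be inspected. This is a small Python sketch, not something from the original comment: `kernel_params` and `msi_disabled` are hypothetical helper names, and the sample command line reuses the options shown in the report with a placeholder UUID.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its full value.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def msi_disabled(cmdline):
    """True if 'nomsi' is among the comma-separated pci= options."""
    value = kernel_params(cmdline).get("pci") or ""
    return "nomsi" in value.split(",")


if __name__ == "__main__":
    # Placeholder UUID; on a live system read /proc/cmdline instead.
    sample = "root=UUID=00000000-0000-0000-0000-000000000000 ro vga=795 pci=nomsi"
    print(msi_disabled(sample))
```

On a live machine the same check would be `msi_disabled(open("/proc/cmdline").read())`.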

TJ (tj)
Changed in linux:
assignee: intuitivenipple → nobody
status: Incomplete → Confirmed
Changed in linux:
status: Unknown → In Progress
Victor B. Gonzalez (vbgunz) wrote :

I thought I'd chime in with some more goodies that are making my experience on Jaunty a whole lot better when it comes to this stuff. The application "powerdevil" just didn't want to work for me out of the box. I did manage to get it working (somewhat) by installing "cpufrequtils". This definitely got "When the system is idle for more than [N]" working, but everything else just felt broken. The power/sleep buttons do nothing (absolutely nothing; the system shuts down instead), I didn't see my displays exhibit power management, etc. *but* that's fine.

What I was after was this behavior: 1. press the power button and "suspend to RAM"; 2. wake up to a locked screen. No matter though, powerdevil wouldn't get the lock screen right until I installed "cpufrequtils". Then I had to do one more thing, a big temporary hack I hope: I copied /etc/acpi/powerbtn.sh to /etc/acpi/powerbtn.sh__backup and replaced the contents of the original file with the following.

#!/bin/sh
# no logic, just hope to lockup and suspend. KDE4.2
qdbus org.freedesktop.ScreenSaver /ScreenSaver Lock
pm-suspend

Well, now the power button on the box, when tapped, locks the screen and suspends just fine. When I resume, I come back to a locked screen. I would really like to start removing these hacks, though, and just see Kubuntu work pretty much out of the box. Can someone offer some better solutions? I would appreciate it!

Thanks!

PS. Selecting suspend to RAM from kickoff still wakes up without a prompt for a password. Although I may not be using powerdevil directly to suspend with either method, it will suspend after X amount of time and will resume with the screen locked, but that's about it with powerdevil :/

Victor B. Gonzalez (vbgunz) wrote :

Suspend to disk is still a nightmare. It takes about 1 minute to turn off. On resume, well, I cannot enter my BIOS, so something worked? *But* I do not see anything about a resume image; I see a bunch of orphaned files get cleaned up, and then I'm back at the KDM screen. So resume isn't working (not sure where it breaks).

Any questions, please ask. So far suspend to RAM seems to work flawlessly.

Victor B. Gonzalez (vbgunz) wrote :

I finally got suspend to RAM and suspend to disk working. For suspend to RAM, read above. For suspend to disk, I needed to add yet another kernel parameter, "resume=/path/to/swap". Now, just like above, the kickoff options under the Leave tab do *not* call pm-suspend and pm-hibernate respectively, and in fact are more problematic than they should be. Neither locks the screen, and the disk option coughs up more errors on suspend.

So again, I created a script just like above but instead of calling pm-suspend, it calls pm-hibernate.

Now, I'm not sure if this is a Jaunty thing (alpha warnings, expect brokenness), *but* upon a second resume from disk my networking went down for the count. No amount of sudo /etc/init.d/networking stop|start|restart would save the day. I am not too smart, so I just did a reboot and hoped for the best. By the way, that happened after selecting the option twice from kickoff.

I did a suspend to disk again on a fresh boot, *but* this time I called pm-hibernate twice instead. Both resumes seemed just fine and networking had no hiccups. The only issue, on the second resume (probably an alpha thing unrelated to suspending to disk), is that I lost some of my shortcuts, e.g. Ctrl+Esc, Alt+F2, etc. I'm not sure if this was caused by suspending to disk.

Anyhow, I thought I'd chime in again with some news on my progress. Hopefully this helps someone out :)
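
For the resume= parameter mentioned in this comment, one sanity check is that the device it names is an active swap area. The Python sketch below is only an illustration under assumptions: `resume_target` and `active_swap_devices` are made-up helper names, /dev/sda5 is a hypothetical device (the comment does not say which path was used), and a resume=UUID=... value would need extra resolution that is not attempted here.

```python
def resume_target(cmdline):
    """Return the value of the resume= kernel parameter, or None if absent."""
    prefix = "resume="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


def active_swap_devices(proc_swaps_text):
    """Return the first column of each entry in /proc/swaps-style text."""
    entries = proc_swaps_text.splitlines()[1:]  # first line is the column header
    return [line.split()[0] for line in entries if line.strip()]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmdline = "root=/dev/sda1 ro resume=/dev/sda5"
    swaps = (
        "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"
        "/dev/sda5                               partition\t2048000\t0\t-1\n"
    )
    print(resume_target(cmdline) in active_swap_devices(swaps))
```

On a live system the inputs would come from /proc/cmdline and /proc/swaps.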

TJ (tj)
Changed in linux (Ubuntu):
assignee: nobody → intuitivenipple
importance: Undecided → Low
TJ (tj) wrote :

Upstream now has a patch to fix the resume issue "[PATCH] pci: apply nv_msi_ht_cap_quirk on resume too" originally posted at: http://lkml.org/lkml/2009/7/8/32.

It isn't in upstream git yet, but it is so trivial that we should have no issues using it.

dE (de-techno) wrote :

It's not working on the 6100 chip.

Changed in linux (Ubuntu):
assignee: TJ (intuitivenipple) → nobody
Victor B. Gonzalez (vbgunz) wrote :

I had this issue resolved for me quite a while ago. I thought for some reason intuitivenipple closed this. Sorry.

Changed in linux:
importance: Unknown → Medium
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Changed in linux:
status: In Progress → Fix Released