Mint 17 fails to resume after suspend on Acer Extensa 5220 with Intel GM965/GL960 (GMA X3100) based graphics chipset

Bug #1390923 reported by Jean-Pierre van Riel
This bug affects 1 person
Affects            Status    Importance   Assigned to   Milestone
Linux Mint         New       Undecided    Unassigned
xf86-video-intel   Invalid   Medium

Bug Description

1)

Mint Version: Linux Mint 17 Cinnamon 32 bit,
Kernel Version: 3.13.0-29-generic
Laptop: Acer Extensa 5220
Chipset: Intel GM965/GL960 (GMA X3100)
Suspected driver at fault: xf86-video-intel

2)

Suspend and resume does not work reliably. After suspending, pressing any key or the power button causes the system to attempt to resume, but it reboots during the process instead of resuming. Resume worked only on rare occasions, so the fault appeared intermittent.

After much experimentation, the following test, which changes the Xorg config, indicates the bug is related to the Intel graphics driver and its DDX 2D acceleration mode:
- test suspend and resume with Option "AccelMethod" "sna": resume from suspend often fails and causes a reboot
- test suspend and resume with Option "AccelMethod" "uxa": resume from suspend seems reliable (no reboot noticed yet)

Example config to test with
/etc/X11/xorg.conf.d/20-intel.conf
  Section "Device"
     Identifier "Intel Graphics"
     Driver "intel"
     #Option "AccelMethod" "uxa"
     Option "AccelMethod" "sna"
  EndSection
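
To make switching between the two test configurations less error-prone, the snippet can be generated rather than hand-edited. This is only a minimal sketch: the function name render_intel_conf is hypothetical, and it deliberately accepts only the two AccelMethod values compared in this report.

```python
def render_intel_conf(accel_method: str) -> str:
    """Return the text of an xorg.conf.d Device section for the intel driver.

    Hypothetical helper: it mirrors the example config in this report and
    only accepts the two acceleration methods tested here.
    """
    allowed = ("sna", "uxa")
    if accel_method not in allowed:
        raise ValueError(f"unsupported AccelMethod: {accel_method!r}")
    lines = [
        'Section "Device"',
        '    Identifier "Intel Graphics"',
        '    Driver "intel"',
        f'    Option "AccelMethod" "{accel_method}"',
        'EndSection',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print instead of writing the file: /etc/X11/xorg.conf.d needs root.
    print(render_intel_conf("uxa"), end="")
```

The output can be redirected into /etc/X11/xorg.conf.d/20-intel.conf as root; X has to be restarted for the change to take effect.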

3) What happened?

Laptop rebooted instead of resuming

4) What you expected to happen instead.

Expected resume to restore current session with windows previously open, etc

5) If the problem happened once, sometimes, or always.

Very frequent. I'd say only 1 in 10 resume attempts (with Option "AccelMethod" "sna") worked and the rest failed and caused a reboot.

6) Other important observations and notes

Attempted to follow some suggestions here: https://01.org/linuxgraphics/documentation/how-debug-suspend-resume-issues-0
- resume from S3 suspend to RAM without mdm (display manager) running appeared to work fine, e.g. 'echo mem > /sys/power/state'
- the default suspend via menu GUI in Cinnamon often caused the resume bug to surface
- S4 hibernation worked fine and didn't seem to trigger the bug
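
Because a failed resume reboots the machine, any evidence has to be on disk before the suspend starts. The following is a rough sketch of a test harness for the manual S3 test above, assuming it runs as root. The names run_suspend_cycle and write_mem_state and the marker file are hypothetical and not part of pm-utils or any existing tool.

```python
import os
import time
from typing import Callable


def write_mem_state() -> None:
    """Request S3 suspend, like 'echo mem > /sys/power/state' (needs root)."""
    with open("/sys/power/state", "w") as state:
        state.write("mem")


def _append_line(path: str, text: str) -> None:
    """Append one timestamped line and force it to disk before returning."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as log:
        log.write(f"{stamp} {text}\n")
        log.flush()
        os.fsync(log.fileno())


def run_suspend_cycle(marker_path: str,
                      suspend: Callable[[], None] = write_mem_state) -> None:
    """Run one suspend/resume cycle, leaving a marker before and after.

    If the machine reboots instead of resuming, the 'suspending' line
    is left without a matching 'resumed' line.
    """
    _append_line(marker_path, "suspending")
    suspend()  # only returns once the system has resumed
    _append_line(marker_path, "resumed")
```

Calling run_suspend_cycle("/var/tmp/suspend-marker.log") a number of times under each AccelMethod setting would give a count of failed resumes that survives the reboot. The path is only an example.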

Found several similar bug reports for Intel 965 / X3100 type laptops and Intel drivers in Ubuntu 14.04 and/or kernel 3.13. My suspicion is that newer Intel video drivers have caused a number of regressions in the latest kernels, especially with older Intel chipsets.
- [TOSHIBA Satellite U400] suspend/resume failure: https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/1290787
- [965gm regression v3.13] TOSHIBA Satellite U400 intel GM965/GL960 suspend/resume failure kernel 3.14 rc7, rc6, 3.13: https://bugs.freedesktop.org/show_bug.cgi?id=76520
- [Dell Inspiron 1525] Cannot resume from suspend: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1331654
- Bug#747247: Problem Solved: https://lists.debian.org/debian-kernel/2014/07/msg00179.html

Fixes for similar drm/i915 bugs have already been backported to Ubuntu kernel 3.13.0-35:
http://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_3.13.0-39.66/changelog
* drm/i915: Avoid div-by-zero when pixel_multiplier is zero
  - LP: #1347088

So the problem I had was an additional issue not yet fixed by those previous patches. While reading up, I noticed that the default acceleration method in the Intel drivers had changed a while ago. I stumbled on other bugs like this one:
- Re: Fonts in Firefox "crumble" (Mint 16 Mate RC): http://forums.linuxmint.com/viewtopic.php?f=47&t=151054&start=40
- The workaround/fix was to switch back to "uxa"

After more reading up:
* Three backends are available for accelerating the DDX: UXA, SNA and glamor.
* DDX: Device Dependent X, the hardware-specific part of the 2D graphics device driver.
* The old method is UXA (Unified Acceleration Architecture). It is more mature and was introduced to support the GEM driver model.
* The new method is SNA (Sandybridge's New Acceleration). It targets newer chipsets.
* The xf86-video-intel X.org driver in Ubuntu 14.04 uses the newer SNA acceleration by default.

My hardware was older (not SandyBridge), so reverting to the UXA acceleration option appears to be a promising workaround as using the SNA (default) option somehow triggers the resume bug. SNA is probably not well suited to or optimised for older graphics like X3100 :-/

A possible solution is for upstream maintainers (or Mint/Ubuntu via a patch) to revert to UXA if older Intel hardware is present and not force the SNA default. The proper fix might be to trace how SNA acceleration causes the suspend resume regression bug to surface, but given the resume crashes and reboots the system, it's hard to find evidence in logs.

description: updated
In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109158
Xorg.0.log with SNA option

Bug description:

Resume from suspend on Intel GM965/GL960 (GMA X3100) with DDX SNA fails and causes a reboot, but resume works fine if the non-default UXA DDX acceleration option is used.

System environment:

-- chipset: Intel GM965/GL960 (GMA X3100)
-- system architecture: 32-bit (i686)
-- xf86-video-intel: 2:2.99.910-0ubuntu1.1
-- xserver: 2:1.15.1-0ubuntu2.1
-- kernel: 3.13.0-39
-- Linux distribution: Mint 17 Quiana, Cinnamon, 32-bit
-- Machine or mobo model: Acer Extensa 5220, model: 5220-051G08Mi MS2205
-- Display connector: standard laptop lcd

Reproducing steps:

- trigger the bug with Option "AccelMethod" "sna": resume from suspend often causes a reboot
- avoid the bug with Option "AccelMethod" "uxa": resume from suspend seems reliable (no reboot noticed yet after several tests)

Example config to test with
/etc/X11/xorg.conf.d/20-intel.conf
  Section "Device"
     Identifier "Intel Graphics"
     Driver "intel"
     #Option "AccelMethod" "uxa"
     Option "AccelMethod" "sna"
  EndSection

Probability: Very frequent. I'd say only 1 in 10 resume attempts (with Option "AccelMethod" "sna") worked and the rest failed and caused a reboot.

Attempted to follow the guide here: https://01.org/linuxgraphics/documentation/how-debug-suspend-resume-issues-0
- resume from S3 suspend to RAM without mdm (display manager) running appeared to work fine, e.g. 'echo mem > /sys/power/state'
- the default suspend via menu GUI in Cinnamon often caused the resume bug to surface
- S4 hibernation worked fine and didn't seem to trigger the bug
- given the system reboots, unable to login via another terminal or ssh and cannot capture dmesg after, etc
- have not had time to try capturing intel_reg_dumper or intel_gpu_dump output

Found several similar bug reports for Intel 965 / X3100 type laptops and Intel drivers in Ubuntu 14.04 and/or kernel 3.13. My suspicion is that newer Intel video drivers have caused a number of regressions in the latest kernels, especially with older Intel chipsets. These are similar bugs, but their fixes don't appear to work in my case:
- [TOSHIBA Satellite U400] suspend/resume failure: https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/1290787
- [965gm regression v3.13] TOSHIBA Satellite U400 intel GM965/GL960 suspend/resume failure kernel 3.14 rc7, rc6, 3.13: https://bugs.freedesktop.org/show_bug.cgi?id=76520
- [Dell Inspiron 1525] Cannot resume from suspend: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1331654

Logged bug specific to Mint here, but realised this is quite probably an upstream bug and probably not specific to Mint: https://bugs.launchpad.net/ubuntu/+bug/1390923

In , Jean-Pierre van Riel (jpvr) wrote :

Changed bug to 'normal' as I'm not sure if and how this affects other laptop models with GM965/GL960. Haven't 100% ruled out if it's distro or BIOS specific, but do know Xorg.conf and UXA workaround show it's probably a regression in SNA code that triggers the issue.

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109159
Xorg.0.log with UXA option

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109160
dmesg before (success)

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109161
dmesg after (success)

Note, I don't have a valid dmesg after in the failure to resume scenario because a reboot is caused by the bug.

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109162
i915 module parameters

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109164
Cannot identify error in pm-suspend.log and think bug is triggered way before resume scripts run

When I first noticed the bug, I ran multiple tests and saw the following patterns in pm-suspend.log. Unfortunately, I was unable to catch any error here. I later modified the script with 'set -x' to try to catch which script in the resume process was causing the issue, but that was futile: the reboot occurs before the resume scripts are even triggered. So I think I can rule out any scripts called during resume as the cause.

# When it is able to resume #

Tue Sep 2 00:06:02 SAST 2014: Running hooks for suspend.
...
Tue Sep 2 00:06:02 SAST 2014: performing suspend
Tue Sep 2 00:06:20 SAST 2014: Awake.
Tue Sep 2 00:06:20 SAST 2014: Running hooks for resume
...
Running hook /usr/lib/pm-utils/sleep.d/000kernel-change resume suspend:
/usr/lib/pm-utils/sleep.d/000kernel-change resume suspend: success.

Tue Sep 2 00:06:20 SAST 2014: Finished.

# When it fails to resume #

Tue Sep 2 00:10:19 SAST 2014: Running hooks for suspend.
...
Tue Sep 2 00:10:19 SAST 2014: performing suspend
...
??? not followed by <date>: Awake. ???
...

In the failure case, we never get the "Awake." message. The next entry in the log belongs to the next suspend/resume test after rebooting, i.e. instead of seeing "Awake.", we see the start of the next test:
Tue Sep 2 00:11:34 SAST 2014: Running hooks for suspend.
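
The pattern above can be counted automatically over a whole pm-suspend.log. This is a minimal sketch that assumes the log lines end exactly as in the excerpts shown here ("performing suspend" and "Awake."). The function name count_resume_outcomes is hypothetical.

```python
from typing import Tuple


def count_resume_outcomes(log_text: str) -> Tuple[int, int]:
    """Return (resumed, failed) suspend attempts found in pm-suspend.log text.

    A 'performing suspend' line counts as failed when no 'Awake.' line
    appears before the next 'performing suspend' line or the end of the log.
    """
    resumed = 0
    failed = 0
    pending = False  # True between 'performing suspend' and 'Awake.'
    for raw in log_text.splitlines():
        line = raw.rstrip()
        if line.endswith(": performing suspend"):
            if pending:
                failed += 1  # previous suspend never logged 'Awake.'
            pending = True
        elif line.endswith(": Awake.") and pending:
            resumed += 1
            pending = False
    if pending:
        # The log can only be read on a running system, so a trailing
        # suspend without 'Awake.' is assumed to have ended in a reboot.
        failed += 1
    return resumed, failed


SAMPLE = """\
Tue Sep 2 00:06:02 SAST 2014: Running hooks for suspend.
Tue Sep 2 00:06:02 SAST 2014: performing suspend
Tue Sep 2 00:06:20 SAST 2014: Awake.
Tue Sep 2 00:06:20 SAST 2014: Running hooks for resume
Tue Sep 2 00:06:20 SAST 2014: Finished.
Tue Sep 2 00:10:19 SAST 2014: Running hooks for suspend.
Tue Sep 2 00:10:19 SAST 2014: performing suspend
Tue Sep 2 00:11:34 SAST 2014: Running hooks for suspend.
"""

if __name__ == "__main__":
    ok, bad = count_resume_outcomes(SAMPLE)
    print(f"resumed: {ok}, failed: {bad}")
```

Running it against the log under each AccelMethod setting would give comparable failure counts for SNA and UXA instead of an estimate.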

In , Jean-Pierre van Riel (jpvr) wrote :

Created attachment 109166
UXA fails too, but not as often as SNA

Sadly, it appears SNA isn't fully at fault. UXA just triggers the bug less frequently than SNA does. After about 10 successful resumes, I had the same failure with UXA configured.

Attached pm-suspend.log which I copied after the system rebooted when attempting to resume. As before, no resume scripts called (or at least no logs written to disc during resume process).

Jean-Pierre van Riel (jpvr) wrote :

Sadly, while UXA doesn't trigger the suspend resume bug as often as SNA, it still sometimes occurs. I've attempted to log a bug upstream related to the video driver package xf86-video-intel.

In , Chris Wilson (ickle) wrote :

*** This bug has been marked as a duplicate of bug 76554 ***

Jean-Pierre van Riel (jpvr) wrote :

Other similar bugs for this laptop were logged, but it is not clear whether they were resolved, as they either expired or were only triaged.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1278679
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1285938

Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Invalid
penalvch (penalvch)
no longer affects: ubuntu
Jean-Pierre van Riel (jpvr) wrote :

I've retested with the latest kernel on Mint 17.0, 3.13.0-46, and the resume failure still occurs, despite bug 76554 being marked as resolved and there being some signs that a fix may have been backported into the 3.13 kernel series for Ubuntu.

So if 76554's fix is in kernel 3.13.0-46, then 76554 hasn't resolved the bug I raised here.
