[ffe] unplugging an external monitor from laptop results in corrupted screen. Logging out fixes it.

Bug #1157678 reported by Christopher Barrington-Leigh on 2013-03-20
This bug affects 12 people
Affects: xf86-video-intel
  Status: Fix Released
  Importance: Medium
Affects: xserver-xorg-video-intel (Ubuntu)
  Importance: Medium
  Assigned to: Chris J Arges

Bug Description

[Impact]
With the updated 13.04 beta, unplugging an external monitor from a laptop results in a corrupted screen. Logging out fixes it.
A screen capture looks normal (not like what the display is showing!).

Plugging an external monitor in does not produce any similar problem.

[Test Case]
1. On affected hardware, boot with an external monitor connected
2. Unplug the external monitor

Expected behavior: Normal screen
Actual behavior: Screen corruption

[Development Fix]
The fix for this issue is a cherry-pick of an upstream patch, with a minor change to make it apply to raring. (See comment #56)

The upstream patch is rather large (~228 lines), but half of that is simple code refactoring (moving some common code into functions, moving some functions earlier in their files so they can be referenced in other routines, etc.) For review purposes, a cleaned up (uncleaned up?) version of the patch is available in comment #59, which shows just the functional changes to the code.

[Regression Potential]
The patch originates from an upstream change that has not been part of a release yet, but has gone through upstream review and testing.

This patch primarily *adds* code, and most of that is cleanup code to clear graphics memory and free pointers, which should be quite safe. However, cleanup code can sometimes expose pre-existing bugs (e.g. freeing invalid pointers that otherwise would have been ignored). We don't expect that to happen, though.

A few while loops are added. It does not look like the code could get stuck in those loops, though.

Types of symptoms to look for include X server crashes, gpu lockups, and screen corruption, particularly associated with screen initialization, destruction, or resizing.

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: xorg 1:7.7+1ubuntu4
ProcVersionSignature: Ubuntu 3.8.0-13.23-generic 3.8.3
Uname: Linux 3.8.0-13-generic x86_64
.tmp.unity.support.test.0:

ApportVersion: 2.9.2-0ubuntu1
Architecture: amd64
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
CompositorUnredirectDriverBlacklist: '(nouveau|Intel).*Mesa 8.0'
CompositorUnredirectFSW: true
Date: Wed Mar 20 07:45:34 2013
DistUpgraded: Fresh install
DistroCodename: raring
DistroVariant: ubuntu
DkmsStatus: virtualbox, 4.2.8, 3.8.0-13-generic, x86_64: installed
ExtraDebuggingInterest: Yes, if not too technical
GraphicsCard:
 Intel Corporation Core Processor Integrated Graphics Controller [8086:0046] (rev 02) (prog-if 00 [VGA controller])
   Subsystem: Lenovo Device [17aa:21c1]
InstallationDate: Installed on 2013-03-18 (1 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Alpha amd64 (20130318)
MachineType: LENOVO 2901CTO
MarkForUpload: True
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-13-generic root=UUID=fcf8343d-2c1d-4f25-ab52-876891045ee9 ro quiet splash vt.handoff=7
SourcePackage: xorg
Symptom: display
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/25/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 6UET69WW (1.49 )
dmi.board.name: 2901CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6UET69WW(1.49):bd04/25/2012:svnLENOVO:pn2901CTO:pvrThinkPadT410s:rvnLENOVO:rn2901CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 2901CTO
dmi.product.version: ThinkPad T410s
dmi.sys.vendor: LENOVO
version.compiz: compiz 1:0.9.9~daily13.03.08-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.42-0ubuntu2
version.libgl1-mesa-dri: libgl1-mesa-dri 9.0.3-0ubuntu1
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.0.3-0ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.13.3-0ubuntu2b1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu2b2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.1.0-0ubuntu1b1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.21.4-0ubuntu1b1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.6-0ubuntu3b1
xserver.bootTime: Wed Mar 20 07:41:30 2013
xserver.configfile: default
xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.13.3-0ubuntu2b1
xserver.video_driver: intel

Created attachment 70128
broken background picture

When I suspend an undocked T61 with just LCD screen
and I make resume in docking station with attached 2nd
display over standard VGA docking port - I get
damaged background image.

Attaching bad picture grabbed after resume,
and also how it should actually look.

hw: T61, gma965
intel git: d2897cb0136ffec83365c7530ed544b562cac478
xorg-x11-server-Xorg-1.13.0-7.fc19.x86_64

Created attachment 70129
correct background image

$ xrandr
Screen 0: minimum 320 x 200, current 2960 x 1050, maximum 32767 x 32767
LVDS1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 331mm x 207mm
   1680x1050 60.0*+ 50.0
   1400x1050 60.0
   1280x1024 60.0
   1280x960 60.0
   1024x768 60.0
   800x600 60.3 56.2
   640x480 59.9
VGA1 connected 1280x1024+1680+0 (normal left inverted right x axis y axis) 338mm x 270mm
   1280x1024 60.0*+ 75.0
   1280x960 60.0
   1280x800 74.9 59.8
   1152x864 75.0
   1280x768 74.9 59.9
   1024x768 75.1 70.1 60.0
   1024x576 60.0
   800x600 72.2 75.0 60.3 56.2
   848x480 60.0
   640x480 72.8 75.0 60.0
   720x400 70.1

To get the correct background picture, usually only a refresh event is required, e.g. moving some window over the background.
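For anyone who wants to check the layout in the `xrandr` listing above from a script, here is a minimal Python sketch. It assumes the output format shown in this report; the function name `connected_outputs` and the `SAMPLE` text (an excerpt of the listing) are illustrative only, not part of any tool.

```python
import re

# Matches lines such as
#   "LVDS1 connected 1680x1050+0+0 (normal left inverted ...) 331mm x 207mm"
# The geometry is optional: a connected but disabled output has none.
OUTPUT_RE = re.compile(
    r"^(?P<name>\S+) connected(?: primary)?"
    r"(?: (?P<w>\d+)x(?P<h>\d+)\+(?P<x>\d+)\+(?P<y>\d+))?"
)

def connected_outputs(xrandr_text):
    """Return {name: (width, height, x, y) or None} for connected outputs."""
    outputs = {}
    for line in xrandr_text.splitlines():
        m = OUTPUT_RE.match(line)
        if not m:
            continue  # screen header, mode lines, disconnected outputs
        if m.group("w") is None:
            outputs[m.group("name")] = None
        else:
            outputs[m.group("name")] = tuple(
                int(m.group(k)) for k in ("w", "h", "x", "y")
            )
    return outputs

# Excerpt of the listing from this report.
SAMPLE = """\
Screen 0: minimum 320 x 200, current 2960 x 1050, maximum 32767 x 32767
LVDS1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 331mm x 207mm
   1680x1050 60.0*+ 50.0
VGA1 connected 1280x1024+1680+0 (normal left inverted right x axis y axis) 338mm x 270mm
   1280x1024 60.0*+ 75.0
"""

layout = connected_outputs(SAMPLE)
```

With the two outputs side by side, the widths 1680 and 1280 add up to the 2960 reported on the `Screen 0` line.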

(In reply to comment #0)
> When I suspend an undocked T61 with just LCD screen
> and I make resume in docking station with attached 2nd
> display over standard VGA docking port - I get
> damaged background image.

Just so that I am clear on how to reproduce:

suspend; plug VGA in; resume

Where do you see the damage? On the LVDS, on the VGA or both?

(In reply to comment #2)
> (In reply to comment #0)
> > When I suspend an undocked T61 with just LCD screen
> > and I make resume in docking station with attached 2nd
> > display over standard VGA docking port - I get
> > damaged background image.
>
> Just so that I am clear on how to reproduce:
>
> suspend; plug VGA in; resume
>
> Where do you see the damage? On the LVDS, on the VGA or both?

I start gnome Xsession in undocked configuration
(Screen 0: minimum 320 x 200, current 1680 x 1050, maximum 32767 x 32767)

then suspend
dock laptop (docking has attached VGA monitor)
and resume laptop

both screens are showing damaged background picture
(haven't tried to reproduce without docking station actually)

One thing to also check for is a GPU hang. Can you have a quick look in dmesg/Xorg.log for the warning?

Just tried to reproduce this on my i965gm... It seems that the HWS page was not restored... Instant death upon resume.

(In reply to comment #4)
> One thing to also check for is a GPU hang. Can you have a quick look in
> dmesg/Xorg.log for the warning?
>
> Just tried to reproduce this on my i965gm... It seems that the HWS page was
> not restored... Instant death upon resume.

Speaking of hangs: yes, I've recently been noticing significant problems with suspend/resume, but this scenario usually works:

suspend undocked at home / resume docked at work / suspend docked at work / resume undocked at home
(with about one failure per week; see my recent lkml post:
https://lkml.org/lkml/2012/11/15/369
unsure whether it's related or not)

However if I try multiple suspend/resume in dock - it doesn't take long
to experience black-screen death.

Unfortunately also my 'serial line' debugging now fails since after resume I get just garbage on my serial line. But currently I do not have much free time to play with all this.

So my 'resume' deadlock has been fully resolved here:

https://bugzilla.kernel.org/show_bug.cgi?id=51071

so the problem with broken 'background' is left ;)

I should probably also aim for fixing the serial line problem.

Haven't had this happen to me yet. Is it still a regular occurrence for you?

Yes, it still happens to me all the time with the current gnome2 environment I'm using on my rawhide.

To clarify: Is this with suspend-to-mem or hibernate-to-disk?

(In reply to comment #9)
> To clarify: Is this with suspend-to-mem or hibernate-to-disk?

Suspend to ram - I'm not using hibernation

Hmm, I've seen that pattern when I kill nautilus. Coincidence? Unlikely.

And it's gone...

Now that I try to actually reproduce that pattern with nautilus it just works. So perhaps it is related to the recent bug fixes - except that your report is older than that chunk of code. I think my nautilus reproducer was pure coincidence.

Hi,

I have a similar issue, without the suspend/resume cycle on gen4.

For me it is sufficient to attach an external screen with display port and

xrandr --output DP1 --auto --primary --output LVDS1 --off

This is 100% reproducible.

Curiously enough, if I do

xrandr --output DP1 --primary --auto --output LVDS1 --auto --right-of DP1

and then

xrandr --output DP1 --auto --primary --output LVDS1 --off

things are fine.

Please see the attachment for an illustration of the issue.
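The two `xrandr` sequences described above could be scripted roughly as follows. This is only a sketch: the output names `DP1` and `LVDS1` are the ones from this comment and will differ on other machines, and the helper names are made up for illustration. Nothing is executed unless `dry_run=False` is passed.

```python
import subprocess

def single_head_cmd(external="DP1", internal="LVDS1"):
    """xrandr call that leaves only the external output active (the trigger)."""
    return ["xrandr", "--output", external, "--auto", "--primary",
            "--output", internal, "--off"]

def extended_cmd(external="DP1", internal="LVDS1"):
    """xrandr call that enables both outputs, internal to the right of external."""
    return ["xrandr", "--output", external, "--primary", "--auto",
            "--output", internal, "--auto", "--right-of", external]

def run_sequence(commands, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return len(commands)

# Going through the extended layout first was reported to avoid the corruption.
workaround = [extended_cmd(), single_head_cmd()]
```

Calling `run_sequence([single_head_cmd()], dry_run=False)` would issue the command reported as the 100% reproducer; `run_sequence(workaround, dry_run=False)` would issue the order that was reported as fine.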

Created attachment 75673
Corrupted display

This is my rendering issue. It does not only involve the 'background' but every element on the screen. When in this state, even new windows do not show up correctly.

Timo Aaltonen (tjaalton) on 2013-03-21
affects: xorg (Ubuntu) → xserver-xorg-video-intel (Ubuntu)
Chris Wilson (ickle) wrote :

Can you please attach a photo of the screen corruption?

bugbot (bugbot) on 2013-03-21
tags: added: dual-head
Timo Aaltonen (tjaalton) on 2013-03-21
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete

Plugging the monitor back in now returns the screen to normal.
I'm attaching video of the corruption.

Chris Wilson (ickle) wrote :

Fixed by upstream

commit 4878cae22a2405b6d33318e2dc99a9c1367fee44
Author: Ville Syrjälä <email address hidden>
Date: Mon Feb 18 19:08:48 2013 +0200

    drm/i915: Really wait for pending flips when panning

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → In Progress
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Fix Released

*** Bug 62724 has been marked as a duplicate of this bug. ***

On Tue, Mar 26, 2013 at 12:16:54PM -0000, Chris J Arges wrote:
> Hitting this bug.
> Building a test package with commit from here:
> http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-next&id=4878cae22a2405b6d33318e2dc99a9c1367fee44

Chris, if you can make that test package available I'd be happy to give it a
whirl.

Fwiw, I seem to be able to work around this by locking my screen first, and then unplugging it. Need to unlock afterwards, obviously, but thought it might save people a few mins of resetting things...

Tom Haddon: This does not work for me. I lock screen (ctrl-alt-l), unplug monitor. Then screen corruption is visible. I put in password, and get back to corrupted desktop on one screen.
Plugging in the external monitor then eliminates the corruption.

Chris J Arges (arges) on 2013-04-02
Changed in xserver-xorg-video-intel (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium

With external monitor plugged in, if I use the monitors control to turn off the external (rather than unplugging it), I get the same bug.

On Tue, Apr 02, 2013 at 05:53:51PM -0000, Christopher Barrington-Leigh wrote:
> With external monitor plugged in, if I use the monitors control to turn
> off the external (rather than unplugging it), I get the same bug.

I would add that the problem for me is not specific to changing the monitor
outputs. Changing the resolution of the LVDS1 output (as I've been doing
lately to try to develop a bugfix for compiz window placement) also triggers
the same kind of corruption.

Chris J Arges (arges) wrote :

I've tested this build with the following on my Thinkpad T420:
1) Plugged in VGA cable to external monitor, waited for the monitors to be set to the proper resolution.
2) Unplugged VGA cable, waited for LVDS1 to be set to proper resolution.
3) Repeated 1/2 a few times.
4) Suspended and resumed.
5) Repeated 1/2 a few times.
This works for me. If somebody else can confirm, I'll get this applied to raring, and SRUed to other series.

Steve Langasek (vorlon) wrote :

$ uname -a
Linux virgil 3.8.0-16-generic #26~lp1157678v20130402 SMP Tue Apr 2 12:55:49 CDT 2013 x86_64 x86_64 x86_64 GNU/Linux

I'm sorry to report that this doesn't fix the problem for me. I still usually see corruption, as in the attached photo, on one or the other of the outputs when I change the monitor configuration. Sometimes cycling through the configs clears it; other times it does not.

Steve Langasek (vorlon) wrote :

(please be advised if you are viewing the provided image in a browser that it may be rotated 180 degrees. Phone cameras are not designed for left-handed people, and not all image viewers are smart enough to cope with the exif metadata. :P)

Chris Wilson (ickle) wrote :

If you could double check the kernel you tested had the patch, that would be fantastic. And you can install drm-intel-nightly from ppa:mainline to confirm the bug is fixed somewhere between 3.8 and now.

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu laptop testing tracker.

A list of all reports related to this bug can be found here:
http://laptop.qa.ubuntu.com/qatracker/reports/bugs/1157678

tags: added: laptop-testing
Steve Langasek (vorlon) wrote :

Chris, I have no way to double-check that the kernel I tested had the patch, I only have the binary build that Chris Arges posted so I have to take his word for it.

I can certainly give the drm-intel-nightly kernel a try. Can you confirm that http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/ is the kernel you mean?

Chris J Arges (arges) wrote :

@vorlon
Yes, the binary build I provided had 4878cae22a2405b6d33318e2dc99a9c1367fee44 cleanly cherry picked.
From the uname you had the kernel I built installed.

@ickle
Would there be other patches that would also be required, or should be included to be backported to 3.8?

I tested the mainline kernel back on 3/26, and it didn't resolve the issue for me. But that particular mainline build didn't include the above fix, which I had to pull from the drm-intel git tree.

I'll try the drm intel nightly as well.

Chris Wilson (ickle) wrote :

There were quite a few pageflip-related fixes; the one I picked out was the one found by my reverse bisection.

Chris J Arges (arges) wrote :

@vorlon,
Not sure what exactly changed but I have been unable to reproduce with 3.8.0-17.27. But this really doesn't make sense because the above patch was not applied to this release.

I plugged in an external monitor and unplugged it 10 times, and suspended and repeated and was unable to get the original graphic glitching. I'll continue to check this.

Steve Langasek (vorlon) wrote :

@ickle, I've just reproduced the issue with the drm-intel-next build from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2013-04-06-raring/:

$ uname -a
Linux virgil 3.9.0-994-generic #201304060409 SMP Sat Apr 6 08:10:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$

So this problem doesn't appear to be fixed.

Chris J Arges (arges) wrote :

I can reproduce this issue using:
    xrandr --output DP1 --auto --primary --output LVDS1 --off
With 3.8.0-17-generic and drm-intel-next (3.9.0-997) after plugging in an external monitor.

Perhaps this is a different issue than the one reported on freedesktop.

Created attachment 77747
Glitchy Output

If I attach an external monitor to my Lenovo Thinkpad T420 and type the following command:
    xrandr --output VGA1 --auto --primary --output LVDS1 --off
I get glitchy output as shown in the attached picture.

I have tested with a pre-built Ubuntu kernel here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/current/
This should be commit bae3699182027525d92b97d904578a533264b242 from the drm-intel tree next branch.

The related launchpad bug is here:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1157678

While https://bugs.freedesktop.org/show_bug.cgi?id=57160 is linked in the bug, I believe this is another issue. I've tested with a kernel that contains the patch and it did not fix the issue.

My graphics card is:
2nd Generation Core Processor Family Integrated Graphics Controller [8086:116]

Ah, a victim for testing a few patches of mine!

Please test the pll-limits-mess branch from http://cgit.freedesktop.org/~danvet/drm

Daniel,
I've tested with the pll-limits-mess branch, and it resulted in the same output.

Steve Langasek (vorlon) wrote :

As Chris Arges and I are both still seeing this symptom with all the kernels, and Christopher (the original bug submitter) hasn't commented on whether he still sees the problem, I'm going to un-dupe bug #1155838 from this one.

Yes, I still see it on the latest main-stream updates for 13.04. I haven't tried a special PPA.
Thanks,
c

Can you please attach a drm.debug=6 dmesg and Xorg.log from the latest run? Also the output of intel_reg_dumper would be useful.

Created attachment 77782
dmesg (drm.debug=6)

Created attachment 77783
Xorg.0.log

Created attachment 77784
intel_reg_dump

And the other one is 'cat /sys/kernel/debug/dri/0/i915_gem_framebuffers'

Created attachment 77787
i915_gem_framebuffer
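The files requested above can be gathered into one directory before attaching them. A minimal Python sketch follows, assuming the paths quoted in this thread; `collect_files` is an illustrative helper, not an existing tool, and the debugfs file normally needs root to read. The dmesg and intel_reg_dumper output are not covered here because they come from commands rather than files.

```python
from pathlib import Path

# Paths quoted in this thread; adjust for the system at hand.
DEFAULT_SOURCES = [
    "/var/log/Xorg.0.log",
    "/sys/kernel/debug/dri/0/i915_gem_framebuffers",
]

def collect_files(sources, dest_dir):
    """Copy each readable source into dest_dir.

    Returns (copied_names, skipped_paths); unreadable or missing
    files are skipped rather than treated as fatal.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for src in map(Path, sources):
        try:
            # Read then write: debugfs files report a size of 0,
            # so reading the content explicitly is the safer route.
            (dest / src.name).write_bytes(src.read_bytes())
            copied.append(src.name)
        except OSError:
            skipped.append(str(src))
    return copied, skipped
```

For example, `collect_files(DEFAULT_SOURCES, "bug-1157678-logs")` returns which files were copied and which could not be read.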

So what I thought might have been happening was that we reused an incorrectly sized cached framebuffer. Not so sure after reading i915_gem_framebuffer. However, I've implemented a defense against using the wrong framebuffer size:

commit 9dae6f9f1f169c228929185a8bd94e82afe92574
Author: Chris Wilson <email address hidden>
Date: Fri Apr 12 11:01:08 2013 +0100

    sna: Flush the scanout cache after resizing the display

    And ensure that any new scanout allocations make the requested size.

It will be worth updating xf86-video-intel.git and seeing if that makes any difference (will be in ppa:xorg-edgers in a few hours).
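To illustrate the idea behind that commit message (flush cached scanout buffers on resize, and only hand out buffers of the requested size), here is a toy Python model. It is purely hypothetical: the class and method names are invented for this sketch and it does not reflect the actual xf86-video-intel code.

```python
class ScanoutCache:
    """Toy model of a cache of reusable scanout buffers (sizes only)."""

    def __init__(self):
        self._free = []  # (width, height) of buffers available for reuse

    def release(self, width, height):
        """Return a buffer of the given size to the cache."""
        self._free.append((width, height))

    def acquire(self, width, height):
        """Return ((width, height), reused).

        A cached buffer is reused only on an exact size match;
        otherwise a fresh allocation of the requested size is modelled.
        """
        wanted = (width, height)
        if wanted in self._free:
            self._free.remove(wanted)
            return wanted, True
        return wanted, False

    def flush(self):
        """Drop all cached buffers, e.g. after the display is resized."""
        dropped = len(self._free)
        self._free.clear()
        return dropped
```

In this model, a 2960x1050 buffer left over from the dual-head layout is never handed out for a 1680x1050 request, and calling `flush()` after the resize discards it entirely.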

Changed in xserver-xorg-video-intel:
importance: Medium → Unknown
status: Fix Released → Unknown
Chris J Arges (arges) wrote :

Building and installing the latest xf86-video-intel.git, with HEAD at http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=f0b6ae2cfb811a8c234634c878800ca1fb95703f, seems to resolve the issue.

I've tested the original case of unplugging and plugging in an external monitor, suspending and resuming, and also the xrandr trigger. So far no screen corruption.

Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Incomplete
Chris J Arges (arges) wrote :

This patch is a cherry pick of 9dae6f9f1f169c228929185a8bd94e82afe92574.
There was only a minor conflict, and it has been fixed. To build this, you must also have resolved the issue with including valgrind.h.

summary: - unplugging an external monitor from laptop results in corrupted screen.
- Logging out fixes it.
+ [ffe] unplugging an external monitor from laptop results in corrupted
+ screen. Logging out fixes it.

The attachment "0001-sna-Flush-the-scanout-cache-after-resizing-the-displ.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Chris Wilson (ickle) on 2013-04-16
Changed in xserver-xorg-video-intel (Ubuntu):
status: In Progress → Fix Committed
Changed in xserver-xorg-video-intel:
status: Incomplete → Fix Released
Timo Aaltonen (tjaalton) wrote :

I happened to reproduce this myself while testing another bug, so if you don't mind I'll just steal the bug :)

committed the patch to my local git tree, will push later today.

Changed in xserver-xorg-video-intel (Ubuntu):
assignee: Chris J Arges (arges) → Timo Aaltonen (tjaalton)
Chris J Arges (arges) on 2013-04-19
Changed in xserver-xorg-video-intel (Ubuntu):
assignee: Timo Aaltonen (tjaalton) → Chris J Arges (arges)
Bryce Harrington (bryce) wrote :

The original patch is large due to some code refactoring. Here is a more minimal patch that just shows the functional changes without the refactoring.

You can see the patch:
  a) Adds a routine to check the scanout size
  b) Adds code to cleanup the large_inactive and scanout lists in a couple places
  c) Loops the check for pending events

Bryce Harrington (bryce) on 2013-04-19
description: updated
Scott Kitterman (kitterman) wrote :

I just accepted xserver-xorg-video-intel, but put a britney block in place for it. It looks like a good fix, but not one that is essential for the release. If there's no opportunity to copy it in, then it can be an SRU.

Scott Kitterman (kitterman) wrote :

Now that the package is built in raring-proposed, could someone who is experiencing the problem verify that the update solves the problem (like we would do for an SRU)?

Chris J Arges (arges) wrote :

I've installed the package and it fixes the issue! I used the original xrandr command I used to reproduce the issue as well as plugged in and out the external VGA connector. I also suspended my laptop and repeated the tests.

One note: the changelog should say [Chris Wilson] instead of my name, since he's the author of the patch. I merely produced the initial backport, and Bryce cleaned it up and produced the final backport.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xserver-xorg-video-intel - 2:2.21.6-0ubuntu4

---------------
xserver-xorg-video-intel (2:2.21.6-0ubuntu4) raring-proposed; urgency=low

  [Chris Arges]
  * Add sna-flush-scanout-cache-after-resizing.patch: Flush the scanout
    cache after resizing the display. Fixes a problem that occurs
    e.g. when unplugging an external display, suspend/resume, etc. by
    ensuring the scanout cache is properly sized.
    (LP: #1157678)
 -- Bryce Harrington <email address hidden> Fri, 19 Apr 2013 11:12:14 -0700

Changed in xserver-xorg-video-intel (Ubuntu):
status: Fix Committed → Fix Released

It arrived in raring updates and works on my system. Thank you.

Raphaël Badin (rvb) wrote :

> It arrived in raring updates and works on my system. Thank you.

Same here, thanks!

Martin Vysny (vyzivus) wrote :

I confirm that the patch fixes the issue for me. Thanks!
