Lenovo T400: With external monitor the system is rarely working for more than 10 minutes

Bug #1116587 reported by Felix Möller on 2013-02-05
This bug affects 2 people
Affects                            Status        Assigned to
compiz (Ubuntu)                    Invalid
linux (Ubuntu)                     Fix Released  Timo Aaltonen
xserver-xorg-video-intel (Ubuntu)  Fix Released

Bug Description

I have a Lenovo ThinkPad T400 with an external monitor (1680x1050) connected.

Switching windows, surfing the web, and using the compiz effect that stretches windows when moving them against the border cause my system to freeze.

I can still move the mouse, however, nothing is clickable at all.

I had hoped that today's update to 2.21.0-0ubuntu1 would solve the issue, but the system has already frozen twice in 30 minutes. There is nothing in the logs AFAICS.

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: xserver-xorg-video-intel 2:2.21.0-0ubuntu1
ProcVersionSignature: Ubuntu 3.8.0-4.8-generic 3.8.0-rc6
Uname: Linux 3.8.0-4-generic x86_64

ApportVersion: 2.8-0ubuntu4
Architecture: amd64
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
Date: Tue Feb 5 20:22:28 2013
DistUpgraded: 2013-01-23 07:16:01,014 DEBUG enabling apt cron job
DistroCodename: raring
DistroVariant: ubuntu
ExtraDebuggingInterest: Yes
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Lenovo Device [17aa:20e4]
   Subsystem: Lenovo Device [17aa:20e4]
InstallationDate: Installed on 2012-03-31 (311 days ago)
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Beta amd64 (20120331)
MachineType: LENOVO 6474A46
MarkForUpload: True
 Socket 0:
   no product info available
 Socket 0:
   no card
 PATH=(custom, no user)
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-4-generic root=UUID=b32d85c9-d1fb-49ca-8c94-c64d321221b3 ro quiet splash vt.handoff=7
SourcePackage: xserver-xorg-video-intel
UpgradeStatus: Upgraded to raring on 2013-01-23 (13 days ago)
dmi.bios.date: 10/17/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 7UET94WW (3.24 )
dmi.board.name: 6474A46
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7UET94WW(3.24):bd10/17/2012:svnLENOVO:pn6474A46:pvrThinkPadT400:rvnLENOVO:rn6474A46:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 6474A46
dmi.product.version: ThinkPad T400
dmi.sys.vendor: LENOVO
version.compiz: compiz 1:0.9.9~daily13.02.04-0ubuntu1
version.ia32-libs: ia32-libs 20090808ubuntu36
version.libdrm2: libdrm2 2.4.41-0ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 9.0.2-0ubuntu1
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.0.2-0ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.13.2-0ubuntu1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.1.0-0ubuntu1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.21.0-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.6-0ubuntu2
xserver.bootTime: Tue Feb 5 20:17:49 2013
xserver.configfile: default

xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.13.2-0ubuntu1
xserver.video_driver: intel

Felix Möller (felix-derklecks) wrote :

So SNA is enabled by default now. Do the freezes still occur when switching back to UXA accel method?

Felix Möller (felix-derklecks) wrote :

Thanks for the hint.

I have added the following file:
# cat /usr/share/X11/xorg.conf.d/60-monitor.conf
Section "Device"
  Identifier "Device0"
  Driver "intel"
  Option "AccelMethod" "uxa"
EndSection

Will test tomorrow.

summary: - With external monitor the system is rarely working for more than 10
- minutes
+ Lenovo T400: With external monitor the system is rarely working for more
+ than 10 minutes
bugbot (bugbot) on 2013-02-07
tags: added: dual-head
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in compiz (Ubuntu):
status: New → Confirmed
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Confirmed
Felix Möller (felix-derklecks) wrote :

OK, after continued testing I have come to the conclusion that switching between UXA and SNA makes no difference here.

My system crashed 4 times in the last 30 minutes while running on UXA, but there is nothing in /sys/kernel/debug/dri/0/i915_error_state nor in /var/log/Xorg.0.log.

Any help?
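
For anyone repeating this check, a minimal sketch of testing whether the kernel has recorded a GPU hang. It assumes debugfs is mounted, that the file is read as root, and that the i915 driver prints "no error state collected" when nothing has been captured; the function name `gpu_hang_recorded` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the i915 error state file holds a captured
# GPU error, 1 if it reports that nothing was collected, 2 if it cannot be
# read (debugfs not mounted, or not running as root).
gpu_hang_recorded() {
    state=${1:-/sys/kernel/debug/dri/0/i915_error_state}
    [ -r "$state" ] || return 2
    # Assumption: an idle driver writes exactly this placeholder text.
    ! grep -Fq 'no error state collected' "$state"
}
```

Calling `gpu_hang_recorded && echo "GPU error captured"` right after a freeze, from a VT or an ssh session, would show whether there is anything worth attaching.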

Robert Hooker (sarvatt) wrote :

I'm not seeing any errors in the logs. Can you attach ~/.xsession-errors after reproducing it, in case it gives some info on why compiz hung?

Some notes from IRC, thanks for that!

<bryce> fm, you're right - nothing in your Xorg.0.log or dmesg. And if there's no i915_error_state file then it's sounding like it's not "just" a GPU lockup or xserver crash
<bryce> even possible it's not X locking up

<bryce> fm, check .xsession-errors, /var/crash for any .crash files, and /var/log/syslog
<fm> no apport popping up at reboot so no .crash, and syslog empty as well

<bryce> fm, is it exactly regular 10 min, or just more or less?
<fm> bryce, no just more or less

<bryce> fm, can you trace to when that first started and what you did leading up to it? or has it been like that since install?
<fm> bryce, happens since running 13.04, was not there with 12.10
<fm> right now i am on 2.21.2 or so
<bryce> fm, and it started right after upgrading?
<fm> bryce, yes happened right after upgrading

<bryce> fm, hmm well no idea what's wrong, but can throw out some advice for tracking it down more
<bryce> fm, I haven't run across reports quite like this, that sound like a GPU lockup but don't have evidence in the logs
<bryce> fm, so something must be a bit unique about your system. So you might doubly think about anything you've installed or done that other people wouldn't have done. or anything unusual about the hardware that others might not have.
<bryce> fm, fwiw, until just recently, this cycle for 13.04 we haven't changed much on X compared with 12.10, aside from targeted bug fixes. Same xserver, drivers, etc.
<bryce> fm, so issues that crop up on upgrade to 13.04 that didn't happen in 12.10, we often suggest looking first at the kernel or other things, before X, which did receive big bumps.
<bryce> so, like try installing and booting the quantal kernel, and see if you can rule that out

<bryce> and ssh in while it's frozen and look for any files changed in /var/log/*
<fm> what is the best way to get it?
<fm> bryce, yeah, can just go to ctrl-alt-1
<bryce> fm, think you can download the .deb off launchpad. kernel has no dependencies so pretty much grab from wherever's convenient.
<bryce> oho, if you can vt switch, that suggests not a gpu lockup
<bryce> fm, have you been able to just `sudo service lightdm restart` to get it working again? Or does it require a full reboot?
<bryce> fm, ok gotta go work on some other stuff, but good luck. If you can copy some of the stuff we discussed onto the bug, it'll help fill in whomever looks at it next.
<fm> bryce, will try lightdm restart

It again took just a few minutes to crash the system.

# sudo service lightdm restart
allows me to log in again.

.xsession-errors does not contain anything relevant.
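
The suggestion from IRC to look for files changed under /var/log while the session is frozen can be sketched roughly as below, run from a VT or over ssh. The function name `recent_logs` and the 15-minute default window are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that were
# modified within the last N minutes (default: /var/log, 15 minutes).
recent_logs() {
    dir=${1:-/var/log}
    minutes=${2:-15}
    # -mmin -N matches files modified less than N minutes ago.
    # Unreadable subdirectories are skipped silently.
    find "$dir" -type f -mmin "-$minutes" 2>/dev/null | sort
}
```

Running `recent_logs /var/log 5` shortly after a freeze narrows down which logs were written to around the time of the hang; the same call on /var/crash would show any fresh .crash files.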

Patrick Hetu (patrick-hetu) wrote :

I have the same bug on a Lenovo X61 with the latest raring updates:

  00:02.0 VGA compatible controller: Intel Corporation 82G965 Integrated Graphics Controller

I can also restart lightdm from vt and I've tried with older kernels (like linux-image-3.5.0-21-generic) but no luck.
It crashes every 1-3 hours on various actions like scrolling in chromium, virtual desktop switching, opening a terminal...
I'll see if I can find a pattern.

Patrick Hetu (patrick-hetu) wrote :

I'm also working on an external monitor.

One piece of information that might help anybody affected: using only the external monitor solves the issue for me.

My external is 1680x1050 while the internal one is 1280x800.

Chris Wilson (ickle) wrote :

One thing you can try is building the driver from git with debugging enabled:

$ sudo apt-get build-dep xserver-xorg-video-intel
$ git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel
$ cd xf86-video-intel
$ ./autogen.sh --prefix=/usr --with-default-accel=sna --enable-debug
$ make && sudo make install

Restart X and you should have a line in your Xorg.log saying
"(II) intel(0): SNA compiled with assertions enabled"

See if that catches anything - if you get spontaneous restarts make sure to attach the gdm.log in order to report the failed assertions.
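
A quick way to confirm that the debug build is the one X actually loaded is to search the log for the line quoted above. This is only a sketch; the function name `sna_debug_loaded` is hypothetical and the default log path is the one reported in this bug.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the given X log shows that the intel
# driver was built with SNA assertions enabled, non-zero otherwise.
sna_debug_loaded() {
    log=${1:-/var/log/Xorg.0.log}
    # -F: match a fixed string, -q: report via exit status only.
    grep -Fq 'SNA compiled with assertions enabled' "$log"
}
```

For example, `sna_debug_loaded || echo "still running the packaged driver"` after restarting X.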

Installed the driver. It crashed after a minute. However, nothing in the logs as far as I can see.

Robert Hooker (sarvatt) wrote :

The info will be in /var/log/lightdm/x-0.log, if you can attach that, Felix.

Robert Hooker (sarvatt) wrote :

When it crashes, switch to a vt, sudo cp /var/log/lightdm/x-0.log{,.bak} so it doesn't get overwritten, and attach /var/log/lightdm/x-0.log.bak please :)
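
If the hang has to be reproduced several times, a timestamped copy avoids overwriting an earlier backup. A minimal sketch, with a hypothetical function name and a suffix format picked for illustration; copying from /var/log/lightdm needs root.

```shell
#!/bin/sh
# Hypothetical helper: copy a log to a timestamped sibling file so the next
# X start cannot overwrite it, and print the path of the copy.
preserve_log() {
    src=${1:-/var/log/lightdm/x-0.log}
    dest="$src.$(date +%Y%m%d-%H%M%S).bak"
    # -p keeps the original modification time on the copy.
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

From a VT after a freeze: `sudo sh -c '. ./preserve_log.sh; preserve_log'` (the script file name here is likewise just an example).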

Chris Wilson (ickle) wrote :

A simple thing to test would be

cat > /etc/X11/xorg.conf.d/intel.conf <<EOF
Section "Device"
  Identifier "Device0"
  Driver "intel"
  Option "TripleBuffer" "false"
EndSection
EOF

and see if the hangs still occur.

Patrick Hetu (patrick-hetu) wrote :

Chris, I've tried your previous configuration and X froze right when I started chromium.
Usually it takes about 3-4 hours.

Scrolling http://planet.gnome.org/ as fast as I can -- I have a logitech mouse with wheel with no friction.

The system crashes within 45 seconds when disabling "TripleBuffer" for me.

Chris Wilson (ickle) wrote :

Wow. Thanks for testing that, that rules out an easy workaround and leaves me a little more puzzled. Yet another pageflip race in the kernel still seems the most likely explanation.

What can I do to confirm or rule out such a page flip bug? There is nothing in dmesg...

Might this be related to bug #1097315? I got the updated kernel 20 minutes ago and so far my system has not become inaccessible....

Ok, this is just an improvement but no fix. The bug occurred after 15 minutes of heavy usage.

Testing with kernel 3.5.0-22-generic now, will report back later

I have now tested 3.5.0-22-generic for the whole evening and it has not crashed so far. Thus I expect the kernel to be involved in this issue.

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Chris Wilson (ickle) wrote :

Another test you can try with the 3.8 kernel is:

Option "SwapbuffersWait" "false"

which will disable pageflipping (as well as disabling vsync) altogether.
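
For anyone repeating this test, the option has to sit inside a complete Device section. The sketch below writes such a file; the file name `20-intel-noflip.conf` and the function name are assumptions, and the Identifier simply reuses "Device0" from the earlier comments. Writing under /etc needs root.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal Device section that turns off
# SwapbuffersWait for the intel driver.
write_noflip_conf() {
    conf=${1:-/etc/X11/xorg.conf.d/20-intel-noflip.conf}
    # The xorg.conf.d directory may not exist yet on a stock install.
    mkdir -p "$(dirname "$conf")" || return 1
    cat > "$conf" <<'EOF'
Section "Device"
  Identifier "Device0"
  Driver "intel"
  Option "SwapbuffersWait" "false"
EndSection
EOF
}
```

X has to be restarted afterwards (for example with `sudo service lightdm restart`) for the option to take effect.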

Patrick Hetu (patrick-hetu) wrote :

Chris, so far disabling SwapbuffersWait fixes it for me.
Also, I have a really old monitor; could it be the cause of the crash?

I think disabling SwapbuffersWait fixes it for me as well. So far it has not crashed for a day of light usage... What can we do to narrow down the problem?

Chris Wilson (ickle) wrote :

That confirms that pageflipping is the direct cause of the hangs. Very recently suspicion has been raised that the gen4 pageflipping irq handling may be racy - which would help explain why all the reports of this hang I've seen so far have been gen4.

Chris Wilson (ickle) wrote :

I think the fixed version of https://patchwork.kernel.org/patch/2159201/ will be interesting to test for this case.

Chris Wilson (ickle) wrote :

Ok, the patch has landed:

commit 21ad833075801a7cd81b5ef1604ffc6c600e5ff9
Author: Ville Syrjälä <email address hidden>
Date: Tue Feb 19 15:16:39 2013 +0200

    drm/i915: Fix races in gen4 page flip interrupt handling

Can you please try drm-intel-experimental (from the mainline ppa) when it is updated, or drm-intel-next directly?

Chris Wilson (ickle) on 2013-02-22
Changed in compiz (Ubuntu):
status: Confirmed → Invalid
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed

Sorry, I was occupied for the last few weeks. Has this gone into raring-proposed? What exactly do I have to install to test it?

Chris Wilson (ickle) wrote :

I don't think so, maybe I can get one of the Ubuntu guys to chase up... For the time being, you can find a kernel packaged here http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/ with the fixes.

This seems to be fixed with the drm-intel-nightly kernel! Thanks!

Timo Aaltonen (tjaalton) on 2013-03-12
Changed in linux (Ubuntu):
assignee: nobody → Timo Aaltonen (tjaalton)
status: Fix Committed → In Progress
Tim Gardner (timg-tpi) on 2013-03-18
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.8.0-13.23

linux (3.8.0-13.23) raring; urgency=low

  [ Upstream Kernel Changes ]

  * Revert "drm/i915: enable irqs earlier when resuming"
    - LP: #1156310
  * Revert "drm/i915: reorder setup sequence to have irqs for output setup"
    - LP: #1156310
  * x86/apic: Remove noisy zero-mask warning from
    - LP: #1100202
  * drm/i915: Fix races in gen4 page flip interrupt handling
    - LP: #1116587
  * drm/i915: Revert hdmi HDP pin checks
    - LP: #1135668
  * signal: always clear sa_restorer on execve
    - LP: #1153813
    - CVE-2013-0914
 -- Tim Gardner <email address hidden> Mon, 18 Mar 2013 10:04:33 -0600

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
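
Whether a running raring kernel is at least as new as the fixed package can be estimated from its release string. This is a rough sketch only: it assumes the usual Ubuntu `uname -r` format (for example `3.8.0-13-generic`) and GNU `sort -V`, the function name is hypothetical, and mainline or PPA builds with different numbering are not handled.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the kernel release is 3.8.0-13 or newer,
# i.e. at least the ABI of the linux 3.8.0-13.23 upload mentioned above.
# It only compares version numbers; it does not inspect the kernel itself.
kernel_at_least_fixed() {
    release=${1:-$(uname -r)}
    min=3.8.0-13
    base=${release%-*}    # drop the trailing flavour, e.g. "-generic"
    # Version-sort both strings; the minimum must come out first (or equal).
    lowest=$(printf '%s\n%s\n' "$min" "$base" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}
```

`kernel_at_least_fixed || echo "kernel predates 3.8.0-13"` would flag, for instance, the 3.8.0-4-generic kernel from the original report.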
Chris Wilson (ickle) on 2013-03-20
Changed in xserver-xorg-video-intel (Ubuntu):
status: Fix Committed → Fix Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
