Disabling an output can cause vblank events to be missed

Bug #740126 reported by Chris Coulson on 2011-03-22
This bug affects 110 people
Affects                Status        Importance  Assigned to                Milestone
Unity                  Fix Released  High        Chris Halse Rogers         -
compiz (Ubuntu)        -             High        Chris Halse Rogers         -
compiz (Ubuntu Natty)  -             High        Chris Halse Rogers         -
linux (Ubuntu)         -             High        Canonical Kernel Team      -
linux (Ubuntu Natty)   -             High        Canonical Kernel SRU Team  -
unity (Ubuntu)         -             High        Unassigned                 -
unity (Ubuntu Natty)   -             High        Unassigned                 -

Bug Description

When an output gets disabled - by being switched off by DPMS, unplugged, etc - it's possible for applications to have pending vblank events waiting on this output. If this occurs, the application will never receive the vblank event. This manifests as the app appearing to hang (in poll() on the X connection, if you attach GDB).

Binary package hint: compiz

This seems to happen when I'm not at my laptop (i.e., usually when the screensaver has activated). When I return to my laptop, I find it in a state which appears to be completely frozen (i.e., nothing gets repainted on the screen). However, I can switch to a console and things appear to be working normally. I can send a SIGKILL to compiz and start another WM, and then everything starts working normally again (although if I restart compiz, it often crashes until I've restarted my laptop).

If I attach gdb to the hung compiz, I always see a trace which looks like this:

0x00007f53f3dc8e33 in __poll (fds=<value optimised out>, nfds=<value optimised out>, timeout=<value optimised out>) at ../sysdeps/unix/sysv/linux/poll.c:87
87 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
 in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt full
#0 0x00007f53f3dc8e33 in __poll (fds=<value optimised out>, nfds=<value optimised out>, timeout=<value optimised out>) at ../sysdeps/unix/sysv/linux/poll.c:87
        resultvar = 18446744073709551100
        oldtype = 0
        result = <value optimised out>
#1 0x00007f53f738d512 in _xcb_conn_wait (c=0x1b79060, cond=<value optimised out>, vector=0x0, count=0x0) at ../../src/xcb_conn.c:313
        ret = <value optimised out>
        fd = {fd = 3, events = 1, revents = 0}
#2 0x00007f53f738eb3f in xcb_wait_for_reply (c=0x1b79060, request=390272, e=0x7fff56067648) at ../../src/xcb_in.c:378
        cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0},
          __size = '\000' <repeats 47 times>, __align = 0}
        reader = {request = 390272, data = 0x7fff560675a0, next = 0x0}
        prev_reader = <value optimised out>
        widened_request = <value optimised out>
        ret = 0x0
#3 0x00007f53f75e188d in _XReply (dpy=0x1b77e10, rep=0x7fff560676a0, extra=0, discard=0) at ../../src/xcb_io.c:533
        req = 0x1b74780
        response = <value optimised out>
        error = 0x0
        c = 0x1b79060
        reply = <value optimised out>
        current = 0x1b74780
        __PRETTY_FUNCTION__ = "_XReply"
#4 0x00007f53efb6993b in DRI2WaitMSC (dpy=0x1b77e10, drawable=100, target_msc=0, divisor=2, remainder=1, ust=0x7fff56067728, msc=0x7fff56067720, sbc=0x7fff56067718) at dri2.c:616
        info = <value optimised out>
        req = <value optimised out>
        rep = {type = 64 '@', pad1 = 119 'w', sequenceNumber = 22022, length = 32767, ust_hi = 1443264328, ust_lo = 32767, msc_hi = 100, msc_lo = 0, sbc_hi = 2137562112,
          sbc_lo = 3119085452}
#5 0x00007f53efb67df1 in dri2WaitForMSC (pdraw=<value optimised out>, target_msc=<value optimised out>, divisor=<value optimised out>, remainder=<value optimised out>,
    ust=0x7fff56067778, msc=0x7fff56067770, sbc=0x7fff56067768) at dri2_glx.c:346
        dri2_ust = 29839872
        dri2_msc = 29878048
        dri2_sbc = 139998480560225
        ret = -516
#6 0x00007f53efb3f81f in __glXWaitVideoSyncSGI (divisor=2, remainder=1, count=0x7fff560677cc) at glxcmds.c:1775
        gc = 0x1c7e720
        psc = 0x1c77560
        pdraw = <value optimised out>
        ust = 2722
        msc = <value optimised out>
        sbc = 139998480562447
        ret = <value optimised out>
#7 0x00007f53efdb71c4 in PrivateGLScreen::waitForVideoSync() () from /usr/lib/compiz/libopengl.so
No symbol table info available.
#8 0x00007f53efdb737b in PrivateGLScreen::paintOutputs(std::list<CompOutput*, std::allocator<CompOutput*> >&, unsigned int, CompRegion const&) ()
   from /usr/lib/compiz/libopengl.so
No symbol table info available.
#9 0x00007f53effddb19 in CompositeScreen::paint(std::list<CompOutput*, std::allocator<CompOutput*> >&, unsigned int) () from /usr/lib/compiz/libcomposite.so
No symbol table info available.
#10 0x00007f53effdf930 in CompositeScreen::handlePaintTimeout() () from /usr/lib/compiz/libcomposite.so
No symbol table info available.
#11 0x00000000004219ff in CompTimeoutSource::callback() ()
No symbol table info available.
#12 0x000000000042146d in CompTimeoutSource::dispatch(sigc::slot_base*) ()
No symbol table info available.
#13 0x00007f53f5c649df in Glib::Source::dispatch_vfunc(_GSource*, int (*)(void*), void*) () from /usr/lib/libglibmm-2.4.so.1
No symbol table info available.
#14 0x00007f53f5114bcd in g_main_dispatch (context=0x1ba7730) at /build/buildd/glib2.0-2.28.3/./glib/gmain.c:2440
        dispatch = 0x7f53f5c64990 <Glib::Source::dispatch_vfunc(_GSource*, int (*)(void*), void*)>
        was_in_call = 0
        user_data = 0x1c507a0
        callback = 0x7f53f5c64b40
        cb_funcs = 0x7f53f53bf630
        cb_data = 0x1c45fe0
        current_source_link = {data = 0x1c50730, next = 0x0}
        need_destroy = <value optimised out>
        source = 0x1c50730
        current = 0x1ba7e90
        i = <value optimised out>
#15 g_main_context_dispatch (context=0x1ba7730) at /build/buildd/glib2.0-2.28.3/./glib/gmain.c:3013
No locals.
#16 0x00007f53f51153a8 in g_main_context_iterate (context=0x1ba7730, block=<value optimised out>, dispatch=1, self=<value optimised out>)
    at /build/buildd/glib2.0-2.28.3/./glib/gmain.c:3091
        max_priority = 2147483647
        timeout = 11
        some_ready = 1
        nfds = 14
        allocated_nfds = <value optimised out>
        fds = <value optimised out>
#17 0x00007f53f51159f2 in g_main_loop_run (loop=0x1c50880) at /build/buildd/glib2.0-2.28.3/./glib/gmain.c:3299
        __PRETTY_FUNCTION__ = "g_main_loop_run"
#18 0x0000000000429fba in CompScreen::eventLoop() ()
No symbol table info available.
#19 0x0000000000422f70 in main ()

This is happening fairly often for me (3-4 times per day)

Didier Roche (didrocks) wrote :

I don't get that so often, but I get some as well. Setting on the priority list.

Changed in compiz (Ubuntu):
status: New → Triaged
importance: Undecided → High
tags: added: unity-priority
Bernhard Schmidt (berni) wrote :

I have a similar bug. In my case the screen stays black except for the mouse cursor. It is not frozen though, I can move the mouse cursor just fine and it changes shape when I get in the middle of the screen (over the invisible password prompt). When I blindly enter my password the screen gets unlocked, as I can see by moving the cursor around (it changes shape according to the applications on my invisible workspace). Only way to regain access so far was to kill the session.

Can any of you still move your mouse when it happens?

Didier Roche (didrocks) on 2011-03-24
tags: added: unity
Bernhard Schmidt (berni) wrote :

Not sure this is unity-related, I see this with the classic 3D desktop as well.

Chris Coulson (chrisccoulson) wrote :

From IRC a few days ago:

[11:33] <chrisccoulson> RAOF, if you're interested, here's the compiz hang i was talking about yesterday - bug 740126 ;)
[11:33] <ubot2> Launchpad bug 740126 in compiz "compiz hangs randomly several times per day" [Undecided,New] https://launchpad.net/bugs/740126
[11:33] <chrisccoulson> not sure if that's an X issue or not
[11:33] <chrisccoulson> i got a trace from compiz
[11:35] <RAOF> chrisccoulson: Oh, boo. That's probably pageflipping problem :(

Changed in compiz (Ubuntu):
status: Triaged → Incomplete
Chris Halse Rogers (raof) wrote :

Next time someone encounters this bug, could you please attach the output of “sudo intel_gpu_dump”?

Chris Coulson (chrisccoulson) wrote :

Here we go :)

David Barth (dbarth) on 2011-03-28
tags: removed: unity-priority
Chris Halse Rogers (raof) wrote :

That looks to me awfully like an entirely idle gpu.

My hypothesis is that the driver has missed the vblank interrupt, so the MSC compiz is waiting on will never be raised.

Could you please add ‘drm.debug=0x1’ to your kernel command line and attach dmesg next time this happens? That should spit out appropriate messages to debug this.
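
For anyone following along, here is a minimal sketch of the check behind this request: confirm that the running kernel was actually booted with the parameter before saving dmesg. The helper name `has_drm_debug` and the output filename are made up for illustration. On Ubuntu the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by `sudo update-grub` and a reboot.

```shell
# Hypothetical helper: succeeds if the given kernel command line
# contains a drm.debug= parameter.
has_drm_debug() {
    case " $1 " in
        *" drm.debug="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the affected machine, after the hang has been reproduced:
#   has_drm_debug "$(cat /proc/cmdline)" && dmesg > drm-debug-dmesg.txt
```

The log can grow large with DRM debugging enabled, so it is probably worth capturing it soon after the hang rather than after days of uptime.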

Chris Coulson (chrisccoulson) wrote :

I just recreated this with drm.debug=0x1. The log is huge though, so it might be pretty difficult for you to parse.

Note, this log contains a few days of information in it, probably the only relevant bit is from today (Apr 19th). I managed to do it within 5 minutes or so of booting up (it's quite easily reproducible here by closing and opening my laptop lid a few times)

Chris Halse Rogers (raof) wrote :

Hm. Jason Smith has brought up a bug with similar symptoms to this - hanging in poll from _XReply - which suggests that this might not actually be a driver problem at all, but a more general X/libX11 bug.

Chris Coulson (chrisccoulson) wrote :

I guess the next thing for me to try when it happens is to attach gdb to X to see what it is doing, perhaps?

 status confirmed
 importance high
 assign me
 milestone ubuntu-11.04

I can reproduce this easily, both on Intel and Radeon hardware.

I'm not sure yet where in the stack this problem is - it could be X,
libX11, or compiz - but to the user it appears as a system-freeze and
it's reasonably easy to trigger. Milestoning to 11.04; we shouldn't
release with this bug.

Changed in compiz (Ubuntu):
milestone: none → ubuntu-11.04
status: Incomplete → Confirmed

I am seeing this on two laptops. It happens only(?) when Flash video is playing and the display switches off or goes to sleep, and not every time. One thing to note is that switching to vt1 and running
#export DISPLAY=":0.0"; killall -9 compiz; compiz --replace &
fixes the problem - compiz comes up and the display is fully functional. So my guess is that it's compiz.

Didier Roche (didrocks) wrote :

We had killed one of them in latest unity release (3.8.6), is it still happening with this?

The only time I get some X freeze now is when my hard drive is writing like crazy (seems chromium and firefox doing that), and no refresh of compiz meanwhile…

Chris Halse Rogers (raof) wrote :

Yeah. Jason has worked around one particularly easy to trigger hang, but it's a more general problem and there are still edges to hit.

Changed in compiz (Ubuntu):
assignee: nobody → Chris Halse Rogers (raof)
Niloc Deeps (ajarncolin) wrote :

Ubuntu 11.04:
Compiz generally only crashes when I run a PyQt GUI programme from Python. I'm just left with the desktop background (with no ability to report the problem) and have to manually switch off.
Hope this helps to pin-point the problem.

Chris Coulson (chrisccoulson) wrote :

Note, I can reproduce this quite easily by closing and opening the lid on my laptop a few times

Jason Smith (jassmith) wrote :

Can you disable vsync and see if it still happens?

jamson (kains) wrote :

Disabling vsync does not help; I tried that before. One other thing to note: there are also some artifacts. For example, adding a bookmark in Chromium by clicking the star icon and then "Done" sometimes leaves a gray menu shape that is not removed until the content is scrolled.

Jason Smith (jassmith) wrote :

Can you disable vsync and then get another trace? It will be different.

David Barth (dbarth) wrote :

@jamson: those artifacts should be fixed now with the compiz version you can find in our daily builds at https://launchpad.net/~unity/+archive/daily

Chris Halse Rogers (raof) wrote :

I think I've discovered an underlying cause of these hangs - compiz taking a server grab and not flushing its command buffers - and have pointed DX at the fix.

To confirm this hypothesis, next time you see this freeze could you attach gdb to the *X* process (with “sudo gdb Xorg $(pgrep X)”) and give me the result of “print grabState”?
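
A non-interactive variant of that request could look like the sketch below. It uses gdb's `-p`, `-batch` and `-ex` options to attach, evaluate one expression and detach again. The function name `gdb_print` and the `GDB` override variable are hypothetical, and `grabState` will only resolve if debug symbols for the X server are installed.

```shell
# Hypothetical helper: attach gdb to a PID, print one expression, detach.
# GDB can be overridden (e.g. for a dry run); it defaults to "gdb".
gdb_print() {
    "${GDB:-gdb}" -p "$1" -batch -ex "print $2"
}

# Usage, from a root shell on a text console (attaching to X stops the
# server while gdb is attached):
#   gdb_print "$(pgrep -o X)" grabState
```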

Changed in compiz (Ubuntu Natty):
status: Confirmed → Incomplete
Didier Roche (didrocks) on 2011-04-15
Changed in compiz (Ubuntu Natty):
status: Incomplete → Fix Committed
assignee: Chris Halse Rogers (raof) → Sam "SmSpillaz" Spilsbury (smspillaz)
Changed in unity:
status: New → Fix Committed
importance: Undecided → High
assignee: nobody → Sam "SmSpillaz" Spilsbury (smspillaz)
milestone: none → 3.8.10
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package compiz - 1:0.9.4+bzr20110415-0ubuntu1

---------------
compiz (1:0.9.4+bzr20110415-0ubuntu1) natty; urgency=low

  * New upstream snapshot
    - Focus problem with Thunderbird (LP: #753951)
    - Chromium fullscreen + Alt-TAB confuses the launcher (LP: #757434)
    - compiz hangs randomly several times per day (LP: #740126)
  * debian/patches/00_*:
    - removed as part of upstream tarball
 -- Didier Roche <email address hidden> Fri, 15 Apr 2011 17:08:40 +0200

Changed in compiz (Ubuntu Natty):
status: Fix Committed → Fix Released
Didier Roche (didrocks) on 2011-04-15
Changed in unity (Ubuntu Natty):
status: New → Fix Committed
Chris Coulson (chrisccoulson) wrote :

Note, this has happened 3 times this afternoon. grabState is 0 when it happens. Not sure if that's what you expect ;)

Chris Halse Rogers (raof) wrote :

Darn. I think that demonstrates that this is a separate bug. There certainly *were* corner cases with grabs being hit, but yours doesn't appear to be one.

Oh, well. It was a nice weekend with the delusion of this being fixed!

Jason Smith (jassmith) wrote :

Chris, can you confirm this also happens with vsync disabled? That is the only case I can think of where the grab shouldn't be relevant anyhow.

Chris Halse Rogers (raof) wrote :

Chris: I notice that from your kernel log it looks like the driver is scanning for vblank changes on two CRTCs - do you have a dual-monitor setup, and if so, what are its details?

Didier Roche (didrocks) on 2011-04-18
Changed in unity:
status: Fix Committed → Triaged
Changed in compiz (Ubuntu Natty):
status: Fix Released → Triaged
Changed in unity (Ubuntu Natty):
status: Fix Committed → Triaged
importance: Undecided → High
Giovanni Mellini (merlos) wrote :

This happens to me too, often when I lock the screen.
How can I help debug this?

Giovanni Mellini (merlos) wrote :

Just got a new crash now, while using Chrome (dev channel). I forgot to say that after the crash I get the gdm login page.

Last lines I see on syslog just after the crash are

Apr 18 12:25:47 shrimp gdm-simple-slave[8578]: WARNING: Unable to load file '/etc/gdm/custom.conf': File o directory non esistente
Apr 18 12:25:47 shrimp acpid: client 7101[0:0] has disconnected
Apr 18 12:25:47 shrimp acpid: client connected from 8581[0:0]
Apr 18 12:25:47 shrimp acpid: 1 client rule loaded
Apr 18 12:25:49 shrimp gdm-session-worker[8625]: WARNING: Unable to load file '/etc/gdm/custom.conf': File o directory non esistente
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Successfully made thread 8630 of process 8630 (n/a) owned by '106' high priority at nice level -11.
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Supervising 1 threads of 1 processes of 1 users.
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Successfully made thread 8632 of process 8630 (n/a) owned by '106' RT at priority 5.
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Supervising 2 threads of 1 processes of 1 users.
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Successfully made thread 8633 of process 8630 (n/a) owned by '106' RT at priority 5.
Apr 18 12:25:49 shrimp rtkit-daemon[1430]: Supervising 3 threads of 1 processes of 1 users.
Apr 18 12:25:50 shrimp gdm-simple-greeter[8623]: Gtk-WARNING: /build/buildd/gtk+2.0-2.24.4/gtk/gtkwidget.c:5687: widget not within a GtkWindow
Apr 18 12:25:50 shrimp gdm-simple-greeter[8623]: WARNING: Unable to load CK history: no seat-id found

David Barth (dbarth) wrote :

To clarify: I think the bug is now at the Xlib/driver level, not so much in Unity or Compiz anymore. Sam stays around if there is something to do at the compiz or unity level to find a workaround.

Changed in compiz (Ubuntu Natty):
assignee: Sam "SmSpillaz" Spilsbury (smspillaz) → Chris Halse Rogers (raof)
Changed in unity:
assignee: Sam "SmSpillaz" Spilsbury (smspillaz) → Chris Halse Rogers (raof)
Didier Roche (didrocks) on 2011-04-19
Changed in unity:
milestone: 3.8.10 → 3.8.12
Chris Halse Rogers (raof) wrote :

Ok. I

Chris Halse Rogers (raof) wrote :

Ahem. I've traced this locally and it doesn't *seem* to be a compiz problem. Compiz is waiting for the next frame, and X is waiting for the kernel to tell it the next frame has happened.

Adding a kernel task, as it looks like that's where the problem lies.

Chris Halse Rogers (raof) wrote :

Ok. So, as a workaround I think we can start compiz with the vblank_mode=0 environment variable set and vsync disabled; this will stop compiz being blocked waiting on vsync, but in return allow some tearing to occur.
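
As a rough sketch of the environment half of that workaround: `vblank_mode` is read by the Mesa DRI drivers, so it only needs to be present in the environment of the compiz process. The wrapper name `run_without_vblank` is made up for illustration. The compiz-side vsync setting (the "Sync To VBlank" option of the OpenGL plugin, as far as I know) would still need to be switched off separately, for example in CCSM.

```shell
# Hypothetical wrapper: run any command with vblank_mode=0 in its
# environment, without changing the environment of the calling shell.
run_without_vblank() {
    vblank_mode=0 "$@"
}

# Usage, from a terminal inside the session or from a VT with DISPLAY set:
#   run_without_vblank compiz --replace &
```

Because the variable is set only for the launched process, a compiz started later through the normal session will sync to vblank again unless the variable is also set there.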

With vblank_mode=0 I can toggle dpms on/off with a GL screensaver running - which seems to be a trigger for this - until my GPU locks up. Which is another bug, not this one :/.

funicorn (funicorn) wrote :

No, the bug still appears in compiz (1:0.9.4+bzr20110415-0ubuntu2) natty. I am running Unity on Ubuntu natty
with compiz (1:0.9.4+bzr20110415-0ubuntu2), and my system hangs many times every day.

I am using the experimental nouveau driver. Here is my backtrace:

[ 22727.289] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 22727.289]
Backtrace:
[ 22727.299] 0: /usr/bin/X (xorg_backtrace+0x26) [0x4a2626]
[ 22727.299] 1: /usr/bin/X (mieqEnqueue+0x1f4) [0x4a18e4]
[ 22727.299] 2: /usr/bin/X (xf86PostMotionEventM+0x97) [0x47d767]
[ 22727.299] 3: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f3a69e9e000+0x5ff3) [0x7f3a69ea3ff3]
[ 22727.299] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0x7f3a69e9e000+0x668d) [0x7f3a69ea468d]
[ 22727.299] 5: /usr/bin/X (0x400000+0x6c657) [0x46c657]
[ 22727.299] 6: /usr/bin/X (0x400000+0x124bbe) [0x524bbe]
[ 22727.300] 7: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f3a6f3fc000+0xfc60) [0x7f3a6f40bc60]
[ 22727.300] 8: /lib/x86_64-linux-gnu/libc.so.6 (ioctl+0x7) [0x7f3a6e404297]
[ 22727.300] 9: /lib/x86_64-linux-gnu/libdrm.so.2 (drmIoctl+0x28) [0x7f3a6c99e058]
[ 22727.300] 10: /lib/x86_64-linux-gnu/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f3a6c9a030b]
[ 22727.300] 11: /lib/x86_64-linux-gnu/libdrm_nouveau.so.1 (0x7f3a6c35b000+0x2b87) [0x7f3a6c35db87]
[ 22727.300] 12: /lib/x86_64-linux-gnu/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xfe) [0x7f3a6c35e19e]
[ 22727.300] 13: /lib/x86_64-linux-gnu/libdrm_nouveau.so.1 (0x7f3a6c35b000+0x1c2d) [0x7f3a6c35cc2d]
[ 22727.300] 14: /lib/x86_64-linux-gnu/libdrm_nouveau.so.1 (nouveau_pushbuf_flush+0x1ae) [0x7f3a6c35d22e]
[ 22727.300] 15: /usr/lib/xorg/modules/libexa.so (0x7f3a6b90d000+0x9015) [0x7f3a6b916015]
[ 22727.300] 16: /usr/lib/xorg/modules/libexa.so (0x7f3a6b90d000+0xbba1) [0x7f3a6b918ba1]
[ 22727.300] 17: /usr/bin/X (0x400000+0xdcab9) [0x4dcab9]
[ 22727.300] 18: /usr/lib/xorg/modules/libexa.so (0x7f3a6b90d000+0xd1c1) [0x7f3a6b91a1c1]
[ 22727.300] 19: /usr/bin/X (0x400000+0xdc1b5) [0x4dc1b5]
[ 22727.300] 20: /usr/bin/X (0x400000+0xd5638) [0x4d5638]
[ 22727.300] 21: /usr/bin/X (0x400000+0x2e2a9) [0x42e2a9]
[ 22727.300] 22: /usr/bin/X (0x400000+0x21a7e) [0x421a7e]
[ 22727.300] 23: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xff) [0x7f3a6e345eff]
[ 22727.300] 24: /usr/bin/X (0x400000+0x21629) [0x421629]

Giovanni Mellini (merlos) wrote :

The bug is still happening to me with a fully updated Natty.
How can I set compiz to run with vblank_mode=0 and vsync disabled to prevent this, as suggested in comment #32?

Kept on the Unity radar because we are affected, but not a Unity or Compiz bug per-se, as confirmed by the previous comments.

summary: - compiz hangs randomly several times per day
+ Something blocks compiz randomly several times per day
Changed in unity:
milestone: 3.8.12 → 3.8.14
Chris Halse Rogers (raof) wrote :

I've sent a patch series fixing this upstream - the core is http://lists.freedesktop.org/archives/dri-devel/2011-April/010608.html .

Once that's had some review upstream it should also go to stable, and it should be SRUable.

Changed in compiz (Ubuntu Natty):
milestone: ubuntu-11.04 → natty-updates
Changed in compiz (Ubuntu):
milestone: ubuntu-11.04 → none
Changed in linux (Ubuntu Natty):
milestone: none → natty-updates
Changed in unity (Ubuntu Natty):
milestone: none → natty-updates
Kate Stewart (kate.stewart) wrote :

Cleaning up disposition of bug for the linux kernel task. Since there's a kernel patch there, am going ahead and marking it as triaged. Also the priority should be set from the prior task, which was High.
For Natty - this should go in natty-updates. It should also go into development for Oneiric.

tags: added: kernel-key
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Natty):
importance: Undecided → High
Changed in linux (Ubuntu):
status: New → Triaged
Changed in linux (Ubuntu Natty):
status: New → Triaged
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Natty):
assignee: nobody → Canonical Kernel SRU Team (canonical-kernel-sru-team)
Marco Cimmino (cimmo) wrote :

Are #718858 and #755099 the same issue as this one?

Is there any way I can apply the upstream patch now?

Chris Coulson (chrisccoulson) wrote :

Undoing the duplicate. If you are going to mark bugs as duplicates of each other, please ensure that the master bug is the one with the most useful information in it (in this case, that is this bug, as it already has quite extensive debugging information in it).

Robert Hooker (sarvatt) on 2011-05-04
tags: added: blocks-hwcert hwe-blocker
Changed in linux (Ubuntu):
status: Triaged → Fix Released
yiourkas (yiourkas) wrote :

# uname -a
Linux Troy 2.6.38-9-generic #43 SMP Wed May 4 16:42:41 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

This is the kernel in post #46

Dang. The 2.6.38-9 definitely doesn't fix it for me. It's reproducible in both Unity and the Ubuntu Classic Desktop. I'm going to try the 2.6.39-1.6 now.

summary: - Something blocks compiz randomly several times per day
+ Disabling an output can cause vblank events to be missed
description: updated
Jimmy Merrild Krag (beruic) wrote :

I've noticed that for me it only occurs after I've been away from the machine for a while. Often with the lid closed.

I as well, the test kernel extends the amount of time the machine can stay alive without freezing, but it still freezes after a few hours of having the lid closed.

leomeloxp (leomeloxp) wrote :

Hi everyone I think I got this bug running Natty on a Dell Inspiron 1545 Laptop. For me it always happened when the laptop was left idle, sometimes after the display turned off others even before that (possibly with the lid closed). I'll try the Kernel suggested on #46 and #53. I'm subscribing and will report back if I get it again. If anyone can tell me how to help debugging it I'll be happy to be of any help.

Do your best everyone o/

I have been trying to reproduce the bug to capture dmesg,
i915_error_state and even backtraces from my gdm session, but I am not
sure what triggers it yet.

So far, it seems that it freezes when the screensaver was running for
some time BECAUSE IT HAD TURNED MY SCREEN OFF due to the energy policy...

I had since disengaged the energy control option to never turn my screen
off again, so far it has not frozen my system again. I suspect that this
could be the root cause, but I am not sure how to prove it, or what
packages could be involved to further the investigation.

It also seems to happen only on my systems with intel video boards
(Mobile GM965/GL960 Lenovo Laptop and Mobile 945GME Express Asus
netbook).

--
Fábio Leitão
..-. .- -... .. --- .-.. . .. - .- --- ...-.-

I can confirm that only Intel boards are affected, all my ATI-equipped machines are fine.

Alf Sagen (alfsagen) wrote :

Hello,
I'm running on a ThinkPad T500 with dual video. Currently I have been running with the built-in Intel graphics adaptor (due to battery life). I may switch to the external ATI Radeon 3850 (I think that's what my laptop has...) and test if that resolves the problem.

One theory I have is that the problem is related to the Compiz process providing 3D graphics for Unity, and that this is related to the HW we are running; i.e. if we run the Intel on-board graphics adaptor, it might not have 3D support in HW, hence demanding other things from the SW than if we run on 3D-enabled graphics adaptors. Does this sound like a possibility, anyone?

My solution to the problem (when coming back to my laptop with "frozen" screensaver) has been to press CTRL+ALT+F1 and then log in to the console in tty1. From here, I run ps -ef --sort time to see which processes have been running for a long time.
 1) kill <pid of gnome screen-saver> does not help --> doesn't even remove the screensaver photo from the display
     even though the process dies.
 2) kill <pid of compiz> does not kill the process --> no change
 3) kill -9 <pid of compiz> does actually kill compiz, and at this point pressing CTRL+ALT+F7 (or F8) takes me back to
     the Gnome session, which is now flashing a bit, then refreshing and coming up again, seemingly normal.

Most of the times, I'm able to get back into my session with all applications still running after step 3) above.
However, it has also happened that I got a new, blank session with no apps running.
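
The escalation in steps 2) and 3) above can be sketched as a small helper that tries a plain SIGTERM first and only falls back to SIGKILL if the process is still alive after a grace period. The function name `terminate_then_kill` and the default three-second grace period are assumptions for illustration, not anything compiz provides.

```shell
# Hypothetical helper: send SIGTERM, wait up to <grace> seconds, then SIGKILL.
# Note: plain sh has no local variables, so pid and grace are globals here.
terminate_then_kill() {
    pid=$1
    grace=${2:-3}
    kill -TERM "$pid" 2>/dev/null || return 0      # process already gone
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0     # exited after SIGTERM
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null || true          # still alive: force it
}

# Usage, from tty1 while the session is frozen:
#   terminate_then_kill "$(pgrep -o compiz)"
#   DISPLAY=:0 compiz --replace &
```

With the hang described in this bug, compiz is blocked waiting on the X connection, so the SIGKILL branch is the one that is expected to take effect.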

My 5 cents today is that this is a problem related to compiz and 3D rendering, and that it might be related to the graphics adaptor used, including whether the computer is equipped with 3D support in HW.

Brgds,
/alfs

Seth Forshee (sforshee) wrote :

Did anyone ever send the fix from comment #45 up to stable? It doesn't appear to be in any of the .38 stable releases thus far, nor is it in the stable queue.

tekstr1der (tekstr1der) wrote :

Just wanted to report back after two weeks running 2.6.39-rc6, rc7, and the .39 release with the patch from comment #45, I've never had this UI freeze again. It was happening several times per day previously.

Hopefully the patch is cherrypicked and lands in an SRU for the 2.6.38.x branch.

Peter Fein (pfein) wrote :

The kernel in #46 appears to solve the blank-screen-with-cursor problem for me under lid-suspend and `xset dpms force off` (ThinkPad X200 tablet - Intel GM45).

Will try with external monitor when I get home. Backport please!
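
For testing a kernel against this bug, the DPMS trigger mentioned here can be repeated in a loop. This is only a sketch: the function name `toggle_dpms`, the `XSET` override variable and the default counts are made up, and how many cycles it takes to hit the hang (if at all) will vary by machine.

```shell
# Hypothetical helper: force the display off and on <count> times,
# pausing <pause> seconds between each step.
# XSET can be overridden (e.g. XSET=echo for a dry run); defaults to xset.
toggle_dpms() {
    count=${1:-5}
    pause=${2:-2}
    i=0
    while [ "$i" -lt "$count" ]; do
        "${XSET:-xset}" dpms force off
        sleep "$pause"
        "${XSET:-xset}" dpms force on
        sleep "$pause"
        i=$((i + 1))
    done
}

# Usage, inside the X session (ideally with something GL-based on screen):
#   toggle_dpms 10 3
```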

Peter Fein (pfein) wrote :

Kernel in #46 solves same problem when plugging an external monitor.

Seth Forshee (sforshee) wrote :

Chris, it looks like you did submit it for stable but it wasn't picked up based on technicalities.

http://thread.gmane.org/gmane.linux.kernel.stable/10721

Do you plan to resubmit?

Has anyone else NOT found the test kernel to fix this bug, or am I the only one?

Chris Halse Rogers (raof) wrote :

I've performed the requisite supplications; this patch is now pending
review in the stable-queue.

It should be possible to pick from pre-stable for the next proposed
kernel.

David Barth (dbarth) wrote :

Wow, great news. I'm moving the unity part of the bug to SRU2 to keep it on the radar. Thanks for looking into this.

Changed in unity:
milestone: 3.8.14 → 3.8.16
Brandon Applegate (vom) wrote :

Sorry if this is answered (indirectly even) elsewhere - but is this fix (patch from #83) expected to be backported to a natty kernel? I understand that the patch is for 2.6.39, so it would have to be backported to 2.6.38 for Natty, yes? Would this come in natty-updates or natty-backports? I guess I'm mixing the usage of the term 'backport' here (i.e. literal repo term vs. generalized term) - sorry.

I really want to go to 11.04 but this bug is a showstopper. Especially running any virtualization - X crash of this nature means you are losing N running OSes of work :( I feel less crazy reading #11 re: not shipping with this bug (I agree). I understand this was listed in the Known Issues section of release notes, but still this is pretty serious. Disabling screen locking is not doable for anyone in a security conscious environment (i.e. workplace).

Seth Forshee (sforshee) wrote :

Brandon, the patch will come into natty. It was included in the 2.6.38.8 stable release, and natty will pull it in from there. It's too late to get it into the next proposed kernel, but I'll post here once the fix does get to proposed (should be 3 weeks from now).

David Barth (dbarth) wrote :

SRU3, as it takes a kernel update.

Changed in unity:
milestone: 3.8.16 → 3.8.18
Mauro (ephestione) wrote :

Just to fill in my little bit of experience (as I've been reading about the same glitch I notice): other than the pretty or ugly workarounds proposed in this thread, Ctrl-Alt-Bksp works as well.

I'm running on a toshiba satellite P200, with GM965/GL960 integrated chip, and I've noticed a couple more behaviours not listed here:

1) When restoring from the screensaver's screen power-off, I see the normal desktop as if the screensaver had never kicked in (the windows are all there) and I can move the pointer around, but I cannot interact with anything, as if the screensaver were still active. In other words, it's a variant of the "blank frozen screensaver" where I see the normal desktop instead of a blank screen. More often than not, after a while the screen goes back to blank and I get the regular password dialogue, after which I can use the laptop normally. This time, though, it just stayed like that: windows visible in the background, pointer movable with no effect whatsoever... Ctrl-Alt-Bksp for me.

2) If I press Ctrl-Alt-F1 and then Ctrl-Alt-F7, I often no longer get the blank screen but the last text screen shown before gdm started, as if verbose boot mode were on. I cannot interact with that screen: characters I type are displayed, but commands and key combinations have no effect (including Ctrl-Alt-Bksp). Sometimes switching back to tty1 and then to Ctrl-Alt-F7 a few times shows the blank screen again, and I can Ctrl-Alt-Bksp out of it.

HX_unbanned (linards-liepins) wrote :

Unity 3.8.16 only partially fixes the problem.

Before 3.8.16, once the problem occurred the only cure was rebooting the PC. Now the cure is just pressing "Show Desktop".

Using Ubuntu Classic ( with effects ).

Please give a note of what logs you need, Sam ;)

Ara Pulido (ara) on 2011-06-21
Changed in unity (Ubuntu):
assignee: nobody → Canonical Desktop Experience Team (canonical-dx-team)
Changed in unity (Ubuntu Natty):
assignee: nobody → Canonical Desktop Experience Team (canonical-dx-team)
Marco Cimmino (cimmo) wrote :

Comment #86 said 3 weeks; 4 weeks have passed. Where is this update?

Seth Forshee (sforshee) wrote :

@Marco, unfortunately we've had some delays in releasing a kernel update to natty due to regressions that we picked up from the upstream stable kernel. As a result we had to temporarily stop taking stable updates. All known regressions are fixed in the current proposed kernel, and assuming no more regressions are found it should move to updates soon, at which point we can start taking upstream stable updates again. Sorry for the delay.

Marco Cimmino (cimmo) wrote :

Honestly you could have taken only the patch we are talking about here.
This kernel is so unstable for me!

Marco Cimmino (cimmo) wrote :

5 weeks have passed since the promised patch (10 since the release of Natty), and kernel 2.6.38-10 was just released with even an ABI bump, a good excuse to include the fix, but no :(

I am a bit disappointed.

Tim Gardner (timg-tpi) wrote :

Marco - subscribe to https://launchpad.net/~kernel-ppa/+archive/pre-proposed?field.series_filter=natty for the latest crack (which has your fix)

Seth Forshee (sforshee) wrote :

The fix is now in proposed. Please test the proposed kernel and verify that this issue is no longer present. Thanks!

Changed in linux (Ubuntu Natty):
status: Triaged → Fix Committed
David Barth (dbarth) wrote :

As far as Unity is concerned, I think the issue has been fixed for a while by making sure we flush the X11 connection at some critical points in the flow. I'm leaving the bug open but putting the DX team off the hook from now on.

Changed in unity (Ubuntu):
assignee: Canonical Desktop Experience Team (canonical-dx-team) → nobody
status: Triaged → Fix Committed
Changed in unity (Ubuntu Natty):
status: Triaged → Fix Committed
assignee: Canonical Desktop Experience Team (canonical-dx-team) → nobody
Changed in unity:
status: Triaged → Fix Released
Changed in compiz (Ubuntu):
status: Triaged → Invalid
Changed in compiz (Ubuntu Natty):
status: Triaged → Invalid
Marco Cimmino (cimmo) wrote :

> The fix is now in proposed.

Where? I can only get 2.6.38.11.26, and I don't see this fix in its changelog.

Seth Forshee (sforshee) wrote :

Marco: That's the version of the meta package. The meta package depends on the correct version of the kernel/headers packages, so updating the meta packages will install the packages for the proposed kernel. The names of the real packages start with linux-image or linux-headers, not linux-meta.

See the instructions at https://wiki.ubuntu.com/Testing/EnableProposed if you need more information about getting proposed packages.

Seth Forshee (sforshee) wrote :

Ignore what I said about the package names in the last comment; it is incorrect. There are linux-image and linux-headers meta packages that will have the same version number you quoted above, but those packages don't install the kernel, they just depend on the most up-to-date kernel package.

Marco Cimmino (cimmo) wrote :

Seth, yes, the correct versions of the kernel in proposed are:
linux-image v2.6.38-11.47
linux-headers v2.6.38-11.47

but that version is from Fri, 15 Jul 2011, which is 4 days earlier than your comment.
Now I can see that it is based on upstream kernel v2.6.38.8, so that might be the right one. So you are saying it just took 4 days to drop a comment here? Is that the right kernel with the fix? Can we have more clarity in the future?

It looks like it is fixed.
Using linux-image-2.6.38-11-generic-pae version 2.6.38-11.47 from natty-proposed.

Seth Forshee (sforshee) wrote :

Marco: The package was copied to the proposed pocket on 19 July, the same day that I posted the comment.

Danilo: Thank you for verifying the fix.

Marco Cimmino (cimmo) wrote :

I have been running the proposed kernel for 5 days and so far have not had a single freeze; it seems promising.

Please reopen this bug: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/775060
It isn't a duplicate, as for other people the proposed kernel fixed the problem and for me it didn't. Really annoying.

Timo Aaltonen (tjaalton) wrote :

Nick: done

leomeloxp (leomeloxp) wrote :

I am using Linux Kernel 3 on Natty and now I get the problem again.

I was running the proposed kernel with the fix found here (see #45) and I never had a freeze again, but after upgrading to Linux 3 my system got really bad. It's heavier to operate, takes ages to boot, and I get the freeze whenever I close my laptop lid or let it rest for more than 20 minutes.

If there's any info about what I should do, please let me know (even downgrading would be a possibility; I just can't activate the boot menu to select a kernel, and I'm on a Natty-only computer).

Thanks everyone.

Marco Cimmino (cimmo) wrote :

The update works perfectly for me and I believe this bug should be closed.
Any unrelated problems with other kernel versions should be filed under a different ticket, I guess.

Thanks.

Timo Aaltonen (tjaalton) wrote :

Yep, closing the kernel task as fixed.

Changed in linux (Ubuntu Natty):
status: Fix Committed → Fix Released
Ayan George (ayan) on 2011-08-31
tags: added: blocks-hwcert-enablement
Chris Van Hoof (vanhoof) on 2011-11-15
Changed in unity (Ubuntu):
status: Fix Committed → Fix Released
Changed in unity (Ubuntu Natty):
status: Fix Committed → Fix Released
Displaying first 40 and last 40 of 108 comments.