Xorg crashes when exiting steam segfault in sna_pixmap_set_dri()

Bug #1127497 reported by Zygmunt Krynicki
This bug affects 1 person
Affects: xserver-xorg-video-intel (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Chris Wilson

Bug Description

On current up-to-date raring with an equally up-to-date Steam (the version should be visible in the apport-provided data), exiting Steam with the ALT+F4 shortcut crashes X like this:

[196022.622] (EE) Backtrace:
[196022.622] (EE) 0: /usr/bin/X (xorg_backtrace+0x36) [0x7f260cb2def6]
[196022.622] (EE) 1: /usr/bin/X (0x7f260c97e000+0x1b3d39) [0x7f260cb31d39]
[196022.622] (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f260ba85000+0xfbd0) [0x7f260ba94bd0]
[196022.622] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f260922c000+0xd40a1) [0x7f26093000a1]
[196022.622] (EE) 4: /usr/bin/X (0x7f260c97e000+0x180e5c) [0x7f260cafee5c]
[196022.622] (EE) 5: /usr/bin/X (0x7f260c97e000+0x1818cb) [0x7f260caff8cb]
[196022.622] (EE) 6: /usr/bin/X (DRI2GetBuffersWithFormat+0x10) [0x7f260caffc10]
[196022.622] (EE) 7: /usr/bin/X (0x7f260c97e000+0x183690) [0x7f260cb01690]
[196022.622] (EE) 8: /usr/bin/X (0x7f260c97e000+0x58a41) [0x7f260c9d6a41]
[196022.622] (EE) 9: /usr/bin/X (0x7f260c97e000+0x4754a) [0x7f260c9c554a]
[196022.622] (EE) 10: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf5) [0x7f260a6d2ea5]
[196022.622] (EE) 11: /usr/bin/X (0x7f260c97e000+0x47891) [0x7f260c9c5891]
[196022.622] (EE)
[196022.622] (EE) Segmentation fault at address 0x7e

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: xserver-xorg-video-intel 2:2.21.2-0ubuntu1
ProcVersionSignature: Ubuntu 3.8.0-6.11-generic 3.8.0-rc7
Uname: Linux 3.8.0-6-generic x86_64
NonfreeKernelModules: wl
.tmp.unity.support.test.0:

ApportVersion: 2.8-0ubuntu4
Architecture: amd64
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
CompositorUnredirectDriverBlacklist: '(nouveau|Intel).*Mesa 8.0'
CompositorUnredirectFSW: true
Date: Sat Feb 16 21:39:48 2013
DistUpgraded: Fresh install
DistroCodename: raring
DistroVariant: ubuntu
DkmsStatus:
 bcmwl, 6.20.155.1+bdcom, 3.8.0-6-generic, x86_64: installed
 virtualbox, 4.1.22, 3.8.0-6-generic, x86_64: installed
ExtraDebuggingInterest: Yes
GraphicsCard:
 Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Lenovo Device [17aa:3977]
InstallationDate: Installed on 2013-02-12 (4 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Alpha amd64 (20130210)
MachineType: LENOVO 2189
MarkForUpload: True
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=pl_PL.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.8.0-6-generic.efi.signed root=UUID=f4d68fc6-d76a-475e-8d08-80c5eb0cc48e ro quiet splash vt.handoff=7
SourcePackage: xserver-xorg-video-intel
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/14/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 5ECN92WW(V8.04)
dmi.board.asset.tag: No Asset Tag
dmi.board.name: INVALID
dmi.board.vendor: LENOVO
dmi.board.version: 31900003WIN8 STD MLT
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Lenovo G580
dmi.modalias: dmi:bvnLENOVO:bvr5ECN92WW(V8.04):bd09/14/2012:svnLENOVO:pn2189:pvrLenovoG580:rvnLENOVO:rnINVALID:rvr31900003WIN8STDMLT:cvnLENOVO:ct10:cvrLenovoG580:
dmi.product.name: 2189
dmi.product.version: Lenovo G580
dmi.sys.vendor: LENOVO
version.compiz: compiz 1:0.9.9~daily13.02.08-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.42-0ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 9.0.2-0ubuntu1
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.0.2-0ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.13.2-0ubuntu2
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.1.0-0ubuntu1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.21.2-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.6-0ubuntu2
xserver.bootTime: Sat Feb 16 21:39:07 2013
xserver.configfile: default
xserver.errors:

xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.13.2-0ubuntu2
xserver.video_driver: intel

Zygmunt Krynicki (zyga) wrote :
Chris Wilson (ickle) wrote :

Can you please "apt-get install xserver-xorg-video-intel-dbg && addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0xd40a1"
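
The 0xd40a1 passed to addr2line is the module-relative offset printed in frame 3 of the backtrace in the bug description (load base + offset = absolute address). A minimal sketch of pulling that offset out of an Xorg log line follows; the regular expression and the frame_offset helper are illustrative assumptions based only on the log format seen in this report, not part of any Xorg tool.

```python
import re

# Matches unsymbolised frames such as:
#   3: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f260922c000+0xd40a1) [0x7f26093000a1]
FRAME_RE = re.compile(
    r"(?P<index>\d+): (?P<module>\S+) "
    r"\((?P<base>0x[0-9a-f]+)\+(?P<offset>0x[0-9a-f]+)\) "
    r"\[(?P<addr>0x[0-9a-f]+)\]"
)


def frame_offset(line):
    """Return (module, offset) for an unsymbolised frame, or None."""
    m = FRAME_RE.search(line)
    if m is None:
        # Frames already symbolised, e.g. (xorg_backtrace+0x36), are skipped.
        return None
    base = int(m.group("base"), 16)
    offset = int(m.group("offset"), 16)
    addr = int(m.group("addr"), 16)
    # Sanity check: the absolute address must equal load base + offset.
    if base + offset != addr:
        raise ValueError("base + offset does not match the frame address")
    return m.group("module"), offset


line = ("[196022.622] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so "
        "(0x7f260922c000+0xd40a1) [0x7f26093000a1]")
module, offset = frame_offset(line)
print(f"addr2line -e {module} {offset:#x}")
```

The offset, rather than the absolute address, is what addr2line needs here because the driver is a shared object whose load base changes from run to run.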

Chris Wilson (ickle) wrote :

Great, thanks. Installed Steam to see if I can reproduce this bug. Before I knew it, it was already dawn. Yikes!

Chris Wilson (ickle) wrote :

I'm still waiting upon a symbolic stacktrace.

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
Bryce Harrington (bryce) wrote :

Please collect a full backtrace on this crash - see http://wiki.ubuntu.com/X/Backtracing for guidance.

In answer to #2, from my raring chroot:

/build/buildd/xserver-xorg-video-intel-2.21.2/build/src/sna/../../../src/sna/sna_dri.c:186

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
importance: Undecided → High
status: New → Incomplete
Chris Wilson (ickle) wrote :

Hmm, that would be sna_pixmap_set_dri() crashing on priv->gpu_bo being NULL. But I cannot see a way for sna_pixmap_move_to_gpu() to return without priv->gpu_bo initialised (or without set_dri() bailing out). So I would like to see that stacktrace confirmed.

Chris Wilson (ickle) wrote :

Also, it would be useful if you could retest with ppa:xorg-edgers if you can reproduce the bug. I've been working on adding more assertions to the DRI handling which flagged a few potential issues - though I can't tell if the bug you've encountered is one of those.

Zygmunt Krynicki (zyga) wrote :

Can you please "apt-get install xserver-xorg-video-intel-dbg && addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0xd40a1"

That gives me sna_dri.c:186

Zygmunt Krynicki (zyga) wrote :

I can no longer reproduce the crash. Steam was updated in the meantime, so it could have been fixed (or worked around?) by Valve.

Zygmunt Krynicki (zyga) wrote :

I just got it, it's still here:

Program received signal SIGSEGV, Segmentation fault.
0x00007fa7fcea80a1 in sna_pixmap_set_dri (pixmap=0x7fa802bacbc0, sna=0x7fa800454010) at ../../../src/sna/sna_dri.c:186
186 ../../../src/sna/sna_dri.c: No such file or directory.

Sadly I could not capture the full backtrace; perhaps lightdm/plymouth/upstart somehow killed or respawned something.

I'll try again (following X debugging guidelines)

Zygmunt Krynicki (zyga) wrote :

Program received signal SIGSEGV, Segmentation fault.
0x00007f868f35c0a1 in sna_pixmap_set_dri (pixmap=0x7f8694200b20, sna=0x7f8692908010) at ../../../src/sna/sna_dri.c:186
186 ../../../src/sna/sna_dri.c: No such file or directory.
(gdb) backtrace full
#0 0x00007f868f35c0a1 in sna_pixmap_set_dri (pixmap=0x7f8694200b20, sna=0x7f8692908010) at ../../../src/sna/sna_dri.c:186
        priv = 0x7f8694205160
        tiling = 1
#1 sna_dri_create_buffer (draw=<optimized out>, attachment=0, format=32) at ../../../src/sna/sna_dri.c:280
        sna = 0x7f8692908010
        buffer = <optimized out>
        pixmap = <optimized out>
        bo = <optimized out>
        flags = 1
        size = <optimized out>
        bpp = <optimized out>
#2 0x00007f8692b5ae5c in ?? ()
No symbol table info available.
#3 0x00007f8692b5b7f6 in ?? ()
No symbol table info available.
#4 0x00007f8692b5bc10 in DRI2GetBuffersWithFormat ()
No symbol table info available.
#5 0x00007f8692b5d690 in ?? ()
No symbol table info available.
#6 0x00007f8692a32a41 in ?? ()
No symbol table info available.
#7 0x00007f8692a2154a in ?? ()
No symbol table info available.
#8 0x00007f869072eea5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#9 0x00007f8692a21891 in _start ()
No symbol table info available.

Zygmunt Krynicki (zyga) wrote :

(gdb) info locals
priv = 0x7f8694205160
tiling = 1
(gdb) print *priv
$3 = {
  pixmap = 0x7f8694200b20,
  gpu_bo = 0x0,
  cpu_bo = 0x7f8693e673a0,
  gpu_damage = 0x7f8693b996e1,
  cpu_damage = 0x0,
  ptr = 0x7f868eb58000,
  list = {
    next = 0x7f8694205190,
    prev = 0x7f8694205190
  },
  stride = 1280,
  clear_color = 0,
  flush = 1,
  source_count = 4,
  pinned = 0 '\000',
  create = 11 '\v',
  mapped = 0 '\000',
  shm = 0 '\000',
  clear = 0 '\000',
  header = 0 '\000',
  cpu = 0 '\000'
}

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
Chris Wilson (ickle) wrote :

That explains how we got through move_to_gpu, but doesn't reveal how we set it to be NULL in the first place. Thanks.

Do you have a better idea of how to trigger it now?

Zygmunt Krynicki (zyga) wrote :

I don't really know what triggers it. I tried a number of activities - launching steam and immediately exiting it (via menu, indicator, or accelerator), launching a game (I only tried uplink) and exiting. In the end I don't see a pattern. I saw crashes immediately after starting and long after keeping steam open in the background.

The only thing that I could add that is 'odd' is that I have not re-enabled workspaces after reinstalling the system from scratch.

Bryce Harrington (bryce)
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Triaged
summary: - Xorg crashes when exiting steam
+ Xorg crashes when exiting steam segfault in sna_pixmap_set_dri()
Chris Wilson (ickle) wrote :

I am almost 100% sure I have this fixed in upstream. I've been through all the places where gpu_bo/gpu_damage can become out-of-sync and validated that the checks are now in place. If you can find the time to test with ppa:xorg-edgers (dated from today) that would be extremely useful.
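
The gpu_bo/gpu_damage mismatch is visible in the earlier `print *priv` output, where gpu_damage is non-NULL while gpu_bo is 0x0. As a rough sketch, a gdb struct dump can be checked for that state mechanically. The invariant encoded below (GPU damage implies a GPU bo) is an assumption inferred from this thread, not the driver's own assertion, and parse_pointers and check_gpu_state are hypothetical helper names.

```python
import re

# Field values copied from the `print *priv` output earlier in this report.
GDB_DUMP = """
  pixmap = 0x7f8694200b20,
  gpu_bo = 0x0,
  cpu_bo = 0x7f8693e673a0,
  gpu_damage = 0x7f8693b996e1,
  cpu_damage = 0x0,
"""


def parse_pointers(dump):
    """Map field name -> integer for every `name = 0x...` line in the dump."""
    return {name: int(value, 16)
            for name, value in re.findall(r"(\w+) = (0x[0-9a-f]+)", dump)}


def check_gpu_state(fields):
    """Assumed invariant: GPU damage may only be tracked when a GPU bo exists."""
    problems = []
    if fields.get("gpu_damage", 0) != 0 and fields.get("gpu_bo", 0) == 0:
        problems.append("gpu_damage is set but gpu_bo is NULL")
    return problems


fields = parse_pointers(GDB_DUMP)
problems = check_gpu_state(fields)
print(problems)
```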

Changed in xserver-xorg-video-intel (Ubuntu):
assignee: nobody → Chris Wilson (ickle)
Zygmunt Krynicki (zyga) wrote :

Testing, I'll let you know if it crashes

Zygmunt Krynicki (zyga) wrote :

It crashed again.

For reference, I've added the PPA, updated everything, logged out, and checked that I don't have an xorg config file (so that I keep getting sna by default).

Program received signal SIGSEGV, Segmentation fault.
0x00007f79d99ab5d0 in sna_pixmap_set_dri (pixmap=0x7f79de9d5220, sna=0x7f79dcf58010)
    at ../../../src/sna/sna_dri.c:186
186 ../../../src/sna/sna_dri.c: No such file or directory.
(gdb) backtrace full
#0 0x00007f79d99ab5d0 in sna_pixmap_set_dri (pixmap=0x7f79de9d5220, sna=0x7f79dcf58010)
    at ../../../src/sna/sna_dri.c:186
        priv = 0x7f79dead6940
        tiling = -1
#1 sna_dri_create_buffer (draw=<optimized out>, attachment=0, format=32)
    at ../../../src/sna/sna_dri.c:274
        sna = 0x7f79dcf58010
        buffer = <optimized out>
        pixmap = <optimized out>
        bo = <optimized out>
        flags = 1
        size = <optimized out>
        bpp = <optimized out>
#2 0x00007f79dd1aae5c in ?? ()
No symbol table info available.
#3 0x00007f79dd1ab7f6 in ?? ()
No symbol table info available.
#4 0x00007f79dd1abc10 in DRI2GetBuffersWithFormat ()
No symbol table info available.
#5 0x00007f79dd1ad690 in ?? ()
No symbol table info available.
#6 0x00007f79dd082a41 in ?? ()
No symbol table info available.
#7 0x00007f79dd07154a in ?? ()
No symbol table info available.
#8 0x00007f79dad7eea5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#9 0x00007f79dd071891 in _start ()
No symbol table info available.
(gdb)

The gdb session is still active, ask for more data if you want

Chris Wilson (ickle) wrote :

Frustrating. Can I ask you to build from scratch with debugging enabled:

# sudo apt-get build-dep xserver-xorg-video-intel
# git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel
# cd xf86-video-intel
# ./autogen.sh --prefix=/usr --with-default-accel=sna --enable-debug
# make && sudo make install

That will (temporarily) replace your intel_drv.so with assertions enabled. It is likely to crash much earlier - hopefully closer to the actual culprit.

Chris Wilson (ickle) wrote :

Whilst you still have the gdb session open, please run p *priv again.

Zygmunt Krynicki (zyga) wrote :

(gdb) p *priv
$2 = {
  pixmap = 0x7f79de9d5220,
  gpu_bo = 0x0,
  cpu_bo = 0x7f79dead4330,
  gpu_damage = 0x7f79de849421,
  cpu_damage = 0x0,
  ptr = 0x7f79d6499000,
  list = {
    next = 0x7f79dead6970,
    prev = 0x7f79dead6970
  },
  stride = 1280,
  clear_color = 0,
  source_count = 4,
  pinned = 0 '\000',
  create = 11 '\v',
  mapped = 0 '\000',
  flush = 0 '\000',
  shm = 0 '\000',
  clear = 0 '\000',
  header = 0 '\000',
  cpu = 0 '\000'
}

I'll build the tree and get back to you

Zygmunt Krynicki (zyga) wrote :

Ok, built and ready, trying to get it to crash again

Zygmunt Krynicki (zyga) wrote :

Nothing so far. Is there a chance it got fixed along the way? Do you want me to try bisecting things?

Chris Wilson (ickle) wrote :

I'd give it a bit of time before declaring success. Since you reported the issue I have been trying to add some paranoia to that layer and harden it, which is why I was optimistic I had fixed something. Can you tell me the package name you installed from xorg-edgers, so that I can check to see what changes had been made since?

Zygmunt Krynicki (zyga) wrote :

I didn't install any package, just updated all my packages which included xserver-xorg-video-intel. The version apt reports is 2:2.21.2+git20130213.9861423a-0ubuntu0sarvatt. This is an upgrade from 2:2.21.2-0ubuntu0
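
The version string above appears to embed the snapshot date and abbreviated commit (`+git20130213.9861423a`), which is what lets the later changes be narrowed down. A small sketch of splitting such a string follows, assuming the `+gitYYYYMMDD.<sha>` layout seen in this one package version; the pattern is inferred from that example only and parse_version is a hypothetical helper.

```python
import re

VERSION_RE = re.compile(
    r"(?:(?P<epoch>\d+):)?"                            # optional Debian epoch
    r"(?P<upstream>[^+-]+)"                            # upstream release, e.g. 2.21.2
    r"(?:\+git(?P<date>\d{8})\.(?P<sha>[0-9a-f]+))?"   # optional git snapshot marker
    r"-(?P<revision>.+)"                               # packaging revision
)


def parse_version(version):
    """Split a package version into its named parts (missing parts are None)."""
    m = VERSION_RE.fullmatch(version)
    if m is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return m.groupdict()


info = parse_version("2:2.21.2+git20130213.9861423a-0ubuntu0sarvatt")
print(f"snapshot {info['date']} at commit {info['sha']}")
```

With a clone of the driver repository, a range such as `git log 9861423a..master` would then list the commits made after that snapshot, provided the abbreviated hash resolves in that repository.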

Chris Wilson (ickle) wrote :

Thanks, that suggests

commit 1f16d854264ea923303b79379266bd789fd9dd4d
Author: Chris Wilson <email address hidden>
Date: Mon Feb 18 14:30:55 2013 +0000

    sna/dri: Prevent swapping a decoupled DRI2Buffer

    If the DRI2Buffer is no longer valid for the Drawable, for example the
    window had just been reparent, just complete the swap without triggering
    any assertions.

as being the most likely fix. Doesn't quite fit the symptoms though, so I'm still worried there is another bug lurking. :|

Zygmunt Krynicki (zyga) wrote :

I can revert that and test if it exposes the issue.

Zygmunt Krynicki (zyga) wrote :

That was quick. I didn't have to exit steam though, just launch it:

Program received signal SIGABRT, Aborted.
0x00007ff335664037 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ff335664037 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ff335667698 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ff33565ce03 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007ff33565ceb2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007ff33426d164 in sna_dri_schedule_swap (client=0x7ff339c4bab0, draw=0x7ff33a09b140,
    front=0x7ff339cb1100, back=0x7ff339b97700, target_msc=0x7fffa3cb6a28, divisor=0, remainder=0,
    func=0x7ff337a7cef0, data=0x7ff33a09b140) at sna_dri.c:2140
#5 0x00007ff337a7c224 in DRI2SwapBuffers ()
#6 0x00007ff337a7d3b4 in ?? ()
#7 0x00007ff337952a41 in ?? ()
#8 0x00007ff33794154a in ?? ()
#9 0x00007ff33564eea5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x00007ff337941891 in _start ()
(gdb) git backtrace
Undefined command: "git". Try "help".
(gdb) backtrace full
#0 0x00007ff335664037 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007ff335667698 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2 0x00007ff33565ce03 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#3 0x00007ff33565ceb2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#4 0x00007ff33426d164 in sna_dri_schedule_swap (client=0x7ff339c4bab0, draw=0x7ff33a09b140,
    front=0x7ff339cb1100, back=0x7ff339b97700, target_msc=0x7fffa3cb6a28, divisor=0, remainder=0,
    func=0x7ff337a7cef0, data=0x7ff33a09b140) at sna_dri.c:2140
        screen = <optimized out>
        scrn = <optimized out>
        sna = 0x7ff337828010
        vbl = {request = {
            type = (DRM_VBLANK_HIGH_CRTC_MASK | DRM_VBLANK_EVENT | DRM_VBLANK_NEXTONMISS | DRM_VBLANK_SECONDARY | unknown: 58884096), sequence = 32755, signal = 140682528728598}, reply = {
            type = (DRM_VBLANK_HIGH_CRTC_MASK | DRM_VBLANK_EVENT | DRM_VBLANK_NEXTONMISS | DRM_VBLANK_SECONDARY | unknown: 58884096), sequence = 32755, tval_sec = 140682528728598,
            tval_usec = 4230521311723520}}
        pipe = <optimized out>
        info = 0x0
        current_msc = 8
        __PRETTY_FUNCTION__ = "sna_dri_schedule_swap"
#5 0x00007ff337a7c224 in DRI2SwapBuffers ()
No symbol table info available.
#6 0x00007ff337a7d3b4 in ?? ()
No symbol table info available.
#7 0x00007ff337952a41 in ?? ()
No symbol table info available.
#8 0x00007ff33794154a in ?? ()
No symbol table info available.
#9 0x00007ff33564eea5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#10 0x00007ff337941891 in _start ()
No symbol table info available.

Chris Wilson (ickle) wrote :

That's one of the assertions recently added to track down an issue whereby the flushing flag on the pixmap and the GPU bo became inconsistent - and yes the patch you reverted was a direct consequence. ;-)

Can you keep running master with assertions enabled and let me know if it crashes?

Zygmunt Krynicki (zyga) wrote :

Ok, I'll undo the revert and keep pushing it

Chris Wilson (ickle)
Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Fix Committed
bugbot (bugbot)
tags: added: crash
Chris Wilson (ickle)
Changed in xserver-xorg-video-intel (Ubuntu):
status: Fix Committed → Fix Released