xorg segfaults in libglx.so(__glXleaveServer+0x22)

Bug #60288 reported by Tormod Volden on 2006-09-13
Affects               Status        Importance  Assigned to  Milestone
X.Org X server        Fix Released  Medium
xorg-server (Ubuntu)                High        Unassigned
Edgy                                Medium      Unassigned

Bug Description

Xorg suddenly dies, often when I leave it alone for some time (makes me think there is something with the screensaver kicking in):

Backtrace:
0: /usr/X11R6/bin/X(xf86SigHandler+0x81) [0x80c3861]
1: [0xffffe420]
2: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c9a946]
3: /usr/lib/xorg/modules/extensions/libglx.so(__glXleaveServer+0x22) [0xb7c76c62]
4: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c772fe]
5: /usr/X11R6/bin/X(Dispatch+0x18f) [0x808690f]
6: /usr/X11R6/bin/X(main+0x485) [0x806e715]
7: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc) [0xb7db98cc]
8: /usr/X11R6/bin/X(FontFileCompleteXLFD+0xa1) [0x806da51]

edgy
xserver-xorg-core 1.1.1-0ubuntu10
xserver-xorg-video-savage 2.1.1-0ubuntu2

Tormod Volden (tormodvolden) wrote :

I accidentally found this apport log. Please tell me if you want the full coredump.

Unless aiglx is explicitly turned off, I find a dual head session that is not in
xinerama crashes the X server every time, a few seconds after both screens are up.

There is no sign of anything wrong - windows can be opened and moved around, or
nothing done at all, then after 5-10s, the server crashes. Happens every time.

This is an i915 in a Dell inspiron laptop.

logs would be useful.

Created an attachment (id=7340)
Xorg log on dual head startup with crash

Here's the log from an X server startup where all looks good until a few
seconds after it's up, and then it crashes.

Created an attachment (id=7341)
log from subsequent start-up attempt

This is the log from the next attempt to start the X server.

The second log is similar to your other bug report, and it's only caused because
of the first server crash.

Now, as for the first server crash you are going to have to run the Xserver
under gdb, and make sure you've compiled the Xserver yourself with debugging
flags enabled. Then get a real backtrace of where the crash is happening.

I will try to recompile the server with -g and get a real backtrace.

I filed these separately because they seemed distinct - with AIGLX enabled,
it crashes every time (even immediately after booting), but not till a few
seconds after it has started up.

Without AIGLX, it seems to either run ok, or not start up at all.

Is there a trick to starting the X server under gdb?

with:
gdb /usr/bin/X
gdb> run -ignoreABI

The server says:
(==) Using config file: "/etc/X11/xorg.conf"
[tcsetpgrp failed in terminal_inferior: Operation not permitted]
(WW) module ABI major version (0) doesn't match the server's version (1)
(WW) I810: No matching Device section for instance (BusID PCI:0:2:1) found
PIPECONF (1) BEFORE 0x80000000
DSPCNTR (1) BEFORE 0x49000000
PIPECONF (1) AFTER 0x80000000
DSPCNTR (1) AFTER 0xc9000000
I830InitVideo
I830SetupImageVideoOverlay
I830ResetVideo: base: 0xa78f6000, offset: 0xfffa000, obase: 0xb78f0000
Original gamma: 0x80808 0x101010 0x202020 0x404040 0x808080 0xc0c0c0
Bounded gamma: 0x80808 0x101010 0x202020 0x404040 0x808080 0xc0c0c0
I830SetupImageVideoOverlay

and then locks up. The tcsetpgrp line seems to be new.

I was able to attach the debugger after starting up X. When started like this
(directly starting /usr/bin/X), the server starts up fine and stays up. But
when started with startx (configured in gentoo to start a gnome session), it
crashes after a few seconds.

(Compiled with -O2 -g -march=pentium-m)

This brings to mind another bug I didn't file here because it didn't seem to be
a server problem - but it may be a piece of the puzzle:

If I enable xinerama, the screen resolutions seem to get reported all wrong
somewhere - ie, I can't move windows to some parts of the combined screen (lower
and farthest right). But that also only happens if the server is started from
startx to start a gnome session. If it's started from gdm or with /usr/bin/X it
seems to behave properly.

A difference though is that if AIGLX is enabled with two heads, non xinerama -
it crashes from startx or from gdm, but not if started with /usr/bin/X.

Back to the issue at hand. With AIGLX enabled without xinerama, the server now
spits out:

Backtrace:
0: X(xf86SigHandler+0x84) [0x80b85e8]
1: [0xffffe420]
2: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c65beb]
3: /usr/lib/xorg/modules/extensions/libglx.so(__glXleaveServer+0x22) [0xb7c41d52]
4: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c4236e]
5: X(Dispatch+0x19b) [0x8086c0b]
6: X(main+0x488) [0x806e608]
7: /lib/libc.so.6(__libc_start_main+0xd8) [0xb7ce7878]
8: X(FontFileCompleteXLFD+0xad) [0x806d931]

and gdb says:

Program received signal SIGSEGV, Segmentation fault.
0xb7efa4bd in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1399
1399 dri.c: No such file or directory.
        in dri.c
(gdb) bt
#0 0xb7efa4bd in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1399
#1 0xb7c0bbeb in __glXDRIleaveServer () at glxdri.c:146
#2 0xb7be7d52 in __glXleaveServer () at glxext.c:447
#3 0xb7be836e in __glXDispatch (client=0x8507210) at glxext.c:520
#4 0x08086c0b in Dispatch () at dispatch.c:459
#5 0x0806e608 in main (argc=7, argv=0xbfe2d004, envp=0x0) at main.c:447

O.k. so in your sources in xserver/hw/xfree86/dri/dri.c at line 1399 what does
it say ??

line 1399 is the third line of DRIDoBlockHandler:

if (pDRIPriv->pDriverInfo->driverSwapMethod == DRI_HIDE_X_CONTEXT) {

Created an attachment (id=7372)
Set driverprivate to NULL at startup

Can you try the patch I just posted in the previous comment.

It applies to the dri.c file. Make sure you install the new libdri.so that gets
built.

Created an attachment (id=7373)
Set driverprivate to NULL

Oops. Use this one instead as that last patch is bogus.

Created an attachment (id=7374)
Set driverprivate to NULL

Ugh. Now I go grab a coffee. Use this one as the test.

Created an attachment (id=7385)
log with patched dri.c

I'm afraid it still crashes... log attached.

O.k. when in gdb you'll need to print the values of the following when the crash
occurs.

So do...

print pDRIPriv
print pDRIPriv->pDriverInfo
print pDRIPriv->pDriverInfo->driverSwapMethod

and let me know what the results are. You'll need to have started X with gdb, rather
than attaching to it.

I can't seem to run the X server from gdb directly - there are actually two
issues. One is that when I start X from within gdb, it never starts - it
locks up. The second issue is that starting X directly from the command line
doesn't reproduce the crash - it only happens when started by a gnome-session.

Attaching to the process, I find:

Program received signal SIGSEGV, Segmentation fault.
0xb7f0149d in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1404
1404 if (pDRIPriv->pDriverInfo->driverSwapMethod == DRI_HIDE_X_CONTEXT) {
(gdb) print pDRIPriv
$1 = (DRIScreenPrivPtr) 0x0
(gdb) print pDRIPriv->pDriverInfo
$2 = (DRIInfoPtr) 0xf000eec2
(gdb) print pDRIPriv->pDriverInfo->driverSwapMethod
Cannot access memory at address 0xf000ef3e
(gdb)
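
A note on reading these values: since pDRIPriv is 0x0, the 0xf000eec2 shown for pDRIPriv->pDriverInfo was fetched through a null base pointer and is not a valid pointer. The address gdb then failed to read, 0xf000ef3e, is 0x7c bytes past it, which would be consistent with driverSwapMethod sitting at offset 0x7c in the record. The C sketch below only illustrates that base-plus-offset arithmetic; the struct, its padding, and the function name are hypothetical and are not the real DRIInfoRec layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver info record. The padding is chosen
 * only so that the field lands 0x7c bytes in, matching the gdb addresses. */
struct demo_driver_info {
    char padding[0x7c];
    int  driver_swap_method;
};

/* Address a debugger has to read to fetch a field: base pointer + offset. */
uintptr_t field_address(uintptr_t base, size_t offset)
{
    return base + offset;
}
```

Feeding the bogus base 0xf000eec2 and the offset of driver_swap_method into field_address gives the address from the "Cannot access memory" message.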

Created an attachment (id=7393)
check dri private

Try this one.

ok, so now when it crashes:

Program received signal SIGSEGV, Segmentation fault.
0xb7f594b5 in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1417
1417 DRM_SPINUNLOCK(&pDRIPriv->pSAREA->drawable_lock, 1);
(gdb) list
1412 DRI_2D_CONTEXT,
1413 pDRIPriv->partial3DContextStore);
1414 }
1415
1416 if (pDRIPriv->windowsTouched)
1417 DRM_SPINUNLOCK(&pDRIPriv->pSAREA->drawable_lock, 1);
1418 pDRIPriv->windowsTouched = FALSE;
1419
1420 DRIUnlock(pScreen);
1421 }
(gdb)

So I made the obvious change of putting lines 1416-1420 inside:
if (pDRIPriv){

}

and tried again and it no longer crashes.

Thanks for all your help.

I left in the earlier patch setting driverprivate to NULL.
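
The guard described above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the actual dri.c change: the struct, its fields, and the function name are hypothetical, and the counter merely stands in for the DRM_SPINUNLOCK call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the per-screen DRI private. */
struct demo_dri_priv {
    bool windows_touched;
    int  unlock_count;   /* counts how often the "unlock" path ran */
};

/* Returns true if the handler did its work, false if it skipped a screen
 * that has no DRI private (the case that crashed in this report). */
bool demo_block_handler(struct demo_dri_priv *priv)
{
    if (priv == NULL)
        return false;            /* screen without DRI: nothing to unlock */

    if (priv->windows_touched)
        priv->unlock_count++;    /* stands in for DRM_SPINUNLOCK(...) */
    priv->windows_touched = false;
    return true;
}
```

Returning early keeps every later dereference of the private behind a single check, which has the same effect as wrapping lines 1416-1420 in an if block.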

*** Bug 8554 has been marked as a duplicate of this bug. ***

Here is gdb output from a core dump, with xserver-xorg-core-dbgsym installed.
#0 0xb7cbc17b in DRIDoBlockHandler (screenNum=0, blockData=0x0, pTimeout=0x0, pReadmask=0x0) at ../../../../hw/xfree86/dri/dri.c:1432
#1 0xb7c75946 in __glXDRIleaveServer () at ../../../GL/glx/glxdri.c:146
#2 0xb7c51c62 in __glXleaveServer () at ../../../GL/glx/glxext.c:461
#3 0xb7c522fe in __glXDispatch (client=0x829e408) at ../../../GL/glx/glxext.c:534
#4 0x0808690f in Dispatch () at ../../dix/dispatch.c:459
#5 0x0806e715 in main (argc=10, argv=0xbfbede34, envp=0x20) at ../../dix/main.c:479

Tormod Volden (tormodvolden) wrote :

Looking at the core dump data, I see that pDRIPriv is null, which is plausible if one looks at the DRI_SCREEN_PRIV macro. I am therefore trying to wrap all the pDRIPriv->... manipulations inside an "if (pDRIPriv)" clause.

I also discovered an upstream bug with the same backtrace (and a similar fix reported working) although they talk about dual heads. I have just a simple laptop screen.
https://bugs.freedesktop.org/show_bug.cgi?id=8537
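
To illustrate how a lookup like DRI_SCREEN_PRIV can legitimately hand back a null pointer, here is a small C sketch. It assumes the macro is an index-based lookup into a per-screen table of private pointers; all names below are hypothetical and the real macro may differ in detail.

```c
#include <stddef.h>

/* Hypothetical screen record holding a small table of private pointers. */
struct demo_screen {
    void *dev_privates[4];
};

/* Hypothetical slot index; -1 means no slot was ever allocated for DRI. */
int demo_dri_priv_index = -1;

/* Returns the DRI private for a screen, or NULL if DRI was never set up,
 * either globally (no slot) or for this particular screen (empty slot). */
void *demo_screen_priv(const struct demo_screen *screen)
{
    if (demo_dri_priv_index < 0)
        return NULL;
    return screen->dev_privates[demo_dri_priv_index];
}
```

Under that assumption, any caller that runs for every screen has to treat NULL as a normal result rather than an impossible one.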

Changed in xorg-server:
status: Unknown → Confirmed
Tormod Volden (tormodvolden) wrote :

I made a debug build with this patch applied, and I am now stress-testing it with a gnome-screensaver-command -a / -d loop. No crash so far. On the other hand I don't see any occurrence of the little debug message that I put into the patch. It should appear in Xorg.0.log, right?

O.k. This is already fixed in the git repository for the upcoming Xorg 7.2 release.

Changed in xorg-server:
status: Confirmed → Fix Released

The Edgy install of my parents was reported to crash xorg 80% of the times when starting OpenOffice.org. The backtrace from the xorg log they sent me looks familiar:

Backtrace:
0: /usr/X11R6/bin/X(xf86SigHandler+0x81) [0x80c3971]
1: [0xffffe420]
2: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c3d946]
3: /usr/lib/xorg/modules/extensions/libglx.so(__glXleaveServer+0x22) [0xb7c19c62]
4: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c1a2fe]
5: /usr/X11R6/bin/X(Dispatch+0x18f) [0x808693f]
6: /usr/X11R6/bin/X(main+0x485) [0x806e715]
7: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc) [0xb7d5f8cc]
8: /usr/X11R6/bin/X(FontFileCompleteXLFD+0xa1) [0x806da51]

This is a Matrox G450 AGP with two heads, but only one is used.

Is this the same bug?

Magnes (magnesus2) wrote :

I have the same problem. Edgy, xserver-xorg-core 1.1.1ubuntu12

xorg log:
Backtrace:
0: /usr/X11R6/bin/X(xf86SigHandler+0x81) [0x80c3971]
1: [0xffffe420]
2: /usr/lib/libXfont.so.1(FontFileAddFontFile+0x252) [0xb7f180a2]
3: /usr/lib/libXfont.so.1(FontFileReadDirectory+0x83f) [0xb7f169df]
4: /usr/lib/libXfont.so.1(FontFileInitFPE+0x2f) [0xb7f1aa8f]
5: /usr/X11R6/bin/X [0x8087d68]
6: /usr/X11R6/bin/X(SetDefaultFontPath+0x79) [0x8087e69]
7: /usr/X11R6/bin/X(SetFontPath+0x2f) [0x8087ebf]
8: /usr/X11R6/bin/X(main+0x39e) [0x806e62e]
9: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc) [0xb7d958cc]
10: /usr/X11R6/bin/X(FontFileCompleteXLFD+0xa1) [0x806da51]

Fatal server error:
Caught signal 8. Server aborting

It crashes in random moments, several times a day.

Tormod Volden (tormodvolden) wrote :

I backported the upstream fix to edgy, and you can download a test package here (it's a debug build):
http://tormod.webhop.org/linux/xorg-server/
Can Martijn and Magnes try it out? Otherwise I expect it to be fixed in xorg 7.2 which will release soon, and eventually enter feisty.

Martijn Vermaat (mvermaat) wrote :

I will try to reproduce and test, but it's not my machine and I think meanwhile I changed something in the configuration that made this bug disappear (I seem to remember it was disabling direct rendering). I'll try to check through my parents.

Martijn Vermaat (mvermaat) wrote :

After re-enabling DRI the bug could again be reproduced. I then compiled my own package with the same patches and we were not able to reproduce the bug anymore. So I think your backport fixes this. Thanks!

Tormod Volden (tormodvolden) wrote :

For the record, here is the one patch I used in my latest build.

Magnes (magnesus2) wrote :

After installing xserver-xorg-core_1.1.1-0ubuntu12tormod2_i386.deb (or the tormod version) XOrg crashes almost immediately after start. I will attach a log soon.

Magnes (magnesus2) wrote :

OK. I managed to run tormod2 version of xserver by reinstalling nvidia driver after installing it. I'll test if it's stable for several days and let you know.

Magnes (magnesus2) wrote :

My XOrg is stable now. No more crashes. The patch seems to work.

Tormod Volden (tormodvolden) wrote :

I suggest to have this backported or SRU'ed to Edgy, since it seems to touch several brands of cards. Usually the xserver does its own error trapping, so these crashes are not reported by apport.

Tormod Volden (tormodvolden) wrote :

Magnes, the "tormod2" version was a debug build without optimization. I rebuilt a normal version (based on the newer ubuntu12.1 version) with the patch from comment #13. You can download it from the same place, I called it *ubuntu12-1tormod_i386.deb

Of course, if you experience any crashes again, please install the debug version so that we can get useful stack traces.

Magnes (magnesus2) wrote :

It's a little strange. The normal version works but my system still isn't stable. I'm not sure if it's related to XOrg though - I need to give it more testing. It may be Beryl related or truecrypt related, or it may be something with my hard drive configuration.
Sometimes XOrg (and sometimes even the whole system) crashes when I copy files. Copying also slows the system when I copy files from my 320GB Seagate system hard drive rather than from encrypted partitions, so it's probably not truecrypt's fault. Now it crashes only when I do many things at once (for example starting several programs at once or copying files in two applications, especially with Krusader, which slows my system).
It is not as frequent or as random as it was before I started using the patched version of xorg-server, though, and there is NOTHING in the xorg log (as if it hadn't crashed at all but had been closed normally). Also my Firefox is crashing a lot. It's a little wearing. :(
My old Dapper installation on other partition (other hard drive) works fine.
(sorry for my English)

Magnes (magnesus2) wrote :

By writing that there is nothing in the xorg log I meant that there is no indication of a crash - the old log looks exactly like the new log when I restart xorg.

Tormod Volden (tormodvolden) wrote :

And if you disable DRI you never get crashes, is that correct?

Tormod Volden (tormodvolden) wrote :

My patched version only addresses the issue that creates these segfaults with the above backtrace. If you no longer get any backtrace, this issue is probably solved by the patch. Unfortunately there are many other bugs related to DRI in Xorg. What kind of graphics card do you have, by the way?

Magnes (magnesus2) wrote :

I suppose the patch works, because there are no more backtraces like those in the first posts. I have an nvidia card (6200A). I will try to install newer drivers, maybe that will help (and read the forum on nvidia.com, maybe there is some issue with the drivers). Thanks for the patch. It looks like this bug is solved but (as I stated above) my system is not in good shape, so I can't say that for sure.

Tormod Volden (tormodvolden) wrote :

Confirmed on Matrox G450 AGP and Nvidia 6200A by comments, in addition to my S3 Savage TwisterK.

Changed in xorg-server:
status: Unconfirmed → Confirmed
Martin Pitt (pitti) wrote :

Please get this fixed and tested widely in Feisty first before requesting an SRU.

Tormod Volden (tormodvolden) wrote :

Makes sense. Is there something we can do to have this fixed in Feisty as soon as possible? (I am running the patched Edgy version in Feisty meanwhile.)

Magnes (magnesus2) wrote :

My system is very stable now (also with patched Edgy); the other problems that I had were caused by bad CPU cooling. The patch works flawlessly.

Let's try to get this into feisty, while they decide on 7.2?

Changed in xorg-server:
importance: Undecided → High

see debdiff.

patch seems to work fine here.

Kees Cook (kees) wrote :

I've uploaded this fix for feisty. Thanks for tracking down the patch for it!

Changed in xorg-server:
status: Confirmed → Fix Released
importance: Undecided → Medium
status: Unconfirmed → In Progress
Martin Pitt (pitti) wrote :

So, does anyone want to take care of the Edgy SRU? Please set a proper assignee and submit an SRU proposal according to the Policy.

Unsubscribing SRU team until then.

Matt Price (matt-price) wrote :

I'm getting what i believe is this same bug in an almost-up-to-date feisty (last update 8 march 2007). xorg dies with this in Xorg.0.log.old:
----
Backtrace:
0: /usr/bin/X(xf86SigHandler+0x81) [0x80c5d81]
1: [0xb7fbe420]

Fatal server error:
Caught signal 11. Server aborting
----
was dying mysteriously at 11 minute intervals, apparently because the screensaver was turned on (i switched to 'random' from 'display photos', and promptly forgot i'd done so). I can now reliably produce the crash by trying to start beryl, googleearth, various screensavers, glxinfo, or any other 3d programs. i'm using
xserver-xorg-core 2:1.2.0-3ubuntu3
nvidia 9746 (using nvidia installer)
millions of other possible packages.

3d worked fine on this computer for about 2 months until about 2 weeks ago (time frames approximate); maybe this is a regression in recent xorg releases?

would like to include apport bug traces but not sure how to do that manually; please suggest a method if you think such a trace would be of use.

forum activity in the last couple of weeks suggest the bug may still persist for other users too:

Matt Price (matt-price) wrote :

also i see some suggestions that the real upstream bug for this issue may be:
https://bugs.freedesktop.org/show_bug.cgi?id=1753

matt

Timo Aaltonen (tjaalton) wrote :

I'm reopening this one for Feisty... The fix in that upstream bug will be applied to our xorg-server-1.2.

Changed in xorg-server:
status: Fix Released → In Progress
Timo Aaltonen (tjaalton) wrote :

hm, can't mark another upstream bug of the same component, oh well..

Timo Aaltonen (tjaalton) wrote :

Matt: can you test this deb:

http://users.tkk.fi/~tjaalton/xorg72/new/xserver-xorg-core_1.2.0-3ubuntu4_i386.deb

It should be on the archive by Monday, but just to be sure that it actually fixes something :) Here's the debdiff:

http://users.tkk.fi/~tjaalton/xorg72/new/xorg-server.debdiff

Timo Aaltonen (tjaalton) wrote :

Marking as fixed. Matt, if you still have problems, please file another bug.

Changed in xorg-server:
status: In Progress → Fix Released

hi timo, sorry not to have responded, but want to report that my
problems went away miraculously when i upgraded the nvidia driver --
this was before your recent update, so wasn't able to test the upgrade
as i was no longer able to trigger the crash. if i get similar results
again i'll file another bug. thanks so much for the swift update!

matt

On Thu, 2007-22-03 at 15:59 +0000, Timo Aaltonen wrote:
> Marking as fixed. Matt, if you still have problems, please file another
> bug.
>
> ** Changed in: xorg-server (Ubuntu)
> Status: In Progress => Fix Released
>
--
Matt Price
History Dept
University of Toronto

20after4 (twentyafterfour) wrote :

I'm having the same crash (when running anything that uses GLX)
 since the latest xorg update in edgy-security (1.1.1-0ubuntu12.2)

Backtrace:
 0: /usr/X11R6/bin/X(xf86SigHandler+0x81) [0x80c3971]
 1: [0xffffe420]

Fatal server error:
 Caught signal 11. Server aborting

-------------------------------------------------------------------------
This is with NVidia "restricted" drivers, worked fine before the update. I forced a "downgrade" to the old version and I still get the crash.
I'm going to try xserver-xorg-core_1.1.1-0ubuntu12.2tormod_i386.deb and/or reinstalling nvidia drivers. I will report back with the result of my testing.

20after4 (twentyafterfour) wrote :

ok, reinstalling the nvidia driver did the trick. It detected some files that were apparently overwritten by the recent xorg update from Ubuntu / Edgy Security. After installing the nvidia glx files everything seems to be back to normal.

Mircea Deaconu (mirceade) wrote :

Same problem on Feisty / Ati restricted drivers / Compiz. When a 3D screen-saver (GL Matrix) starts the X-server crashes and throws me back to the login screen.

Tormod Volden (tormodvolden) wrote :

If you don't have "glXleaveServer" in the backtrace, it's not useful to post in this bug (which has been closed). Please file a new bug.

It's not applied to 1.3 branch, master or elsewhere.

Yes it is.

If you've got a problem I suggest you open a new bug report.

Ok, show me the commit? The attached patch has been in ubuntu against 1.2 and applied just fine, ditto for 1.3 so if it has been fixed by other means then I'd like to know about it.

The patch here isn't needed as it's fixed elsewhere in the GLX code.

If you are experiencing a crash with X.Org 7.2 or later then I suggest you log a new bug with details.

No, that's all I needed to know, thanks!

Hi..
Sorry for my bad english........

I have an ATI RADEON 9600 with the restricted driver installed.
I use BERYL and it works fine, but X sometimes crashes without warning and the session restarts at the login window.

Where can I read the cause of the restart?

Please help me..
Denis

Tormod Volden (tormodvolden) wrote :

Denis, look in /var/log/Xorg.0.log (or /var/log/Xorg.0.log.old if an X session has restarted). Please file a new bug for your problem.

This is still an issue in gutsy, xorg 7.3

I'd guess the patch has been dropped.

Changed in xorg-server:
status: Fix Released → Confirmed
Changed in xorg-server:
status: In Progress → Confirmed
Bryce Harrington (bryce) wrote :

We can re-apply the same patch, if we can verify it's the same bug. Sarah, can you crash X and post a fresh backtrace from Xorg.0.log (or Xorg.0.log.old) showing the backtrace? It sounds like this patch is relevant only if "glXleaveServer" is present in the backtrace. If it isn't, then this is a new bug we should report upstream ASAP.
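
The check described above (is "glXleaveServer" present in the backtrace?) can be sketched as a small C helper. This is only an illustration; the function name is hypothetical, and it assumes the relevant part of Xorg.0.log has already been read into a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the log text contains the given symbol name anywhere.
 * A plain substring search is enough, because the backtrace lines print
 * symbols as "name+offset" inside parentheses. */
bool log_mentions_symbol(const char *log_text, const char *symbol)
{
    if (log_text == NULL || symbol == NULL)
        return false;
    return strstr(log_text, symbol) != NULL;
}
```

Applied to the backtraces in this report, the original one with the libglx.so(__glXleaveServer+0x22) frame would match, while the short two-frame nvidia backtraces posted later would not.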

Bryce Harrington (bryce) wrote :

Sarah,

I've re-rolled a new xorg-server package with this patch re-enabled. If you can confirm that the backtrace is showing "glXleaveServer", please try this new xorg-server and verify that it resolves the problem, and we'll push this out into Gutsy.

Thanks,
Bryce

Changed in xorg-server:
status: Confirmed → Needs Info

Attempting to make it crash again.

This seems to be like pets and children - they won't show the symptoms when there's someone around to help them.

grrr.

Will respond back when I find something, though.

I can't seem to reproduce this. Marking as fixed again.

Changed in xorg-server:
status: Incomplete → Fix Released
Timo Aaltonen (tjaalton) on 2008-01-25
Changed in xorg-server:
status: Confirmed → Won't Fix
Changed in xorg-server:
importance: Unknown → Medium
Changed in xorg-server:
importance: Medium → Unknown
Changed in xorg-server:
importance: Unknown → Medium