Both types of display list crash client during play.

Bug #245925 reported by Monkey
Affects              Status         Importance  Assigned to  Milestone
Armagetron Advanced  Fix Committed  Low         Manuel Moos
0.2.8                Fix Committed  High        Manuel Moos
Mesa                 Invalid        Critical

Bug Description

Error:
======

Arma client crashes after only a few seconds of play on any server.

Error message:
==============

"armagetronad-0.2.8-latest_alpha: tnl/t_draw.c:203: bind_inputs: Assertion `inputs[i]->BufferObj->Pointer' failed.
Aborted"

Client settings responsible:
============================

It crashes with display lists set to "Create and Call" OR "Create and Execute".
It does not crash with display lists turned "Off".

Most recent client versions I have tested and found bug in:
===========================================================

0.2.8_alpha20080801, 0.3_alpha20080801

My system:
==========

X.Org X Server 1.4.2
Release Date: 11 June 2008
X Protocol Version 11, Revision 0
Build Operating System: Slackware 12.1 Slackware Linux Project
Current Operating System: Linux slackware 2.6.24.5-smp Apr 30 2008 (i686)
Build Date: 30 June 2008
Graphics card: ATI Radeon 9200
Graphics card driver: xf86-video-ati-6.9.0
GL_RENDERER: Mesa DRI R200 20060602 AGP 4x x86/MMX+/3DNow!+/SSE TCL
GL_VENDOR: Tungsten Graphics, Inc.
GL_VERSION: 1.3 Mesa 7.0.2

I have also tried it on my older motherboard with onboard Intel i810 graphics, with the same crash and error.

Ask me more questions if needed.

Manuel Moos (z-man) wrote :

That assertion failure looks like an internal error of the Mesa driver, you should take it to them. I'll blacklist "Mesa DRI" for now (after testing whether software rendering is affected, too).

What are the GL_* strings for the Intel chip?

Manuel Moos (z-man) wrote :

Software rendering via Mesa is also affected. It just takes a bit longer to manifest itself.

In , Monkey (monkey-arma) wrote :

This happens on a game called Armagetron Advanced. All of the information you should need is here:

https://bugs.launchpad.net/armagetronad/+bug/245925

Note that I had issues trying to install a newer version of Mesa but I checked to see if this bug had already been reported and it did not seem to have been.

Monkey (monkey-arma) wrote :

I have just tested the i810 onboard graphics chip of my old motherboard, with identical settings and system as stated above.
Exactly the same thing happens with exactly the same error (error is again for both "Create and Call" AND "Create and Execute").

Display driver: xf86-video-i810-1.7.4
GL_RENDERER: Mesa DRI i810 20050821 x86/MMX/SSE
GL_VENDOR: Keith Whitwell
GL_VERSION: 1.2 Mesa 7.0.2

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

I can't reproduce this on a Debian system with the r300 driver, neither with a self-built pre-7.1 Git snapshot nor with the Debian 7.0.3 packages (which are actually based on a post 7.0.3 snapshot of the Git mesa_7_0_branch). Can you try a snapshot from either of these Git branches?

Note that Debian's version of armagetronad only knows 'On' or 'Off' for display lists, but I assume 'On' corresponds to one of the options that fail for you.

Monkey (monkey-arma) wrote :

I have now reported this bug to the Mesa developers, using Bugzilla.

Monkey (monkey-arma) wrote :

I have just tested the S3 ProSavage8 onboard graphics chip on my current motherboard, with identical settings and system as stated above.
Exactly the same thing happens with exactly the same error (error is again for both "Create and Call" AND "Create and Execute").

GL_RENDERER: Mesa DRI ProSavageDDR 20061110 AGP 1x x86/MMX+/3DNow!+/SSE
GL_VENDOR: S3 Graphics Inc.
GL_VERSION: 1.2 Mesa 7.0.2

By the way if anyone wants to track the bug report on Bugzilla, the link is:
https://bugs.freedesktop.org/show_bug.cgi?id=16984

In , Yann Kaiser (epsy) wrote :

You will need a recent version in order to reproduce it, the stable version of arma doesn't reproduce the bug.

To grab a copy of a more recent arma which reproduces on Monkey's hardware, use bzr co lp:armagetronad/0.2.8
then use the regular build process: ./bootstrap.sh && ./configure && make && make run

In , Monkey (monkey-arma) wrote :

Early versions of bzr (bazaar) may not have all of the functionality you need (Debian stable I think is too old). According to their site
http://bazaar-vcs.org/Download the current stable version should be available in apt repository (I would guess in testing or unstable).

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

(In reply to comment #2)
> To grab a copy of a more recent arma which reproduces on Monkey's hardware, use
> bzr co lp:armagetronad/0.2.8

Done, but I still can't reproduce this with either of the Mesa setups I mentioned or with either of the armagetronad settings mentioned. So it would definitely be good if you could try a newer Mesa snapshot or at least the 7.0.3 release.

In , Monkey (monkey-arma) wrote :

https://bugs.launchpad.net/armagetronad/+bug/245925 updated.

Also, track this thread as it grows: http://forums.armagetronad.net/viewtopic.php?t=18264&sid=a162d27d3661873b90576883119cdf9f
Note that "hoop" also has this problem. He uses Ubuntu and also he seems to be on Mesa 7.0.3-rc2. Some had no problems with versions of Mesa as low as mine.

If you are going to test then make sure you play on a server over the network with as many people on as possible. It took me 15 minutes sometimes to get the bug today. It also seems that maybe if you reduce your detail settings and preferences it may happen more quickly?

Finally, when I tried to install newer versions of Mesa, I discovered I would need Xorg 1.5 and things like that, which I don't have and which would be a nightmare for me (I'm no expert yet).

Monkey (monkey-arma) wrote :

1) I found out that this was commented out in my xorg.conf file so I uncommented it:

Section "Module"
...
Load "dri"
...
Section "DRI"
    Mode 0666
...

However, the bug still happens exactly as before, just it takes longer to manifest.

2) Setting display settings to defaults (I usually play with lower details) seemed to make the bug take even longer to manifest. Sometimes it took up to 15 minutes to happen.

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

Okay, I guess I just haven't tried hard enough to reproduce it.

Please attach a gdb backtrace from the assertion failure, preferably with Mesa debugging symbols.

In , Monkey (monkey-arma) wrote :

(In reply to comment #6)

For some reason, when running via gdb, Armagetron Advanced only crashes at the end of the round of play.

Anyway, here is a backtrace (I'm not sure how to get the Mesa debugging symbols yet). "Create and Call" and "Create and Execute" produce the same backtrace (other than the hex numbers):

#0 0xb7994c66 in raise () from /lib/libc.so.6
#1 0xb7996571 in abort () from /lib/libc.so.6
#2 0xb798de60 in __assert_fail () from /lib/libc.so.6
#3 0xb7582514 in _tnl_draw_prims () from /usr/lib/xorg/modules/dri/r200_dri.so
#4 0xb75817b9 in vbo_save_playback_vertex_list () from /usr/lib/xorg/modules/dri/r200_dri.so
#5 0xb7520493 in ?? () from /usr/lib/xorg/modules/dri/r200_dri.so
#6 0x08302e98 in ?? ()
#7 0x087e4738 in ?? ()
#8 0xbfa01608 in ?? ()
#9 0xb75203ee in ?? () from /usr/lib/xorg/modules/dri/r200_dri.so
#10 0x085dc970 in ?? ()
#11 0x087e4734 in ?? ()
#12 0x08302e98 in ?? ()
#13 0x00a98f7c in ?? ()
#14 0x000000a1 in ?? ()
#15 0xb7ab3128 in main_arena () from /lib/libc.so.6
#16 0xb7ab1ff4 in ?? () from /lib/libc.so.6
#17 0xb7a98f7c in ?? () from /lib/libc.so.6
#18 0xb7ab1ff4 in ?? () from /lib/libc.so.6
#19 0x00000000 in ?? ()

In , Jkrahn-nc (jkrahn-nc) wrote :

(In reply to comment #0)
> This happens on a game called Armagetron Advanced. All of the information you
> should need is here:
>
> https://bugs.launchpad.net/armagetronad/+bug/245925
>
> Note that I had issues trying to install a newer version of Mesa but I checked
> to see if this bug had already been reported and it did not seem to have been.
>

The location of the crash is easy to find from the debug message. I have come across this occasionally on software for which I do not have the source code. The problem seems to come from a NULL vertex (or other) triangle array pointer compiled into a display list. Maybe this occurs in the context of a "zero sized" object where the relevant pointer data is not actually referenced. If someone has the source code where this can be reproduced, maybe you can add some checks to the array pointer calls to look for NULLs.

Manuel Moos (z-man) wrote :

Epsy recorded one of those crashes while tracing OpenGL calls with bugle. Attached is:
- the recording
- the IRC chat log of the report
- the tail of the bugle log (it's big)
- the processed bugle log (run through batch/checkbugle.py, which verifies that display list generations and calls are legit and don't use undefined lists, and sums up all calls used inside display lists and glBegin/glEnd blocks)
- the crash backtrace

The logs show no sign that anything out of spec is going on. I'm pretty sure now the problem is with Mesa.
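
For readers who want a feel for this kind of trace check, here is a minimal sketch in Python. It is not batch/checkbugle.py and does not parse the real bugle log format; it assumes the trace has already been turned into (function name, argument tuple) pairs, and the function name check_trace is made up for illustration.

```python
def check_trace(calls):
    """Return a list of (index, message) anomalies found in a GL call trace.

    `calls` is a hypothetical, already-parsed trace: a sequence of
    (function_name, args_tuple) pairs. This is NOT the bugle log format.
    """
    defined = set()      # display list ids that have been fully recorded
    recording = None     # id of the list currently being recorded, if any
    in_begin = False     # True between glBegin and glEnd
    problems = []

    for i, (name, args) in enumerate(calls):
        if name == "glNewList":
            if recording is not None:
                problems.append((i, "glNewList while already recording"))
            recording = args[0]
        elif name == "glEndList":
            if recording is None:
                problems.append((i, "glEndList without glNewList"))
            else:
                defined.add(recording)
            recording = None
        elif name == "glCallList":
            # OpenGL silently ignores undefined lists, but for this check
            # we treat such a call as suspicious.
            if args[0] not in defined:
                problems.append((i, "glCallList on undefined list %d" % args[0]))
        elif name == "glDeleteLists":
            first, count = args
            defined.difference_update(range(first, first + count))
        elif name == "glBegin":
            if in_begin:
                problems.append((i, "nested glBegin"))
            in_begin = True
        elif name == "glEnd":
            if not in_begin:
                problems.append((i, "glEnd without glBegin"))
            in_begin = False

    if in_begin:
        problems.append((len(calls), "trace ends inside glBegin/glEnd"))
    if recording is not None:
        problems.append((len(calls), "trace ends while recording a list"))
    return problems
```

An empty result means the trace passed these particular checks, which is weaker than proving the application is fully within spec.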

All this was around bzr revision 926.

I somehow can't reproduce this any more with software Mesa. Too much stuff has changed on my systems, it seems.

Manuel Moos (z-man) wrote :

Hah! In a virtual machine with Kubuntu 8.10 and various Mesa versions installed from source, I got a recording that
- crashes after 15 seconds with Mesa 7.0.2
- runs through cleanly with Mesa 7.1, 7.2 and 6.5.3 (the last of series 6)
All with the same binary of Armagetronad, just the Mesa libraries were exchanged.

So, at least as far as software rendering is concerned, the problem lies with the 7.0.x series of Mesa, and it has been fixed by now. Please try to upgrade your version of Mesa to at least 7.1 and try again.

Manuel Moos (z-man) wrote :

Yeah, of course, seconds after the last comment my long term test of 7.1 crashed. So scratch that.

In , Manuel-moosnet (manuel-moosnet) wrote :

(Another Armagetron Advanced Developer reporting)

We're using custom C++ array classes to feed pointers to OpenGL; they always have a valid data pointer, unlike most STL implementations. Passing NULL pointers into OpenGL is definitely not the problem. I've been using BuGLe to trace our OpenGL usage recently, analyzing the resulting call traces with a small Python program for anomalies; the few it found (harmless ones, like calling glDeleteLists while recording another list, which should be legal) have been eliminated. glBegin/End pairs definitely match, all commands inside display lists are legal, and no undefined display lists are used.

What *may* be the problem is that we're using display lists as geometry caches in the new alpha versions (not in previous stable releases, there we only use them for static geometry). We have lots of geometry that is in principle dynamic, but doesn't change often on a frame-to-frame basis; so to optimize rendering, we put it into display lists and only update the lists when something changes. The way we do this can result in *empty* display lists from time to time.
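
To illustrate the caching scheme described above, here is a small model in Python. It is only a sketch: the real code is C++ and talks to OpenGL, whereas FakeGL below is a made-up stand-in that records commands, and GeometryCache, invalidate and render are hypothetical names.

```python
class FakeGL:
    """Stand-in for the display list API; records commands instead of drawing."""

    def __init__(self):
        self.lists = {}        # list id -> recorded commands
        self.next_id = 1
        self.recording = None  # id of the list being recorded, if any
        self.executed = []     # commands "drawn" so far

    def gen_list(self):
        list_id = self.next_id
        self.next_id += 1
        return list_id

    def new_list(self, list_id):
        self.recording = list_id
        self.lists[list_id] = []

    def end_list(self):
        self.recording = None

    def emit(self, command):
        # While recording, commands go into the list; otherwise they run.
        if self.recording is not None:
            self.lists[self.recording].append(command)
        else:
            self.executed.append(command)

    def call_list(self, list_id):
        self.executed.extend(self.lists.get(list_id, []))


class GeometryCache:
    """Caches rarely-changing geometry in one display list."""

    def __init__(self, gl):
        self.gl = gl
        self.list_id = gl.gen_list()
        self.dirty = True      # nothing recorded yet

    def invalidate(self):
        """Mark the cached geometry as out of date."""
        self.dirty = True

    def render(self, draw):
        # Re-record only when something changed, then replay the list.
        if self.dirty:
            self.gl.new_list(self.list_id)
            draw(self.gl)      # may emit nothing, giving an empty list
            self.gl.end_list()
            self.dirty = False
        self.gl.call_list(self.list_id)
```

If the draw callback has nothing to emit on the frame where the cache is refreshed, the list is recorded empty and then called, which is the situation mentioned above.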

Anyway, at least for software rendering, it seems Mesa 7.2 is no longer affected by this problem: https://bugs.launchpad.net/armagetronad/+bug/245925/comments/10
I'm asking the DRI users to verify this.

Assuming 7.2 turns out to be clean for them, too: If you still want us to get down to the root of the issue for 7.0.x and 7.1, I can reproduce the crash there rather reliably now with the debug recording attached to the linked comment.

Manuel Moos (z-man) wrote :

Ok, but 7.2 definitely is unaffected. It ran through the whole night with the test settings that crash 7.1 after a couple of minutes, and 7.1 crashes on playback of the resulting recording.

To get the crash, be sure to check out bzr revision 944/svn revision 8649 or earlier (later revisions just won't use display lists for affected Mesa versions) and compile with DEBUGLEVEL >= 1. Compilers that don't break the recording are GCC 4.3.2 from Kubuntu 8.10 and 4.2.3 from Kubuntu 8.04.

If you can, get the recording, get a version of Arma that can play it back, use it to get the crash, then upgrade your Mesa to 7.2 and try again.

Manuel Moos (z-man)
Changed in armagetronad:
assignee: nobody → z-man
importance: Undecided → High
status: New → Confirmed
Manuel Moos (z-man) wrote :

Seems like blacklisting the affected versions is all we can do.
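
As a rough illustration of what such a blacklist check can look like, here is a sketch in Python. It is not the code that was committed to the game (that is C++). It assumes the decision is made from the GL_VERSION string alone and that the affected range is Mesa 7.0.x and 7.1.x, as the tests in this report suggest; the function names are made up.

```python
import re

# Matches the Mesa part of strings such as "1.3 Mesa 7.0.2".
_MESA_VERSION = re.compile(r"Mesa (\d+)\.(\d+)(?:\.(\d+))?")


def mesa_version(gl_version):
    """Return the Mesa version as a (major, minor, patch) tuple, or None."""
    match = _MESA_VERSION.search(gl_version)
    if match is None:
        return None
    # A missing patch level (e.g. "Mesa 7.2") counts as 0.
    return tuple(int(part or 0) for part in match.groups())


def display_lists_blacklisted(gl_version):
    """Return True if display lists should be disabled for this driver.

    Assumption: only Mesa 7.0.x and 7.1.x are affected; 7.2 and later,
    the 6.x series and non-Mesa drivers are left alone.
    """
    version = mesa_version(gl_version)
    if version is None:
        return False
    return (7, 0, 0) <= version < (7, 2, 0)
```

With the strings from this report, "1.3 Mesa 7.0.2" and "1.2 Mesa 7.0.2" would be blacklisted, while a string reporting Mesa 7.2 would not.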

Changed in armagetronad:
status: Confirmed → Fix Committed
Changed in mesa:
importance: Undecided → Unknown
status: New → Unknown
Changed in mesa:
status: Unknown → Confirmed
In , Idr (idr) wrote :

This bug is against fairly old versions of Mesa that are no longer supported. Is this issue reproducible on more recent versions? The last comment seems to indicate that it may have been fixed in 7.2.

In , Idr (idr) wrote :

A month with no reply. Closing.

Changed in mesa:
status: Confirmed → Invalid
Changed in mesa:
importance: Unknown → Critical
Changed in mesa:
importance: Critical → Unknown
Changed in mesa:
importance: Unknown → Critical
Luke-Jr (luke-jr) wrote :

Probably doesn't matter, but "Display Lists" is under "Performance Tweaks" which are unsupported hacks that aren't necessarily *supposed* to work...

Changed in armagetronad:
importance: High → Low