(needs 9.1.3) r600_dri.so crashes in r600_texture_create_object (memset)

Bug #1179539 reported by Marco Trevisan (Treviño)
This bug affects 2 people
Affects: Mesa | Status: Fix Released | Importance: Medium
Affects: mesa (Ubuntu) | Status: Fix Released | Importance: Undecided | Assigned to: Unassigned

Bug Description

This is similar to bug #1174495 in that it is mostly triggered by the window manager, but other applications could trigger it too, so a proper fix is needed in Mesa as well.

The crash trace is at http://pastebin.ubuntu.com/5661477/ and the crash is now fixed in Mesa by commit http://cgit.freedesktop.org/mesa/mesa/commit/?id=b6920764207

Eugene Shalygin (eugene-shalygin) wrote:

KWin crashes with current Mesa and kernel 3.8. It does not crash with kernel 3.7. With Mesa 9.0.2 and kernel 3.8, KWin also works fine.

#5 0x0000003647c892ab in __memset_sse2 () from /lib64/libc.so.6
#6 0x00007f5a8cbf6cf3 in r600_texture_create_object (screen=0x980ca0, base=0x7fff208b4770, pitch_in_bytes_override=0, buf=0x0, surface=0x7fff208b3a80) at r600_texture.c:509
#7 0x00007f5a8cbf7339 in r600_texture_create (screen=0x980ca0, templ=0x7fff208b4770) at r600_texture.c:601
#8 0x00007f5a8cbdbae5 in r600_resource_create (screen=0x980ca0, templ=0x7fff208b4770) at r600_resource.c:37
#9 0x00007f5a8cc1b4d2 in dri2_drawable_process_buffers (drawable=0xc2b130, buffers=0x764460, buffer_count=1, atts=0x7fff208b48b0, att_count=1) at dri2.c:254
#10 0x00007f5a8cc1b9b3 in dri2_allocate_textures (drawable=0xc2b130, statts=0x7fff208b48b0, statts_count=1) at dri2.c:404
#11 0x00007f5a8cc19f0f in dri_st_framebuffer_validate (stfbi=0xc2b130, statts=0x7fff208b48b0, count=1, out=0x0) at dri_drawable.c:81
#12 0x00007f5a8cc1a30e in dri_drawable_validate_att (drawable=0xc2b130, statt=ST_ATTACHMENT_FRONT_LEFT) at dri_drawable.c:206
#13 0x00007f5a8cc1a35c in dri_set_tex_buffer2 (pDRICtx=0x8fef20, target=3553, format=8409, dPriv=0xbee5e0) at dri_drawable.c:220
#14 0x00007f5a9b82f590 in dri2_bind_tex_image (dpy=0x7d15f0, drawable=29360171, buffer=8414, attrib_list=0x0) at dri2_glx.c:1002
#15 0x00007f5a9b7f30f5 in __glXBindTexImageEXT (dpy=0x7d15f0, drawable=29360171, buffer=8414, attrib_list=0x0) at glxcmds.c:2370
#16 0x00007f5a9d1773c3 in KWin::GlxTexture::loadTexture(unsigned long const&, QSize const&, int) () from /usr/lib64/libkdeinit4_kwin.so
#17 0x00007f5a9d16f175 in KWin::SceneOpenGL::Window::bindTexture() () from /usr/lib64/libkdeinit4_kwin.so
#18 0x00007f5a9d175a8e in KWin::SceneOpenGL::Window::performPaint(int, QRegion, KWin::WindowPaintData) () from /usr/lib64/libkdeinit4_kwin.so
#19 0x00007f5a9d16e32f in KWin::SceneOpenGL2::performPaintWindow(KWin::EffectWindowImpl*, int, QRegion, KWin::WindowPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#20 0x00007f5a9d16e4cd in KWin::SceneOpenGL2::finalDrawWindow(KWin::EffectWindowImpl*, int, QRegion, KWin::WindowPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#21 0x00007f5a9d1822b5 in KWin::EffectsHandlerImpl::drawWindow(KWin::EffectWindow*, int, QRegion, KWin::WindowPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#22 0x00007f5a9d16178a in KWin::Scene::finalPaintWindow(KWin::EffectWindowImpl*, int, QRegion, KWin::WindowPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#23 0x00007f5a9d182557 in KWin::EffectsHandlerImpl::paintWindow(KWin::EffectWindow*, int, QRegion, KWin::WindowPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#24 0x00007f5a9d16434d in KWin::Scene::paintWindow(KWin::Scene::Window*, int, QRegion, KWin::WindowQuadList) () from /usr/lib64/libkdeinit4_kwin.so
#25 0x00007f5a9d16367f in KWin::Scene::paintSimpleScreen(int, QRegion) () from /usr/lib64/libkdeinit4_kwin.so
#26 0x00007f5a9d1616ce in KWin::Scene::finalPaintScreen(int, QRegion, KWin::ScreenPaintData&) () from /usr/lib64/libkdeinit4_kwin.so
#27 0x00007f5a9d182720 in KWin::EffectsHandlerImpl::paintScreen(int, QRegion, KWin::ScreenPaintD...


Eugene Shalygin (eugene-shalygin) wrote:

This happens only when kwin starts for the first time. There are no crashes on subsequent runs.

Eugene Shalygin (eugene-shalygin) wrote:

Same crash with the same backtrace using mesa 9.1 and kernel 3.8

Eugene Shalygin (eugene-shalygin) wrote:

With "glamor" as the acceleration method, these crashes are much rarer.

Knut-tidemann (knut-tidemann) wrote:

I can confirm a very similar crash:

#5 0x00007fb122be859b in __memset_sse2 () from /usr/lib/libc.so.6
#6 0x00007fb072bd0a5f in r600_texture_create_object () from /usr/lib/xorg/modules/dri/r600_dri.so
#7 0x00007fb072bd0d37 in r600_texture_create () from /usr/lib/xorg/modules/dri/r600_dri.so
#8 0x00007fb072be3145 in dri2_allocate_textures () from /usr/lib/xorg/modules/dri/r600_dri.so
#9 0x00007fb072be1c65 in dri_st_framebuffer_validate () from /usr/lib/xorg/modules/dri/r600_dri.so
#10 0x00007fb072be1e9e in dri_set_tex_buffer2 () from /usr/lib/xorg/modules/dri/r600_dri.so
#11 0x00007fb122fd94b3 in ?? () from /usr/lib/libkdeinit4_kwin.so

(I don't have debug symbols for kwin, and mesa was compiled in release mode).

Mesa 9.2-devel (git-4154ac0)
Kernel 3.8.0.
KDE 4.10.

I get the crash when first logging in, but I also get it every time I unlock the screen after locking it from the 'K-menu'.

The crash only occurs if compositing is on.

Knut-tidemann (knut-tidemann) wrote:

I forgot to add:

My card is a Radeon HD 5670

Eugene Shalygin (eugene-shalygin) wrote:

The same crash happens from time to time when maximizing application windows.

Eugene Shalygin (eugene-shalygin) wrote:

Interestingly, the crash happens only at screen resolutions of 1920x1080 and higher. At lower resolutions, kwin works fine.

Jürg Billeter (j-bitron) wrote:

I don't have a usable backtrace at hand, but I'm seeing very similar crashes in gnome-shell as well with Mesa 9.1 and Linux 3.8.3. With Mesa 9.0.3, I haven't noticed any crashes so far. It crashes much more frequently with two monitors (both 1920x1200) attached than with just one.

This is on an x86-64 system with Radeon HD 4770.

agd5f (agd5f) wrote:

(In reply to comment #8)
> I don't have a usable backtrace at hand, but I'm seeing very similar crashes
> in gnome-shell as well with Mesa 9.1 and Linux 3.8.3. With Mesa 9.0.3, I
> haven't noticed any crashes so far. It crashes much more frequently with two
> monitors (both 1920x1200) attached than with just one.

Can you bisect mesa?

Jürg Billeter (j-bitron) wrote:

(In reply to comment #9)
> (In reply to comment #8)
> > I don't have a usable backtrace at hand, but I'm seeing very similar crashes
> > in gnome-shell as well with Mesa 9.1 and Linux 3.8.3. With Mesa 9.0.3, I
> > haven't noticed any crashes so far. It crashes much more frequently with two
> > monitors (both 1920x1200) attached than with just one.
>
> Can you bisect mesa?

I don't have a reliable trigger for the crash yet. If I can find one, I'll give it a try.

Michael Mair-Keimberger (bu9zilla) wrote:

I don't know if it's the same, but I can confirm similar crashes.

First of all, I have a triple-head setup: two screens at 1920x1200 (left and right) and one at 2560x1600 (middle). The crashes have been happening since kernel 3.8 (I've used 3.8.[1-4]) and mesa-9.1, but I don't know which one introduced the problem, as I updated both at the same time.
Furthermore, there are no crashes at all if I disable desktop effects in KDE.

Basically, I only have to create a new user and log in as that user. KDE usually sets the resolution of all screens to 1920x1200 with desktop effects enabled. I can move the screens back and forth (the ordering is also wrong on first start) without problems, but as soon as I change the resolution of the middle screen, X crashes.
Without effects I don't get any crashes.

If you need more info, please let me know.

Below are the relevant parts of Xorg.0.log (though without debug symbols I think it's quite useless ;) ):

[ 1671.397] (EE)
[ 1671.397] (EE) Backtrace:
[ 1671.397] (EE) 0: /usr/bin/X (xorg_backtrace+0x36) [0x5860f6]
[ 1671.397] (EE) 1: /usr/bin/X (0x400000+0x189e09) [0x589e09]
[ 1671.397] (EE) 2: /lib64/libpthread.so.0 (0x7ff6d65e0000+0x10ff0) [0x7ff6d65f0ff0]
[ 1671.397] (EE) 3: /lib64/libc.so.6 (0x7ff6d5238000+0x8a6fb) [0x7ff6d52c26fb]
[ 1671.397] (EE) 4: /usr/lib64/dri/r600_dri.so (0x7ff6d2438000+0x5175c7) [0x7ff6d294f5c7]
[ 1671.397] (EE) 5: /usr/lib64/dri/r600_dri.so (0x7ff6d2438000+0x517a87) [0x7ff6d294fa87]
[ 1671.398] (EE) 6: /usr/lib64/dri/r600_dri.so (0x7ff6d2438000+0x52a0dd) [0x7ff6d29620dd]
[ 1671.398] (EE) 7: /usr/lib64/dri/r600_dri.so (0x7ff6d2438000+0x528c4d) [0x7ff6d2960c4d]
[ 1671.398] (EE) 8: /usr/lib64/dri/r600_dri.so (0x7ff6d2438000+0x528e86) [0x7ff6d2960e86]
[ 1671.398] (EE) 9: /usr/lib64/xorg/modules/extensions/libglx.so (0x7ff6d44d8000+0x4541a) [0x7ff6d451d41a]
[ 1671.398] (EE) 10: /usr/lib64/xorg/modules/extensions/libglx.so (0x7ff6d44d8000+0x39941) [0x7ff6d4511941]
[ 1671.398] (EE) 11: /usr/lib64/xorg/modules/extensions/libglx.so (0x7ff6d44d8000+0x3a183) [0x7ff6d4512183]
[ 1671.398] (EE) 12: /usr/lib64/xorg/modules/extensions/libglx.so (0x7ff6d44d8000+0x3c73e) [0x7ff6d451473e]
[ 1671.398] (EE) 13: /usr/bin/X (0x400000+0x3ad96) [0x43ad96]
[ 1671.398] (EE) 14: /usr/bin/X (0x400000+0x29e8d) [0x429e8d]
[ 1671.398] (EE) 15: /lib64/libc.so.6 (__libc_start_main+0xed) [0x7ff6d525c94d]
[ 1671.398] (EE) 16: /usr/bin/X (0x400000+0x2a1fd) [0x42a1fd]
[ 1671.398] (EE)
[ 1671.398] (EE) Bus error at address 0x7ff6ccbee000
[ 1671.398]
Fatal server error:
[ 1671.398] Caught signal 7 (Bus error). Server aborting
[ 1671.398]
[ 1671.398] (EE)
Please consult the The X.Org Foundation support
  at http://wiki.x.org
 for help.
[ 1671.398] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[ 1671.398] (EE)
[ 1671.398] (II) AIGLX: Suspending AIGLX clients for VT switch
[ 1671.460] Server terminated with error (1). Closing log file.

agd5f (agd5f) wrote:

I think these are actually several bugs at play here.

1. CPU only has a 256 MB window into vram. If the window is full due to fragmentation or other mapped buffers, we fail. Does disabling hyperz help? set env var R600_HYPERZ=0 (mesa 9.1 or older), or R600_DEBUG=nohyperz (mesa git).

2. running out of GPU accessible memory when migrating buffers. Does setting radeon.gartsize=1024 on the kernel command line in grub help?

Michel Dänzer (michel-daenzer) wrote:

Eugene and Knut, in the future please always include information about which signal was generated, at least if it's not SIGSEGV (which is the most common).

Michael Mair-Keimberger (bu9zilla) wrote:

(In reply to comment #12)
> I think these are actually several bugs at play here.
>
> 1. CPU only has a 256 MB window into vram. If the window is full due to
> fragmentation or other mapped buffers, we fail. Does disabling hyperz help?
> set env var R600_HYPERZ=0 (mesa 9.1 or older), or R600_DEBUG=nohyperz (mesa
> git).
>
> 2. running out of GPU accessible memory when migrating buffers. Does
> setting radeon.gartsize=1024 on the kernel command line in grub help?

Just tried both, but nothing changed...

Eugene Shalygin (eugene-shalygin) wrote:

(In reply to comment #14)
I also tried both suggestions. Both changes together allow me to run `kwin --replace` with kwin_gles active. However, when I connect a second screen, it crashes (signal 7) with the same backtrace. It also crashes when I toggle desktop effects.

Jürg Billeter (j-bitron) wrote:

No gnome-shell crashes since I've upgraded to Mesa 9.1.1. Will post an update in a few days whether it remains stable or I was just lucky today.

D. Hugh Redelmeier (hugh-mimosa) wrote:

I think that I am hitting this bug on Fedora 18 x86-64 with all current updates (as of a couple of days ago).

I'm hitting it in Gnome_shell (

D. Hugh Redelmeier (hugh-mimosa) wrote:

(Sorry for the previous uncompleted comment. I will continue.)

I'm hitting it in Gnome_shell (https://bugzilla.redhat.com/show_bug.cgi?id=924076).

I'm hitting it in Cinnamon (https://bugzilla.redhat.com/show_bug.cgi?id=926036)

I'm hitting it in KWin (https://bugs.kde.org/show_bug.cgi?id=315089)

In each case, I get a SIGBUS in __memset_sse2, called from r600 code. I'm not used to getting SIGBUSes since I migrated from the PDP-11 :-).

This makes me think that it is a radeon driver bug.

One KDE developer pointed in another direction. Since it is reported as an intel driver bug, I don't think that it is right, but it might be relevant: https://bugs.freedesktop.org/show_bug.cgi?id=58834

I think that I still have crash dumps for each of these. If something from them would be useful, ask soon before I recover the space. But I think that I can regenerate one all too quickly.

D. Hugh Redelmeier (hugh-mimosa) wrote:

More information about my Fedora 18 x86-64 configuration:

- a single monitor: 2560 x 1600 pixels

- video card: Radeon HD 3600 XT (according to Xorg.0.log)

- Fedora's kernel: kernel-3.8.4-202.fc18.x86_64

- Fedora's mesa:

mesa-debuginfo-9.1-3.fc18.x86_64
mesa-dri-drivers-9.1-3.fc18.x86_64
mesa-dri-filesystem-9.1-3.fc18.x86_64
mesa-libEGL-9.1-3.fc18.x86_64
mesa-libgbm-9.1-3.fc18.x86_64
mesa-libGL-9.1-3.fc18.x86_64
mesa-libglapi-9.1-3.fc18.x86_64
mesa-libGLU-9.0.0-1.fc18.x86_64
mesa-libGLU-debuginfo-9.0.0-1.fc18.x86_64
mesa-libxatracker-9.1-3.fc18.x86_64

I didn't have this problem when running Fedora 17. But I did have a bit of oddness: https://bugzilla.kernel.org/show_bug.cgi?id=49981#c5

Is anything else of interest?

Muteki-f (muteki-f) wrote:

(In reply to comment #16)
> No gnome-shell crashes since I've upgraded to Mesa 9.1.1. Will post an
> update in a few days whether it remains stable or I was just lucky today.

I observed the same gnome-shell crash with Mesa 9.1.1 and kernel 3.8.4-202. I reproduce it with dual monitors: the left one at 1680x1050 (rotated counterclockwise) and the right one at 1920x1200. With this setup, gnome-shell always crashes at initial login. If I don't rotate the left monitor, gnome-shell does not crash.

Korgens (korgens) wrote:

I can also confirm this bug.

Software:
Archlinux standard installation.
Linux 3.8.4-1-ARCH #1 SMP PREEMPT Wed Mar 20 22:10:25 CET 2013 x86_64 GNU/Linux
KDE: 4.10.1 (no widgets, other than the task panel)
Kwin: 4.10.1 (standard OpenGL effects installed)
Org X Server 1.14.0
Release Date: 2013-03-05
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.8.2-1-ARCH x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=xyz ro init=/usr/lib/systemd/systemd
Build Date: 09 March 2013 11:43:05AM
Current version of pixman: 0.28.2
mesa: 9.1.1-1 (standard Arch package)

I also tried a git version from AUR repository (same crashes):
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD RV770
OpenGL version string: 3.0 Mesa 9.2.0 (git-135bb3c)
OpenGL shading language version string: 1.30
Driver: R600G
GPU class: R700
OpenGL version: 3.0
GLSL version: 1.30
Mesa version: 9.2
X server version: 1.14
Linux kernel version: 3.8.4
Direct rendering: yes
Requires strict binding: no
GLSL shaders: yes
Texture NPOT support: yes
Virtual Machine: no
Application::crashHandler() called with signal 7; recent crashes: 1

Xorg.conf: empty

Hardware:
Intel i7 920
ATI RV790 [Radeon HD 4890] 1GB VRAM

Monitor 1: 1920x1200 (DVI) - image as background
Monitor 2: 1920x1080 (HDMI) - image as background
Two virtual desktops configured.

How to reproduce: just log in through KDM. By the time the standard KDE desktop is displayed, the first KWin crash (caused by the r600 crash) has already occurred and its error is being displayed. KWin is respawned and everything seems normal for a few minutes. Then it crashes again and again, until KWin and the desktop are dead for good.

It doesn't seem to matter which applications are in use or whether the second monitor is on or off. The crashes happen all the same.

I didn't try to downgrade either the kernel or xorg, but I'm sure those crashes started to happen after the latest kernel upgrade to 3.8.x.

Here is the backtrace, which seems to confirm the general suspicion of a texture-handling problem:

Thread 1 (Thread 0x7fb64bdb6780 (LWP 734)):
[KCrash Handler]
#5 0x00007fb64b59859b in __memset_sse2 () from /usr/lib/libc.so.6
#6 0x00007fb59a42b40a in r600_texture_create_object () from /usr/lib/xorg/modules/dri/r600_dri.so
#7 0x00007fb59a42b6e7 in r600_texture_create () from /usr/lib/xorg/modules/dri/r600_dri.so
#8 0x00007fb59a43fbb5 in dri2_allocate_textures () from /usr/lib/xorg/modules/dri/r600_dri.so
#9 0x00007fb59a43e695 in dri_st_framebuffer_validate () from /usr/lib/xorg/modules/dri/r600_dri.so
#10 0x00007fb59a43e8ce in dri_set_tex_buffer2 () from /usr/lib/xorg/modules/dri/r600_dri.so
#11 0x00007fb64b989343 in KWin::GlxTexture::loadTexture(unsigned long const&, QSize const&, int) () from /usr/lib/libkdeinit4_kwin.so
#12 0x00007fb64b9810f5 in KWin::SceneO...


Ogjerstad (ogjerstad) wrote:

I can also confirm this bug on Fedora 18 X86_64, radeon driver, HD5870 card.
I cannot log into the system with two external monitors attached, but with only one external monitor it works fine. This seems to have started to happen after upgrade to mesa-dri-drivers 9.1-1 or 9.1-3 (I did not restart between those updates). Here is my backtrace:
b7b62d4d01e98c8b31d95895cbede393b8b0c6e8 0x8954b __memset_sse2 /lib64/libc.so.6 45eccba94f5fef95c8933974ac014d83e50a9ebb
c9e82f1ca30be88ed6ef697aa581a1c940858291 0x349ff2 r600_texture_create_object /usr/lib64/dri/r600_dri.so 4f51004f888b1ba8194719f476c2fd4ff4aad900
c9e82f1ca30be88ed6ef697aa581a1c940858291 0x34a2c7 r600_texture_create /usr/lib64/dri/r600_dri.so 5baaed899c5ecf15a1568310eff18dc385cdf50d
c9e82f1ca30be88ed6ef697aa581a1c940858291 0x35ca25 dri2_allocate_textures /usr/lib64/dri/r600_dri.so 0a2fa38b05320f36fa5b90de563948aac101910e
c9e82f1ca30be88ed6ef697aa581a1c940858291 0x35b595 dri_st_framebuffer_validate /usr/lib64/dri/r600_dri.so 45eccba94f5fef95c8933974ac014d83e50a9ebb
c9e82f1ca30be88ed6ef697aa581a1c940858291 0x35b7ce dri_set_tex_buffer2 /usr/lib64/dri/r600_dri.so 55f8abe768545f45025c61cca676f0352439722d
67f4899c1db4d10dd1558ccbb7b8fad4e81e3379 0x68702 _cogl_winsys_texture_pixmap_x11_update /lib64/libcogl.so.11 ac79bd8478de58b10ea1b2bbc9d920bfbe437214
67f4899c1db4d10dd1558ccbb7b8fad4e81e3379 0x66104 _cogl_texture_pixmap_x11_update /lib64/libcogl.so.11 96f709cfd3fe851395649feeda7d4dd459e3ca60
67f4899c1db4d10dd1558ccbb7b8fad4e81e3379 0x666ad _cogl_texture_pixmap_x11_get_texture /lib64/libcogl.so.11 6179db327fe5f11fb226532cabb471745b820c8f
67f4899c1db4d10dd1558ccbb7b8fad4e81e3379 0x66709 _cogl_texture_pixmap_x11_get_type /lib64/libcogl.so.11 ad318b56f9bf3c97e02782cfada93e8f37221c59
67f4899c1db4d10dd1558ccbb7b8fad4e81e3379 0x437d9 cogl_pipeline_set_layer_texture /lib64/libcogl.so.11 ba1ae932f4a148b027fd73337e04d05fbae9862e
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x33eb2 set_texture_on_actor /lib64/libmutter.so.0 47169c9a5b07212ae0061e199a81ca87080a2e4c
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x34008 set_texture /lib64/libmutter.so.0 6647294c524f089a0c77389826b0a954c997c085
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x3491f meta_background_actor_update /lib64/libmutter.so.0 184cb323b2f3eaf40fc3b375371bdf1393e6882a
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x3352a meta_compositor_process_event /lib64/libmutter.so.0 5c8e84a4b7a1c62daf4be566f862cf325d9b1926
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x45367 event_callback /lib64/libmutter.so.0 11cf28c10df323150560c808d8bf75e7ed72e0c8
c043a38bda0d5dc29bb15a1e211273a4ec3dea5d 0x8f476 filter_func /lib64/libmutter.so.0 7bfd96ce0e83046b4b4da64b6d99cf1d3ce823d6
1c54a46d2f8de1beadccfa39550b5eef67ab6bc0 0x4a901 gdk_event_apply_filters /lib64/libgdk-3.so.0 a28856cf0bdff41c2ad87be20d4186aeea868a26
1c54a46d2f8de1beadccfa39550b5eef67ab6bc0 0x4aad7 _gdk_x11_display_queue_events /lib64/libgdk-3.so.0 50da1079f5dfea096a6ddfd4caca8af2b76726e4
1c54a46d2f8de1beadccfa39550b5eef67ab6bc0 0x20511 gdk_display_get_event /lib64/libgdk-3.so.0 457cc16f8850789e7b3830330439fcf89b9a418b
1c54a46d2f8de1beadccfa39550b5eef67ab6bc0 0x4a812 gdk_event_source_dispatch /lib64/libgdk-3.so.0 d2...


Ogjerstad (ogjerstad) wrote:

I can confirm now that downgrading mesa-dri-drivers to 9.0.1-1 (as packaged by Fedora) fixes the problem.

agd5f (agd5f) wrote:

(In reply to comment #23)
> I can confirm now that downgrading mesa-dri-drivers to 9.0.1-1 (as packaged
> by Fedora) fixes the problem.

Can you bisect mesa?

Ogjerstad (ogjerstad) wrote:

Not without some instructions, I have never done that before.

And, I must admit, I also downgraded llvm-libs, mesa-dri-filesystem and mesa-libxatracker. I think I had to because of dependencies.

I guess I will have to build from source in that case?

agd5f (agd5f) wrote:

*** Bug 61822 has been marked as a duplicate of this bug. ***

Knut-tidemann (knut-tidemann) wrote:

I've bisected the issue to this commit:

35840ab189595b817fa8b1a1df8cc92474a7c38d st/dri: implement MSAA for GLX/DRI2 framebuffers

The only thing that was recompiled was mesa, with the following options:

    --with-dri-driverdir=/usr/lib/xorg/modules/dri
    --with-gallium-drivers=r600,swrast
    --with-dri-drivers=
    --with-egl-platforms=drm,x11
    --enable-gallium-egl --enable-shared-glapi
    --enable-gallium-llvm
    --enable-glx-tls
    --enable-gles1
    --enable-gles2
    --enable-egl
    --enable-dri
    --enable-glx
    --enable-xa
    --enable-osmesa
    --enable-gbm
    --enable-texture-float

The issue was triggered by restarting kwin (kwin --replace) between each compile and locking and unlocking the screen.

Xeno-l (xeno-l) wrote:

A similar thing happens here. KWin with desktop effects enabled crashes on login to KDE and after resume from suspend, with 100% reproducibility.

My configuration (Fedora 18):
 - kernel-PAE-3.8.5-201.fc18.i686
 - libdrm-2.4.42-1.fc18.i686
 - xorg-x11-server-Xorg-1.13.3-2.fc18.i686
 - mesa-dri-drivers-9.1-3.fc18.i686
 - kde 4.10.2-1

Backtrace:
Application: KWin (kwin), signal: Bus error
Using host libthread_db library "/lib/libthread_db.so.1".
[Current thread is 1 (Thread 0xb769c780 (LWP 18403))]

Thread 2 (Thread 0xb1644b40 (LWP 18417)):
#0 0xb7784424 in __kernel_vsyscall ()
#1 0x4145e18c in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_cond_wait.S:171
#2 0xb2d5c20c in pipe_semaphore_wait (sema=0xb4190b50) at ../../../../../src/gallium/auxiliary/os/os_thread.h:433
#3 radeon_drm_cs_emit_ioctl (param=0xb4170008) at radeon_drm_cs.c:416
#4 0x4145aaff in start_thread (arg=0xb1644b40) at pthread_create.c:308
#5 0x4138a0be in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:132

Thread 1 (Thread 0xb769c780 (LWP 18403)):
[KCrash Handler]
#7 __memset_sse2 () at ../sysdeps/i386/i686/multiarch/memset-sse2.S:331
#8 0xb2d2f186 in r600_texture_create_object (screen=screen@entry=0x9342718, base=base@entry=0xbfc4b1e4, pitch_in_bytes_override=pitch_in_bytes_override@entry=0, buf=buf@entry=0x0, alloc_bo=alloc_bo@entry=1 '\001', surface=surface@entry=0xbfc4a4d8) at r600_texture.c:460
#9 0xb2d2f49c in r600_texture_create (screen=screen@entry=0x9342718, templ=templ@entry=0xbfc4b1e4) at r600_texture.c:550
#10 0xb2d28564 in r600_resource_create (screen=0x9342718, templ=0xbfc4b1e4) at r600_resource.c:37
#11 0xb2d5650f in dri2_drawable_process_buffers (att_count=1, atts=0xbfc4b2a4, buffer_count=1, buffers=0x94cccbc, drawable=<optimized out>) at dri2.c:254
#12 dri2_allocate_textures (drawable=0x94dad88, statts=0xbfc4b2a4, statts_count=1) at dri2.c:404
#13 0xb2d5788c in dri_st_framebuffer_validate (stfbi=0x94dad88, statts=0xbfc4b2a4, count=1, out=0x0) at dri_drawable.c:81
#14 0xb2d57ab7 in dri_drawable_validate_att (statt=ST_ATTACHMENT_FRONT_LEFT, drawable=0x94dad88) at dri_drawable.c:206
#15 dri_set_tex_buffer2 (pDRICtx=0x9310270, target=3553, format=8409, dPriv=0x948c7a0) at dri_drawable.c:220
#16 0xb771b770 in dri2_bind_tex_image (dpy=0x91897e8, drawable=29360181, buffer=8414, attrib_list=0x0) at dri2_glx.c:1004
#17 0xb76efa93 in __glXBindTexImageEXT (dpy=0x91897e8, drawable=drawable@entry=29360181, buffer=buffer@entry=8414, attrib_list=attrib_list@entry=0x0) at glxcmds.c:2370
#18 0x447d5a8c in loadTexture (depth=24, size=..., pix=@0xbfc4b40c: 29360179, this=0x9494f08) at /usr/src/debug/kde-workspace-4.10.2/kwin/glxbackend.cpp:716
#19 KWin::GlxTexture::loadTexture (this=0x9494f08, pix=@0xbfc4b40c: 29360179, size=..., depth=24) at /usr/src/debug/kde-workspace-4.10.2/kwin/glxbackend.cpp:658
#20 0x447ca65a in KWin::SceneOpenGL::Texture::load (this=0x92df3a8, pix=@0xbfc4b40c: 29360179, size=..., depth=24, region=...) at /usr/src/debug/kde-workspace-4.10.2/kwin/scene_opengl.cpp:764
#21 0x447ccaf7 in KWin::SceneOpenGL::Window::bindTexture (this=0x94ccbd0) at /usr/src/debug/kde-workspace-4.10.2/kwin/scene_opengl.cpp:822
#22 0x4...


Marek Olšák (maraeo) wrote:

r600g crashes because it's mapping a MSAA resource in order to clear the CMASK to zeros. The problem is MSAA resources occupy a lot of memory and the system is failing to map the whole resource.

The solution is easy: we should clear the CMASK and HTILE buffers using DMA or streamout and not with the CPU.
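
For a sense of scale, here is a rough sketch in Python of how large a single multisampled color buffer gets at the resolutions reported in this bug. It assumes 4 bytes per pixel and ignores tiling, alignment padding, and the CMASK/FMASK/HTILE metadata. The sample counts are illustrative, since the thread does not say which MSAA visual was picked, and the name `msaa_color_mib` is made up for this example.

```python
# Rough size estimate for a multisampled color buffer.
# Assumptions (not from the thread): 4 bytes per pixel, no tiling or
# alignment padding, and no CMASK/FMASK/HTILE metadata.

MIB = 1024 * 1024
CPU_VISIBLE_WINDOW_MIB = 256  # figure quoted in the thread


def msaa_color_mib(width, height, samples, bytes_per_pixel=4):
    """Return the approximate size in MiB of one MSAA color buffer."""
    return width * height * bytes_per_pixel * samples / MIB


for width, height in [(1920, 1080), (1920, 1200), (2560, 1600)]:
    for samples in (1, 4, 8):
        size = msaa_color_mib(width, height, samples)
        share = 100 * size / CPU_VISIBLE_WINDOW_MIB
        print(f"{width}x{height} @ {samples}x: {size:7.1f} MiB "
              f"({share:4.1f}% of the window)")
```

Under these assumptions a 2560x1600 buffer at 8 samples comes to 125 MiB, roughly half of the 256 MB window on its own, which is consistent with the reports that the crash shows up only at high resolutions or with several monitors.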

Glisse (glisse) wrote:

Well, this is also a kwin bug: kwin should not pick an MSAA visual. I fixed cogl so that it does not pick an MSAA visual for gnome-shell.

Maxim Egorushkin (max0x7ba) wrote:

I observe the same issue on Fedora 18 with Radeon HD 5670:

[KCrash Handler]
#6 __memset_sse2 () at ../sysdeps/x86_64/memset.S:873
#7 0x00007f59a39eaff2 in memset (__len=<optimized out>, __ch=204, __dest=<optimized out>) at /usr/include/bits/string3.h:84
#8 r600_texture_create_object (screen=screen@entry=0x1f317a0, base=base@entry=0x7fff57798810, pitch_in_bytes_override=pitch_in_bytes_override@entry=0, buf=buf@entry=0x0, surface=surface@entry=0x7fff57797b10) at r600_texture.c:509
#9 0x00007f59a39eb2c7 in r600_texture_create (screen=0x1f317a0, templ=0x7fff57798810) at r600_texture.c:601
#10 0x00007f59a39fda25 in dri2_drawable_process_buffers (att_count=1, atts=0x7fff577988f0, buffer_count=1, buffers=0x1e9fe80, drawable=0x1e9ff20) at dri2.c:254
#11 dri2_allocate_textures (drawable=0x1e9ff20, statts=0x7fff577988f0, statts_count=1) at dri2.c:404
#12 0x00007f59a39fc595 in dri_st_framebuffer_validate (stfbi=<optimized out>, statts=0x7fff577988f0, count=1, out=0x0) at dri_drawable.c:81
#13 0x00007f59a39fc7ce in dri_drawable_validate_att (statt=ST_ATTACHMENT_FRONT_LEFT, drawable=0x1e9ff20) at dri_drawable.c:206
#14 dri_set_tex_buffer2 (pDRICtx=<optimized out>, target=3553, format=8409, dPriv=<optimized out>) at dri_drawable.c:220
#15 0x000000304bacb3c3 in loadTexture (depth=24, size=..., pix=<optimized out>, this=0x1ea0b70) at /usr/src/debug/kde-workspace-4.10.1/kwin/glxbackend.cpp:716
#16 KWin::GlxTexture::loadTexture (this=0x1ea0b70, pix=<optimized out>, size=..., depth=24) at /usr/src/debug/kde-workspace-4.10.1/kwin/glxbackend.cpp:658
#17 0x000000304bac30d5 in KWin::SceneOpenGL::Window::bindTexture (this=0x2184e10) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene_opengl.cpp:822
#18 0x000000304bac9a8e in KWin::SceneOpenGL::Window::performPaint (this=this@entry=0x2184e10, mask=mask@entry=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene_opengl.cpp:931
#19 0x000000304bac249f in KWin::SceneOpenGL2::performPaintWindow (this=this@entry=0x1fc3af0, w=w@entry=0x1ef75c0, mask=mask@entry=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene_opengl.cpp:566
#20 0x000000304bac263d in KWin::SceneOpenGL2::finalDrawWindow (this=0x1fc3af0, w=w@entry=0x1ef75c0, mask=mask@entry=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene_opengl.cpp:551
#21 0x000000304bad63a5 in KWin::EffectsHandlerImpl::drawWindow (this=0x2170db0, w=w@entry=0x1ef75c0, mask=mask@entry=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/effects.cpp:318
#22 0x000000304bab585a in KWin::Scene::finalPaintWindow (this=<optimized out>, w=0x1ef75c0, mask=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene.cpp:449
#23 0x000000304bad6627 in KWin::EffectsHandlerImpl::paintWindow (this=0x2170db0, w=0x1ef75c0, mask=mask@entry=1, region=..., data=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/effects.cpp:281
#24 0x000000304bab841d in KWin::Scene::paintWindow (this=<optimized out>, w=0x2184e10, mask=1, region=..., quads=...) at /usr/src/debug/kde-workspace-4.10.1/kwin/scene.cpp:356
#25 0x000000304bab774f in KWin::Scene::paintSimpleScreen (this=this@entry=0x1fc3af0, orig_mask=orig_mask@entry=0, ...


D. Hugh Redelmeier (hugh-mimosa) wrote:

Thanks, Knut, for bisecting in #27. Thanks, Stan, for confirming bisection in #28.

So the bad changeset is http://cgit.freedesktop.org/mesa/mesa/commit/?id=35840ab189595b817fa8b1a1df8cc92474a7c38d

I read that code (out of context: I'm not familiar with Xorg code). It kind of looked as if things with obvious allocation potential were followed by asserts to check that the allocation worked. So why are we observing SIGBUS rather than assertion errors? If allocation failure is possible, even assertion failure seems harsh (but at least more diagnostic).

agd5f (agd5f) wrote:

(In reply to comment #32)
> Thanks, Knut, for bisecting in #27. Thanks, Stan, for confirming bisection
> in #28.
>
> So the bad changeset is
> http://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=35840ab189595b817fa8b1a1df8cc92474a7c38d
>
> I read that code (out of context: I'm not familiar with Xorg code). It kind
> of looked as if things with obvious allocation potential were followed by
> asserts to check that the allocation worked. So why are we observing SIGBUS
> rather than assertion errors? If allocation failure is possible, even
> assertion failure seems harsh (but at least more diagnostic).

As per comment 29, the MSAA surface is too big to be mapped by the CPU (the CPU's window into VRAM is only 256 MB). The allocation is successful, but the CPU is not able to map the buffer due to the limited window. You get a sigbus because the mapping fails and the CPU tries to access an address beyond the PCI aperture where vram is mapped. The solution is to either disable MSAA or as per comment 29, use the GPU to initialize the CMASK/HTILE buffers rather than using the CPU.
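
The pattern of an allocation that succeeds followed by an access that faults can be reproduced outside the driver. The following Python sketch is only an analogy, not the radeon code path: a child process maps one page of a temporary file, truncates the file, and then reads through the mapping. On Linux the read is expected to kill the child with SIGBUS, the same signal 7 reported in the logs in this bug. The names `CHILD` and `run_child` are made up for this example.

```python
import signal
import subprocess
import sys
import textwrap

# Child program: map one page of a temporary file, shrink the file to zero
# bytes, then touch the mapping. The mmap() call itself succeeded, but the
# page no longer has backing storage, so the access faults.
CHILD = textwrap.dedent("""
    import mmap, os, tempfile
    with tempfile.TemporaryFile() as f:
        f.write(b"\\0" * mmap.PAGESIZE)
        f.flush()
        m = mmap.mmap(f.fileno(), mmap.PAGESIZE)
        os.ftruncate(f.fileno(), 0)
        m[0]  # expected to raise SIGBUS on Linux
""")


def run_child():
    """Run the faulting program in a subprocess; return its exit status."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode


status = run_child()
if status < 0:
    print("child killed by", signal.Signals(-status).name)
else:
    print("child exited with status", status)
```

The point of the analogy is that nothing returns an error code at the time of the mapping. The failure only surfaces when the CPU touches the address, which is why the backtraces end in memset rather than in an allocation check.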

D. Hugh Redelmeier (hugh-mimosa) wrote:

Thanks, Alex, for the clear restatement.

Naively, I think of two C/UNIX conventions. For allocations that can either succeed or fail, typically the result is a pointer which is NULL for failure -- eg. malloc(3)). For allocations that can partially succeed, the result is the amount successfully processed (think write(2) which returns the length actually transferred).

It seems to me that for allocating address space in the VRAM window (aperture?), success can be partial, and anything that deals with that window needs to be aware that accessing some object may require piecewise operations, punctuated by adjustment of the mapping.

In other words, this case doesn't sound pathological; it should be handled as a normal case.
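
To make "piecewise operations, punctuated by adjustment of the mapping" concrete, here is a rough sketch that simulates the idea in ordinary memory. Everything in it (struct aperture, aperture_map, windowed_fill) is hypothetical; it is not how the radeon driver or Mesa manage the aperture, only an illustration of filling a region larger than the window one window at a time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a CPU-visible aperture: at most `window_size`
 * bytes of `vram` are considered reachable, starting at `window_base`. */
struct aperture {
    uint8_t *vram;        /* ordinary memory simulating all of VRAM */
    size_t   vram_size;   /* total bytes of simulated VRAM */
    size_t   window_size; /* bytes the CPU can see at once */
    size_t   window_base; /* VRAM offset currently mapped at window start */
};

/* Hypothetical remap: slide the window so that it starts at `offset`. */
static uint8_t *aperture_map(struct aperture *ap, size_t offset)
{
    assert(offset < ap->vram_size);
    ap->window_base = offset;
    return ap->vram + offset;
}

/* Fill [offset, offset + size) with `value`, never touching more than one
 * window's worth of bytes per mapping. Returns the number of remaps. */
static size_t windowed_fill(struct aperture *ap, size_t offset, size_t size,
                            uint8_t value)
{
    size_t remaps = 0;

    assert(ap->window_size > 0);
    assert(offset <= ap->vram_size && size <= ap->vram_size - offset);
    while (size > 0) {
        size_t chunk = size < ap->window_size ? size : ap->window_size;
        uint8_t *cpu = aperture_map(ap, offset);

        memset(cpu, value, chunk);
        offset += chunk;
        size -= chunk;
        remaps++;
    }
    return remaps;
}
```

With a 256-byte window, filling 800 bytes takes four mappings (256 + 256 + 256 + 32 bytes).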

I'm not saying that comment #29 is wrong. I'm saying that the existing code ought to have been written to handle this case. Clearly one solution is to replace the code as suggested. But fixing the code ought to be feasible too. Are there other lurking bugs where code assumes addressability?

I reiterate: I'm not knowledgeable about modern video architectures or about X server architectures.

Footnote: Why do I think that VRAM windows cannot always map the whole VRAM? Because video cards now routinely have gigabytes of VRAM and 32-bit address spaces for x86 machines cannot spare enough to allocate for that much VRAM (ignoring PAE).

Revision history for this message
In , D. Hugh Redelmeier (hugh-mimosa) wrote :

I finally am trying to set the GART size, as per Alex's comment #12.

Without setting it, Xorg.0.log reports:
[ 15.054] (II) RADEON(0): mem size init: gart size :1fdef000 vram size: s:20000000 visible:1f020000

The default GART size is 512MiB. You'd think that would be a good match for the 512MiB video RAM on my card. This output shows that the GART size is reduced by a somewhat odd number (0x211000 = 23^2 * 2^12).

When I set radeon.gartsize=1024 on the kernel line, Xorg.0.log reports:
[ 14.842] (II) RADEON(0): mem size init: gart size :3fdef000 vram size: s:20000000 visible:1f020000

Again, the GART size is reduced by the same amount.
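
The arithmetic can be checked with a few lines of C. This is only a sketch of the subtraction on the two logged values; the function name gart_shortfall is made up for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Difference between the requested GART size (in MiB) and the size that
 * Xorg.0.log reports (in bytes). gart_shortfall is a hypothetical name. */
static uint32_t gart_shortfall(uint32_t requested_mib, uint32_t reported_bytes)
{
    uint64_t requested = (uint64_t)requested_mib << 20; /* MiB to bytes */

    assert(requested >= reported_bytes);
    return (uint32_t)(requested - reported_bytes);
}
```

For both log lines, gart_shortfall(512, 0x1fdef000) and gart_shortfall(1024, 0x3fdef000), the result is 0x211000 bytes, that is 529 pages of 4 KiB.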

Wait: according to http://dri.freedesktop.org/wiki/GART/ the GART seems to enable the video card to access the computer's memory, not the other way around. So I don't see how this is relevant: we're getting a crash in a CPU instruction that is trying to access the video card's memory.

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #34)
> I'm not saying that comment #29 is wrong. I'm saying that the existing code
> ought to have been written to handle this case. Clearly one solution is to
> replace the code as suggested. But fixing the code ought to be feasible
> too. Are there other lurking bugs where code assumes addressability?
>

It's a bug that needs to be fixed.

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #35)
>
> Wait: according to http://dri.freedesktop.org/wiki/GART/ the GART seems to
> enable the video card to access the computer's memory, not the other way
> around. So I don't see how this is relevant: we're getting a crash in a CPU
> instruction that is trying to access the video card's memory.

GART and VRAM are separate. As I said in comment 12, I thought there may have been two issues at play at first, but upon further debugging, it appears to just be the VRAM window.

Revision history for this message
In , Niels Ole Salscheider (niels-ole) wrote :

Created attachment 78307
dma_fill function

I considered implementing Marek's proposal by adding a dma_fill function that works analogously to dma_copy.
I need the context to call this function, though, and I do not know how to get it in r600_texture_create_object (or whether I should be able to get it at all).

Revision history for this message
In , Marek Olšák (maraeo) wrote :

BTW today I have implemented buffer clearing using streamout. I'll send my patches to mesa-dev later today. We can use DMA later for selected chipsets. (I assume there'll be some quirks with DMA)

Revision history for this message
In , D. Hugh Redelmeier (hugh-mimosa) wrote :

Thanks, Niels, for the patch in comment #38.

Warning: the following comments are made with no understanding of X code.

"writting" should be "writing"

What is 0x000fffff? I would find the code clearer if this constant were given a name (RADEON_MAX_DMA?).

Why code
  nfill = (size / 0x000fffff) + !!(size % 0x000fffff);
instead of the generally more efficient
  nfill = (size + 0x000fffff - 1) / 0x000fffff;
My guess: to avoid overflow. But if that is the case, then nfill should be uint64_t (like size), not just unsigned (which might be only 32 bits). Any case where the second statement might have an overflow problem would also be a problem with nfill being only unsigned 32.

If you know size is not 0, this is even better:
  nfill = ((size - 1) / 0x000fffff) + 1;
If size can be 0, it is probably worth an early-out test to avoid bothering the GPU anyway:
  if (size == 0) return 0;
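
For comparison, here is a small sketch of the three forms side by side, using a 64-bit count throughout. The names MAX_FILL_CHUNK, chunks_divmod, chunks_addfirst and chunks_subfirst are hypothetical and do not come from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the per-operation limit used in the patch. */
#define MAX_FILL_CHUNK UINT64_C(0x000fffff)

/* Quotient plus one if there is a remainder: cannot overflow. */
static uint64_t chunks_divmod(uint64_t size)
{
    return size / MAX_FILL_CHUNK + (size % MAX_FILL_CHUNK != 0);
}

/* Add-then-divide: wraps around when size is within MAX_FILL_CHUNK - 1
 * of UINT64_MAX, giving a wrong (too small) count. */
static uint64_t chunks_addfirst(uint64_t size)
{
    return (size + MAX_FILL_CHUNK - 1) / MAX_FILL_CHUNK;
}

/* Subtract-then-divide: cannot overflow, but requires size != 0. */
static uint64_t chunks_subfirst(uint64_t size)
{
    assert(size != 0);
    return (size - 1) / MAX_FILL_CHUNK + 1;
}
```

All three agree for ordinary sizes (for example, 0x100000 bytes need 2 chunks); only the add-first form misbehaves near the top of the 64-bit range.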

I find code is easier to read if the scope of a variable is minimized. So I'd make the assignment to fsize also its declaration.

Have you created a patch to get these functions called in place of the currently broken code?

Revision history for this message
In , Niels Ole Salscheider (niels-ole) wrote :

As I said before, the dma_fill function is based on dma_copy and I tried to keep the diff small. But you are right, this could be improved.

> Have you created a patch to get these functions called in place of the
> currently broken code?

I did but I was missing the context that is needed to call the dma_fill function.
You can try Marek's patches on the mailing list, they work well for me.

You can try to replace the call to util_blitter_clear_buffer with the dma_fill version if you are interested in it. But I don't think that you will see any noticeable difference (given the assumption that the dma version works).

Revision history for this message
In , Glisse (glisse) wrote :

As I said in comment 30, it's also a bug in kwin: kwin should not use an MSAA visual for everything ...

Revision history for this message
In , K6j-fdedria-zp0 (k6j-fdedria-zp0) wrote :

(In reply to comment #42)
> As i said in comment 30 it's also a bug of kwin, kwin should not use msaa
> visual for everything ...

I've pushed a patch to the stable branch in KDE that should fix this issue.

Revision history for this message
In , D. Hugh Redelmeier (hugh-mimosa) wrote :

Jerome Glisse in comment #30:
"Well this is also a kwin bug, kwin should not pick MSAA visual. I fixed cogl so that it does not pick msaa visual for gnome-shell."

Thanks!

I am (even today) experiencing gnome-shell crashes like this on an up-to-date Fedora 18 (cogl-1.12.2-1.fc18.x86_64). See my comment #18. Also in Cinnamon (untested since that report).

What version of cogl should I look for? I can add something about this to the Fedora bug report.

Revision history for this message
In , Emil-l-velikov (emil-l-velikov) wrote :

(In reply to comment #44)
> What version of cogl should I look for? I can add something about this to
> the Fedora bug report.

commit: 93b7b4c850dd928bf21ee168a95641a8d631f713
Author: Jerome Glisse <email address hidden>

    glx do not use multisample visual config for front or pixmap

    There is no guaranty that glXGetFBConfigs will return fbconfig ordered
    with non msaa config first. This patch make sure that non msaa config
    get choose.

Present only in the master branch. Mind you, I'm not a dev, but I've sent a request[1] to pull it into the 1.12 and 1.14 branches. Hope they will pick it up soon.

Emil

[1] http://lists.freedesktop.org/archives/cogl/2013-April/001090.html

Revision history for this message
In , Bugs-xorg (bugs-xorg) wrote :

(In reply to comment #43)
> (In reply to comment #42)
> > As i said in comment 30 it's also a bug of kwin, kwin should not use msaa
> > visual for everything ...
>
> I've pushed a patch to the stable branch in KDE that should fix this issue.

Could you post a link to that patch?
Thank you.

Revision history for this message
In , K6j-fdedria-zp0 (k6j-fdedria-zp0) wrote :

(In reply to comment #46)
> (In reply to comment #43)
> > I've pushed a patch to the stable branch in KDE that should fix this issue.
>
> Could you post a link to that patch?

http://quickgit.kde.org/?p=kde-workspace.git&a=commit&h=a021eacf

Revision history for this message
In , Marco Trevisan (Treviño) (3v1n0) wrote :

Noticed the same issue in unity (compiz, actually) as soon as the opengl plugin is loaded.

It seems to happen mostly in multi-monitor: http://paste.ubuntu.com/5661055/

Revision history for this message
In , agd5f (agd5f) wrote :

Should be fixed in mesa master and 9.1 branch with the following commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b69207642079fe8ba33c594750415e8d9c66a06f

Revision history for this message
In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #48)
> Noticed the same issue in unity (compiz, actually) as soon as the opengl
> plugin is loaded.
>
> It seems to happen mostly in multi-monitor: http://paste.ubuntu.com/5661055/

FYI, Compiz fix is in https://code.launchpad.net/~3v1n0/compiz/msaa-configs-ignore/+merge/163550 and tracked by bug https://bugs.launchpad.net/compiz/+bug/1174495

Changed in mesa (Ubuntu):
status: New → Confirmed
Changed in mesa:
importance: Unknown → Medium
status: Unknown → Fix Released
Robert Hooker (sarvatt)
summary: - r600_dri.so crashes in r600_texture_create_object (memset)
+ (needs 9.1.3) r600_dri.so crashes in r600_texture_create_object (memset)
Changed in mesa (Ubuntu):
status: Confirmed → Triaged
Revision history for this message
In , Nikoli (nikoli) wrote :

This bug is not fixed, still happens with kwin-4.10.4 and mesa-9.1.4
https://bugs.kde.org/show_bug.cgi?id=322773
https://bugs.gentoo.org/show_bug.cgi?id=476606

Revision history for this message
In , agd5f (agd5f) wrote :

(In reply to comment #51)
> This bug is not fixed, still happens with kwin-4.10.4 and mesa-9.1.4
> https://bugs.kde.org/show_bug.cgi?id=322773
> https://bugs.gentoo.org/show_bug.cgi?id=476606

I think you are hitting a different issue. Please open a new bug.

Revision history for this message
In , Nikoli (nikoli) wrote :
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

fixed some time ago

Changed in mesa (Ubuntu):
status: Triaged → Fix Released