compiz exits with radeon error

Bug #550850 reported by Martin Garton
This bug affects 10 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)
Nominated for Lucid by John Gilmore
Nominated for Maverick by John Gilmore

Bug Description

Binary package hint: xorg

compiz runs fine initially but eventually reports:

drmRadeonCmdBuffer: -12. Kernel failed to parse or rejected command stream. See dmesg for more info.

and then exits.

dmesg reveals:

[ 1217.412221] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation !

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xorg 1:7.5+3ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-17.26-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-17-generic i686
Architecture: i386
Date: Mon Mar 29 12:57:10 2010
DkmsStatus: Error: [Errno 2] No such file or directory
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta i386 (20100318)
Lsusb:
 Bus 001 Device 002: ID 050d:705c Belkin Components 802.11bg
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: IBM 2684HVG
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-17-generic root=UUID=820ba7e9-9b9d-43bd-9a55-6184a222354c ro quiet splash
ProcEnviron:
 LANG=en_GB.utf8
 SHELL=/bin/bash
SourcePackage: xorg
Symptom: display
dmi.bios.date: 03/22/2005
dmi.bios.vendor: IBM
dmi.bios.version: 1SET68WW (1.36 )
dmi.board.name: 2684HVG
dmi.board.vendor: IBM
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: IBM
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnIBM:bvr1SET68WW(1.36):bd03/22/2005:svnIBM:pn2684HVG:pvrNotAvailable:rvnIBM:rn2684HVG:rvrNotAvailable:cvnIBM:ct10:cvrNotAvailable:
dmi.product.name: 2684HVG
dmi.product.version: Not Available
dmi.sys.vendor: IBM
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 codename: lucid
 architecture: i686
 kernel: 2.6.32-17-generic

Martin Garton (garton) wrote :
Bryce Harrington (bryce)
affects: xorg (Ubuntu) → xserver-xorg-video-ati (Ubuntu)
Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati (Ubuntu):
status: New → Confirmed
Bryce Harrington (bryce) wrote :

Hrm, none of the attachments reveal a backtrace or error message other than the relocation error you mentioned. However, I suspect the "Failed to parse relocation" message is more of a warning than a distinct error; it's something we've seen before on other bugs like #513011, and I think it just means that hardware rendering didn't work or some such. It's not enough information for us to troubleshoot this bug. Try to gather some more data, such as by running compiz from the terminal and collecting all of its output. Maybe dmesg or syslog could provide more info. I think there are also some debugging flags that can be set in the kernel to get more info.

Anyway, the one thing that is clear is that something went wrong in the kernel drm code, so at least we can refile it to the kernel.

affects: xserver-xorg-video-ati (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Martin Garton (garton) wrote :

Running compiz from the terminal is what led to the message:

drmRadeonCmdBuffer: -12. Kernel failed to parse or rejected command stream. See dmesg for more info.

.. that I previously mentioned. I have since tried again with "compiz --debug" but no extra information is revealed.

I am trying to find out how I might supply debugging flags to the kernel, and will provide another update once I figure that out.

Thanks for your help so far, Bryce.

Martin Garton (garton) wrote :

Reproducing this with drm.debug=1 produced more output. Here is what I hope is the relevant part from dmesg:

Mar 31 22:38:46 martin-laptop kernel: [ 851.833384] [drm:drm_ioctl], pid=829, cmd=0xc0086464, nr=0x64, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.836186] [drm:drm_ioctl], pid=1619, cmd=0xc0086464, nr=0x64, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.836219] [drm:drm_ioctl], pid=1619, cmd=0xc0086464, nr=0x64, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.838215] [drm:drm_ioctl], pid=1619, cmd=0xc0086464, nr=0x64, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841233] [drm:drm_ioctl], pid=829, cmd=0xc01c64a3, nr=0xa3, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841807] [drm:drm_ioctl], pid=1619, cmd=0xc010640b, nr=0x0b, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841837] [drm:drm_ioctl], pid=1619, cmd=0xc00c6469, nr=0x69, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841845] [drm:radeon_gem_get_tiling_ioctl],
Mar 31 22:38:46 martin-laptop kernel: [ 851.841859] [drm:drm_ioctl], pid=1619, cmd=0xc010640b, nr=0x0b, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841867] [drm:drm_ioctl], pid=1619, cmd=0xc00c6469, nr=0x69, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841873] [drm:radeon_gem_get_tiling_ioctl],
Mar 31 22:38:46 martin-laptop kernel: [ 851.841882] [drm:drm_ioctl], pid=1619, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.841893] [drm:drm_ioctl], pid=1619, cmd=0x40086409, nr=0x09, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.842360] [drm:drm_ioctl], pid=1619, cmd=0xc008646a, nr=0x6a, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.842394] [drm:drm_ioctl], pid=1619, cmd=0xc0206466, nr=0x66, dev 0xe200, auth=1
Mar 31 22:38:46 martin-laptop kernel: [ 851.842442] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation !

Does that help? Is there anything else I can do to provide more information?
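
The cmd= values in the trace above are Linux ioctl request numbers, which pack a direction, an argument size, a type character, and a command number into 32 bits. Here is a minimal sketch of decoding them, assuming the common asm-generic layout used on x86 (2 direction bits, 14 size bits, 8 type bits, 8 number bits). The helper name decode_ioctl is made up for illustration.

```python
from collections import namedtuple

Ioctl = namedtuple("Ioctl", "direction size type nr")

# Assumption: asm-generic _IOC layout (as on x86):
#   dir[31:30]  size[29:16]  type[15:8]  nr[7:0]
# "write" means userspace passes data to the kernel, "read" the reverse.
_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read|write"}


def decode_ioctl(cmd):
    """Split a 32-bit ioctl request number into its four fields."""
    return Ioctl(
        direction=_DIRECTIONS[(cmd >> 30) & 0x3],
        size=(cmd >> 16) & 0x3FFF,
        type=chr((cmd >> 8) & 0xFF),
        nr=cmd & 0xFF,
    )


# Two of the request numbers that appear in the trace.
for cmd in (0xC00C6469, 0xC0206466):
    print(hex(cmd), decode_ioctl(cmd))
```

The type byte is 'd' (0x64) in every request shown, and the decoded nr matches the nr= field that drm_ioctl logs. In the trace, nr=0x69 is followed by radeon_gem_get_tiling_ioctl and nr=0x66 by the radeon_cs_ioctl error, which suggests the failing request is the command-stream submission, carrying a 32-byte argument.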

Martin Garton (garton) wrote :

I decided to run strace on compiz, and perhaps this is a clue. Here is an excerpt from the strace output just before the error:

8346 ioctl(5, 0xc010640b, 0xbfe94660) = 0
8346 ioctl(5, 0xc00c6469, 0xbfe94684) = 0
8346 ioctl(5, 0xc010640b, 0xbfe94660) = 0
8346 ioctl(5, 0xc00c6469, 0xbfe94684) = 0
8346 ioctl(5, 0xc010640b, 0xbfe94660) = 0
8346 ioctl(5, 0xc00c6469, 0xbfe94684) = 0
8346 ioctl(5, 0x40086409, 0xbfe94728) = 0
8346 ioctl(5, 0x40086409, 0xbfe94728) = 0
8346 ioctl(5, 0xc0206466, 0xa1d4174) = -1 ENOMEM (Cannot allocate memory)
8346 write(2, "drmRadeonCmdBuffer: -12. Kernel "..., 101) = 101

So immediately before the error I am getting ENOMEM returned. FYI, fd 5 is /dev/dri/card0
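
The -12 in the drmRadeonCmdBuffer message and the ENOMEM in the strace output are the same value: the failing call's errno, printed as a negative number. As a sketch, here is a quick way to look up such codes. The helper name describe_kernel_error is hypothetical, and errno numbering is platform-specific (12 maps to ENOMEM on Linux).

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negative kernel-style return code (e.g. -12) to (errno name, message)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


print(describe_kernel_error(-12))
```

This only names the error. It does not say which allocation failed (system RAM, GART space, or something else).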

Martin Garton (garton) wrote :

This is still marked as incomplete. Can anyone let me know what else is needed before it is "complete" again? I don't know what other information is needed.

Martin Garton (garton)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
John Gilmore (gnu-gilmore) wrote :

This is also happening for me since I upgraded my HP dc5750 PC to Ubuntu 10.04. The desktop runs fine for a few days, then I start seeing odd "static" in windows, including lots of horizontal lines in parts of Firefox windows, plus sometimes, garbled fonts in some firefox windows. Then without warning the window manager goes away, leaving me a bunch of undecorated windows.

I eventually figured out (by comparing ps listings before and after) that when that happens, compiz has died. When I'm in that state (as my desktop is right now as I type this), I can run compiz from a terminal window, and it reports:

  WARNING: Application calling GLX 1.3 function "glXCreatePixmap" when GLX 1.3 is not supported! This is an application bug!
  drmRadeonCmdBuffer: -12. Kernel failed to parse or rejected command stream. See dmesg for more info.

  [1]+ Exit 244 /usr/bin/compiz

and almost immediately terminates. (It starts to decorate windows, etc, then it all gets torn down again.)

dmesg reports:

  [356951.921231] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

a few times when that happens.

It's not clear whether compiz is building a parameter to drm wrongly, or whether drm is wrongly rejecting a valid parameter from compiz. But what IS clear is that restarting compiz once the system gets into this state doesn't do any good. However, logging out (using the menu item at the upper right corner of the screen) and logging back in DOES eliminate the problem (for a few days). And a power cycle also eliminates it.

What further info can we get you to help debug it? It happens pretty regularly, and when it does, I can usually type into a terminal and/or a web browser, so there's lots that could be done. (Sometimes it refuses to change the input focus while in this state; I haven't figured out why or how to work around that.)

John Gilmore (gnu-gilmore) wrote :

I had originally marked bug #583891 as a duplicate of this. Someone else detached it, arguing that it may not be the same bug because the X server or other processes are crashing with the "Failed to parse relocation -12" message, rather than compiz. This makes me suspect a bug in the kernel DRM driver, rather than in compiz or the other crashing programs.

Perhaps these are the same bug, perhaps different. I think that whoever looks at either bug should know about the existence of the other, though, hence this comment.

John Gilmore (gnu-gilmore) wrote :

Here's a screenshot of sample screen corruption. This comes and goes (making the program repaint the window often fixes it). Notice:

  * Mangled fonts, e.g. in right-hand side of screen "Change your plan or services" mangles the "g", and it's also mangled above there in "messages you've used". Also mangled in the text boxes, though it's harder to tell since they got overlaid with upside-down stuff.

  * "Welcome back!" box was rendered upside-down, along with "Log in to My T-Mobile".

  * Horizontal bar of colored "static" near bottom of window. Usually when I see this static, it's not so colorful, more like diagonal lines of black & white. See also the steeper diagonal line of black & white cutting through the bottom blue bar.

Usually by the time it starts mangling fonts, it's getting close to crashing compiz within the next hour. The horizontal lines of "static" come and go fairly frequently, for any window that's updating on the screen, and don't seem to lead to immediate crashes.

John Gilmore (gnu-gilmore) wrote :

I've just found a way to reproduce the issue on my system.

I ran "evince" (the PDF viewer) on a file someone sent me, which produces a very wide window that fills the entire screen. The first time I ran this, compiz immediately died after this window went up. So I investigated. I logged out and back in, tried just running evince on this file. No crash. So logged out and back in, started four terminal windows (my usual), then evince, no crash. So started Firefox with my usual sixty or seventy tabs, then ran evince. No crash. But when I moved that window off the lower right corner of the screen (so that about 40% of it was on the screen, the rest offscreen), bam, compiz failed.

I rebooted the system, logged in, started four terminals, checked dmesg (no "relocation" messages), started Firefox, ran evince on this file. No crash. Used the middle button in the window frame top to move it around and generally offscreen in the lower right corner. No crash. Let it sit there for a few seconds after releasing the middle button. Bam, crash! And there in dmesg were seven messages:

[ 525.385000] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 525.824032] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 526.129675] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 531.537754] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 536.836914] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 536.872722] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

[ 536.940502] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
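
To track how often the error fires, the timestamps can be pulled out of dmesg output. This is a minimal sketch that assumes the message format shown in this report (with or without the trailing error code). The function name is illustrative.

```python
import re

# Matches both "Failed to parse relocation !" and "Failed to parse relocation -12!"
RELOC_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:radeon_cs_ioctl\]\s+\*ERROR\*\s+"
    r"Failed to parse relocation\s*(?P<code>-?\d+)?\s*!"
)


def find_relocation_errors(dmesg_text):
    """Return (timestamp_seconds, error_code_or_None) for each matching line."""
    hits = []
    for line in dmesg_text.splitlines():
        m = RELOC_RE.search(line)
        if m:
            code = int(m.group("code")) if m.group("code") else None
            hits.append((float(m.group("ts")), code))
    return hits


# Sample lines copied from this report.
sample = """\
[ 525.385000] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[ 1217.412221] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation !
"""
print(find_relocation_errors(sample))
```

Because the pattern is applied with search rather than match, it also works on syslog lines that carry a date and hostname prefix before the kernel timestamp.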

I've attached the PDF file that triggers this behavior.

This is on an HP dc5750 (AMD dual-core desktop, 5GB RAM, with Radeon graphics attached to 1920x1200 HP LP2465 screen). I'll attach dmesg | egrep "drm|radeon" output and X log.

Hmm, there is an *ERROR* in dmesg about forcing to 32M GART size (because of ASIC bug ?) -- could that be related?

John Gilmore (gnu-gilmore) wrote :
John Gilmore (gnu-gilmore) wrote :
Stephen Gornick (sgornick) wrote :

As a workaround until this is resolved, I use a script:

$ cat bin/fixcompiz
#! /bin/sh
export DISPLAY=:0.0
/usr/bin/xhost local:$USER
/usr/bin/nohup /usr/bin/metacity --replace > /dev/null &

So when I get the error, I go to a tty (e.g., ctrl-alt-F1) and run the script from there. The downside is that all of my open apps are now in workspace #1 and I need to move them if I want them back in the workspaces they occupied pre-crash.

luktay (luktay) wrote :

After reading the comments I feel this is related to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/583891

I also recover by running /usr/bin/metacity --replace from an icon I have placed in the panel.

I have tried the latest kernel, 2.6.35-17-generic, from the PPA added with "sudo add-apt-repository ppa:kernel-ppa/pre-proposed", but I still get this problem:

[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

penalvch (penalvch) wrote :

Martin Garton, thank you for reporting this and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal)? It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired