screen corruption and segfaults with radeon m6 ly

Bug #421842 reported by Cedders
This bug affects 6 people
Affects: xserver-xorg-video-ati (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: xorg

After an upgrade from 9.04 to 9.10 (which was interrupted as a result of bug 404331), several new problems have appeared. Font rendering in Abiword shows a grid pattern which alters as each character is added to the text; certain applications and windows (reliably System Monitor and the NetworkManager 'connected' pop-up) display, instead of their ordinary contents, a pattern that I can best describe as tartan (although the attached images aren't in tartan colours); with DRI on, when a window goes out of focus it is often corrupted; and (also only when DRI is enabled) when creating a large window (e.g. for Thunderbird), Xorg segfaults, part of the left-hand side of the screen appears on the right and I am returned to the GDM greeter.

However, these problems do not occur when running from the Karmic live CD alpha-4 (see bug 404331 for additional info). (Since these problems are new in Karmic, and the upgrade was not at all smooth and required some recovery, I suppose they could be the result of corrupt packages.)

ProblemType: Bug
Architecture: i386
CurrentDmesg:
 [ 119.832047] Clocksource tsc unstable (delta = -156359368 ns)
 [ 134.069580] ndiswrapper (iw_set_auth:1585): invalid cmd 12
 [ 135.005185] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
 [ 145.624053] wlan0: no IPv6 routers present
Date: Mon Aug 31 10:13:51 2009
DistroRelease: Ubuntu 9.10
Lsusb:
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Viglen LX
Package: xorg 1:7.4+3ubuntu5
PccardctlIdent:
 Socket 0:
   no product info available
 Socket 1:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
 Socket 1:
   3.3V 32-bit PC Card
ProcCmdLine: root=UUID=8e3337bc-42ea-4a6f-868e-0c3859558c80 ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.28-15.49-generic
RelatedPackageVersions:
 xserver-xorg 1:7.4+3ubuntu5
 libgl1-mesa-glx 7.6.0~git20090817.7c422387-0ubuntu3
 libdrm2 2.4.12+git20090801.45078630-0ubuntu1
 xserver-xorg-video-intel 2:2.8.1-1ubuntu1
 xserver-xorg-video-ati 1:6.12.99+git20090825.fc74e119-0ubuntu1
SourcePackage: xorg
Uname: Linux 2.6.28-15-generic i686
dmi.bios.date: 08/07/02
dmi.bios.vendor: Phoenix Technologies LTD
dmi.bios.version: R1.16
dmi.board.name: Almador System CLEVO:4200-000
dmi.board.vendor: CLEVO Corporation
dmi.board.version: None
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 1
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLTD:bvrR1.16:bd08/07/02:svnViglen:pnLX:pvr0090F503C297:rvnCLEVOCorporation:rnAlmadorSystemCLEVO4200-000:rvrNone:cvnNoEnclosure:ct1:cvrN/A:
dmi.product.name: LX
dmi.product.version: 00:90:F5:03:C2:97
dmi.sys.vendor: Viglen
fglrx: Not loaded
system:
 distro: Ubuntu
 architecture: i686
 kernel: 2.6.28-15-generic

Bryce Harrington (bryce)
affects: xorg (Ubuntu) → xserver-xorg-video-ati (Ubuntu)
Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati (Ubuntu):
status: New → Confirmed
Bryce Harrington (bryce)
tags: added: karmic
summary: - [karmic] screen corruption and segfaults with radeon m6 ly
+ screen corruption and segfaults with radeon m6 ly
Bryce Harrington (bryce) wrote :

Well, we probably should start by ruling out the not-smooth upgrade. Would you mind booting the alpha-4 livecd again, updating it to latest karmic, and restarting X (log out / log in should do it)? Or alternatively, alpha-5 will be out in a couple days if you'd like to wait and test that.

If you find you can't reproduce it like that, then the next question would be to see if some packages got messed up. You could do this by purging and reinstalling mesa, xorg-server, and -ati. See https://wiki.ubuntu.com/X/Troubleshooting/FglrxInteferesWithRadeonDriver for an example of the commands to do this (even though you didn't have -fglrx installed, the procedure for resetting -ati should be similar.)

Changed in xserver-xorg-video-ati (Ubuntu):
status: Confirmed → Incomplete
Cedders (cedric-gn) wrote :

I don't have enough memory (640M) to update Live CD to latest, but have now tried with alpha-5. There are reproducible screen problems with alpha-5 (e.g. hang after resizing a window), where everything seems fine in alpha-4. I have not seen the screen corruption in alpha-4 either, but attach an example from alpha-5.

* alpha-4 live CD - fine
* alpha-5 live CD - screen corruption and hangs (apparently doing something with dd and rsyslogd)
* current HD Karmic (with DRI off) - screen corruption
* current HD Karmic (with DRI on) - screen corruption and hangs

On the hard drive, in fact I had installed and uninstalled the fglrx driver earlier on, under Hardy, as an attempt to fix the lockups in bug 404331, but not since upgrading to Intrepid. I followed the instructions at https://wiki.ubuntu.com/X/Troubleshooting/FglrxInteferesWithRadeonDriver and there are no fglrx packages installed, only a few files related to fglrx. I tried both methods to reinstall with no effect on the 'tartan' corruption, and in fact went further in purging libgl1-mesa-glx, libgl1-mesa-dri and xserver-xorg-core, and then reinstalling ubuntu-desktop. Are there any other packages it might be worth purging? hald? BTW on that wiki page, I think

  sudo apt-get remove --purge fglrx*

should be

  sudo apt-get remove --purge 'fglrx*'
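The difference between the two forms comes from shell globbing. A minimal sketch, assuming a scratch directory that happens to contain a file whose name starts with fglrx (the file name below is made up for illustration); echo stands in for apt-get so nothing is removed:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch fglrx-installer.log   # hypothetical file name, for illustration only

# Unquoted: the shell expands the pattern against the current directory
# before the command ever sees it.
unquoted=$(echo fglrx*)

# Quoted: the pattern reaches the command literally, so the command
# (apt-get in the wiki example) can match it against package names itself.
quoted=$(echo 'fglrx*')

echo "unquoted -> $unquoted"
echo "quoted   -> $quoted"
```

When nothing in the current directory matches, most shells by default pass the unquoted pattern through unchanged, which is probably why the unquoted form often appears to work.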

Cedders (cedric-gn) wrote :

On Live CD alpha-5 dmesg is full of

[ 1044.070800] [drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held, held -2147483648 owner e3d01240 e3d01b40
[ 1044.070834] [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held, held -2147483648 owner e3d01240 e3d01b40
...
[ 1047.360532] [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held, held -2147483648 owner e3d01240 e3d01b40
[ 1047.371003] [drm:radeon_cp_reset] *ERROR* radeon_cp_reset called without lock held, held -2147483648 owner e3d01240 e3d01b40
[ 1047.371062] [drm:radeon_cp_start] *ERROR* radeon_cp_start called without lock held, held -2147483648 owner e3d01240 e3d01b40
[ 1047.371097] [drm:radeon_cp_idle] *ERROR* radeon_cp_idle called without lock held, held -2147483648 owner e3d01240 e3d01b40

Also attaching a ps from alpha-5 live CD showing the 'dd' command associated with the hang. Alt+SysRq+K didn't work, nor gdm restart, but Alt+SysRq+B did reboot.

Cedders (cedric-gn) wrote :

Some more information from M6 LY world...

1) Starting from Karmic alpha-4 (LiveCD), which works, I can upgrade xserver-xorg-video-radeon to the karmic repository version, and xserver-xorg-core and all the -video packages, compiz, and hal/libhal, and finally install libdrm-radeon1 to get DRI functionality back (otherwise it tends to fall back on vesa). I can't upgrade libgl1-mesa-dri because of bug 420617. After all this, everything still seems to work fine, compiz effects and all. This makes me suspect libgl1-mesa-dri (or maybe the window-decorator?).

2) With the latest updates, the 'tartan' screen corruption still occurs if and only if DRI is off. By latest, I mean
libgl1-mesa-dri/karmic uptodate 7.6.0~git20090817.7c422387-0ubuntu3
ubuntu-desktop/karmic uptodate 1.167

I'm not sure the best way to test this on the live CD - unload the drm module?

3) With the latest updates, and if and only if DRI is on, I still get hangs/crashes: either (a) with NoTrapSignals left as false, reams of error messages like the above 'radeon_cp_start called without lock held' and a lockup where the cursor is stationary and I can't attach gdb; or (b) with NoTrapSignals true, a crash in Xorg as follows:

Sep 7 20:41:23 laptop23 kernel: [ 213.306512] Xorg[3115]: segfault at 14 ip 00f740c5 sp bfd247c0 error 4 in radeon_dri.so[f24000+255000]
Sep 7 20:41:25 laptop23 kernel: [ 215.127370] agpgart-intel 0000:00:00.0: AGP 2.0 bridge
Sep 7 20:41:25 laptop23 kernel: [ 215.127401] agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
Sep 7 20:41:25 laptop23 kernel: [ 215.127433] pci 0000:01:00.0: putting AGP V2 device into 4x mode
Sep 7 20:41:25 laptop23 kernel: [ 215.376447] [drm] Setting GART location based on new memory map
Sep 7 20:41:25 laptop23 kernel: [ 215.376467] [drm] Loading R100 Microcode
Sep 7 20:41:25 laptop23 kernel: [ 215.376513] [drm] writeback test succeeded in 1 usecs
Sep 7 20:43:29 laptop23 kernel: [ 339.623068] Xorg[5247]: segfault at 14 ip 00f200c5 sp bfb98040 error 4 in radeon_dri.so[ed0000+255000]

Would a core dump be useful?
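One thing that can be read off the two segfault lines without a core dump is where inside radeon_dri.so the fault happened. A rough sketch, assuming the bracketed value in the kernel message is the start address of the library's mapping (so ip minus that base gives the offset into the mapped file); the variable names are only illustrative:

```shell
# ip and mapping base are copied from the two kernel log lines above.
offset1=$(printf '0x%x' $((0x00f740c5 - 0xf24000)))
offset2=$(printf '0x%x' $((0x00f200c5 - 0xed0000)))

echo "first crash:  $offset1"
echo "second crash: $offset2"
```

If the two offsets agree, both crashes most likely hit the same instruction in the library, even though it was loaded at a different address each time.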

Also under the same conditions, I get the partial distortion of a window as shown in the 517K PNG. The easiest way to produce this is to drag a window over a window that does not have the focus.

The segfault (or hang) is most easily produced by enlarging a window that is about 600w x 400h until it is say 1100w x 700h, or starting an application that would produce a larger window.

4) Ctrl-Alt-F1 isn't changing to text mode properly with the very latest updates.

5) I won't be able to confirm bug 404331 is fixed for me until this one is fixed, since if DRI is enabled, this bug happens more frequently.

Cedders (cedric-gn) wrote :

Can reproduce all these bugs on latest updates and alpha-5 live CD, but not alpha-4. To reproduce 'tartan' effect with alpha-5, uninstall compiz-gnome, logout, and run GNOME System Monitor (also seems to occur when there is no window manager, so don't think it's a metacity problem).

Compiz Segfault apparently depends on area of window, not height or width. Best workaround remains Option "DRI" "false".

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Confirmed
Cedders (cedric-gn) wrote :

It looks like 2 or 3 separate bugs all introduced into radeon_drv.so since Jaunty. From latest karmic, if I do

  sudo dpkg -i --force-downgrade xserver-xorg-video-radeon_6.12.1-0ubuntu2_i386.deb

everything is fine. Compiz effects are a bit slow, but I don't get segfaults or any screen corruption. So I tracked through the git versions...

1) A small amount of screen corruption around the window borders (with DRI and compiz visual effects) seems to have been introduced between xf86-video-ati-6.12.1 and xf86-video-ati-6.12.2. (I think the console/mode switching problems may have started during this period too, but were later fixed.)

2) The segfault was introduced between xf86-video-ati-6.12.2 and xf86-video-ati-6.12.3 in commit 3c348091ae5d88ef3cb850889ca74674e0530b4e (see attached). The conditions to produce this involve resizing a window or doing an operation larger than particular dimensions (with DRI and compiz visual effects), and it looks like the bad commit sets some dimension maxima too optimistically. Changing a line to
    xf86CrtcSetSizeRange (pScrn, 320, 200, 2560, 1200);
stops the segfault, but I guess problems may go deeper than just that line, because it shouldn't segfault at all. This bug is also associated with screen corruption over large areas when a window with focus is dragged over one without focus.

3) The "tartan" patterns and font rendering problems (which only occur *without* DRI and compiz visual effects) are not present in xf86-video-ati-6.12.4, but are in the master head. I don't know how to locate these to an exact commit because of a merge and some compilation problems.

Please let me know if I should do anything else, like report these upstream. I'd suggest this bug (421842) concentrates on the segfault (number 2 above), and I could open another ubuntu bug for number 3). Hope this isn't too confusing.

Luca Aluffi (aluffilu) wrote :

Same problem here, with no effects activated, on a Radeon M6 LY. Anyway, I have just 16 MB of memory, so I can't activate them.
It happens with:
1) Network manager's notification panel in gnome;
2) Task bar, menu and windows' borders in kde.

Obviously, kde is almost unusable, so I've uninstalled kubuntu in favour of ubuntu

je (3-6) wrote :

I experienced the same problem (general screen corruption and system monitor) also with Ubuntu Karmic alpha 5 and Karmic beta (live usb) on my Thinkpad x32 (ATI M6 LY 16MB).
Didn't notice the font corruption after disabling visual effects though.

Luca Aluffi (aluffilu) wrote :

Uhm... I can't enable effects because my old laptop has a 1400x1050 screen and I simply don't have enough memory to do it.

The most annoying thing is that it happens in everything that belongs to the notification area/system, which makes it hard to understand what the computer is trying to tell you.

Luca Aluffi (aluffilu) wrote :

Found a new clue: if I generate an xorg.conf and then set option "accelmethod" "EXA", the "tartan" effect goes away and everything looks good. On the other hand, performance is poorer, even in Firefox.

Maybe the trouble is in the XAA method, which seems to have changed from Jaunty to Karmic.

Cedders (cedric-gn) wrote :

My problems 1-3 above still exist with

ubuntu-desktop/unknown uptodate 1.174
xserver-xorg-video-ati/unknown uptodate 1:6.12.99+git20090929.7968e1fb-0ubuntu1
xserver-xorg-video-radeon/unknown uptodate 1:6.12.99+git20090929.7968e1fb-0ubuntu1

(and also sometimes console switching with Ctrl+Alt+F1 still doesn't change to text mode.)

As Luca says, problem 3 (tartan screen corruption and font problems with visual effects off) disappears with
   Option "AccelMethod" "EXA"
and in fact so do problems 1-2 (crashing and general screen corruption which only occur with Compiz effects enabled).
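For anyone wanting to try the same workaround, the option goes in the Device section of xorg.conf. A sketch only: the Identifier string, the Driver line and the use of a temporary file are assumptions for illustration, and an existing xorg.conf may already have a Device section that should be edited instead. The snippet writes the fragment to a scratch file for review and does not touch /etc/X11/xorg.conf:

```shell
# Write a candidate Device section to a scratch file for review.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Section "Device"
    # Identifier is an arbitrary label (assumption, pick your own)
    Identifier "Radeon M6 LY"
    # Driver name assumed; the -ati package also provides a wrapper
    Driver     "radeon"
    # Workaround reported above: use EXA instead of XAA
    Option     "AccelMethod" "EXA"
    # Alternative workaround for the segfault only:
    # Option   "DRI" "false"
EndSection
EOF

echo "Sketch written to $conf"
cat "$conf"
```

As reported in this thread, EXA trades the corruption for lower performance, so it is worth testing both settings on the affected hardware.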

je - if you have time, could you try typing in Abiword with visual effects off?

je (3-6) wrote :

Hi

just tested Karmic RC (live USB) on my Thinkpad x32 (ATI M6 LY).

With visual effects activated (default), artifacts appear and quickly fill the screen, which becomes unusable (as before). Notifications aren't visible (white background with light artifacts) and the system monitor is, as mentioned before, tartan style.

Without visual effects, there aren't any artifacts except in the notifications/system monitor, where the problem remains.

Conclusion: nothing new.

By the way, I tested Abiword without visual effects and typed some text in it and didn't see any corruption.

Je

cyphax (r-elemans2) wrote :

I'm experiencing the same issues on the same hardware (Radeon ML6/7000), and changing from XAA to EXA indeed solves the screen corruption but degrades performance (noticeable when dragging windows, for example). I noticed the screen corruption with the system monitor, compiz, and notifications but also with Gnometris. Don't those applications all use Cairo, or is it coincidental? Other applications such as Evolution or Firefox showed no artifacts at all, but something as "simple" as Gnometris did.

Raqua (raquacontact) wrote :

I suffer the same bug. After some googling and experimenting, this xorg.conf works best for me. Not 100% but quite acceptable.

Randall Ross (randall) wrote :

Raqua: Does your xorg.conf (above) allow Desktop Effects to be turned on satisfactorily?

Raqua (raquacontact) wrote :

Randall Ross: Not really, I turned them off, but at least no artifacts etc. But performance drop is significant, when watching video, my CPU is maxed.

Luca Aluffi (aluffilu) wrote :

Ok: we are officially adrift...

With some of last karmic's updates (proposed and backports flagged on) we have 2 choices:
1) With EXA mode there is no longer a way to watch videos, but notifications are working like a charm;
2) With XAA video works but tartan is back...

I wish there could be the third choice... :(

It does not depend on the kernel version: I've already tried.

Bryce Harrington (bryce)
tags: added: corruption
Bryce Harrington (bryce) wrote :

[This is an automatic notification.]

Hi Cedders,

This bug was reported against an earlier version of Ubuntu, can you
test if it still occurs on Lucid?

Please note we also provide technical support for older versions of
Ubuntu, but not in the bug tracker. Instead, to raise the issue through
normal support channels, please see:

    http://www.ubuntu.com/support

If you are the original reporter and can still reproduce the issue on
Lucid, please run the following command to refresh the report:

  apport-collect 421842

If you are not the original reporter, please file a new bug report, so
we can work with you as the original reporter instead (you can reference
bug 421842 in your report if you think it may be related):

  ubuntu-bug xorg

If by chance you can no longer reproduce the issue on Lucid or if you
feel it is no longer relevant, please mark the bug report 'Fix Released'
or 'Invalid' as appropriate, at the following URL:

  https://bugs.launchpad.net/ubuntu/+bug/421842

Changed in xserver-xorg-video-ati (Ubuntu):
status: Confirmed → Incomplete
tags: added: needs-retested-on-lucid-by-june
Bryce Harrington (bryce)
tags: added: jaunty
Bryce Harrington (bryce) wrote :

We're closing this bug since it has been some time with no response from the original reporter. However, if the issue still exists please feel free to reopen with the requested information. Also, if you could, please test against the latest development version of Ubuntu, since this confirms the bug is one we may be able to pass upstream for help.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Expired