(Needs kernel 2.6.30-rc8) Text corruption with latest UXA-default Intel driver (UXA bug)

Bug #383411 reported by Michael Terry
This bug affects 1 person
Affects                            Status        Importance  Assigned to  Milestone
xf86-video-intel                   Fix Released  High
xserver-xorg-video-intel (Ubuntu)  Fix Released  Medium      Unassigned

Bug Description

Binary package hint: xserver-xorg-video-intel

After updating my Karmic machine to the new UXA-by-default Intel driver, I'm seeing some odd text corruption in places. I'm attaching a screenshot.

In the screenshot, the text is supposed to say "Here's a patch to add info codes for the various upload worker thread events. Useful for showing a different status message in Deja Dup when we're uploading."

Reloading the page doesn't make it go away. I've seen it on other web pages too, where some sections of text, but not all, are corrupted. I've also seen it on the gnome-screensaver 'lock screen' login window. So I don't think it's Firefox-only. Some pango thing?

ProblemType: Bug
Architecture: i386
Date: Wed Jun 3 19:34:15 2009
DistroRelease: Ubuntu 9.10
MachineType: Dell Inc. Inspiron 1420
Package: xserver-xorg-video-intel 2:2.7.99.1+git20090602.ec2fde7c-0ubuntu1
ProcCmdLine: root=UUID=03105272-3f8b-4e3f-aeb4-611d88c12206 ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.30-7.8-generic
RelatedPackageVersions:
 xserver-xorg 1:7.4~5ubuntu20
 libgl1-mesa-glx 7.4.1-1ubuntu1
 libdrm2 2.4.11-0ubuntu1
 xserver-xorg-video-intel 2:2.7.99.1+git20090602.ec2fde7c-0ubuntu1
 xserver-xorg-video-ati 1:6.12.2-1ubuntu1
SourcePackage: xserver-xorg-video-intel
Uname: Linux 2.6.30-7-generic i686
dmi.bios.date: 05/23/2007
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A00
dmi.board.name: 0DT492
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA00:bd05/23/2007:svnDellInc.:pnInspiron1420:pvr:rvnDellInc.:rn0DT492:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Inspiron 1420
dmi.sys.vendor: Dell Inc.
fglrx: Not loaded
fglrx-loaded: Error: command ['grep', 'fglrx', '/var/log/kern.log', '/proc/modules'] failed with exit code 1:
system:
 distro: Ubuntu
 architecture: i686
 kernel: 2.6.30-7-generic

In , Remi (remi) wrote :

Just to clarify the bug report a little, this bug is not specific to OOo. I had it in gitk and firefox. As for pointing at the glyph cache, it's because in all the reports, it seems that text pixmaps are impacted first.

But on my own laptop, I've sometimes seen corruption of small pixmaps such as thumbnails in firefox.

In any case, the corruption seems to happen when the system memory is under heavy load.

FWIW, here's a Fedora bug report that looks identical: https://bugzilla.redhat.com/show_bug.cgi?id=495323

Thanks

In , Hubert Figuiere (hub) wrote :

As I mentioned on the Red Hat bug report, I hit this bug sooner when I only had 768MB.

In , Faibistes (faibistes) wrote :

Same thing here. Ubuntu Jaunty. It didn't happen with the stock Ubuntu drivers and kernel, but started happening in some apps (mostly, but not only, with fonts) after upgrading the kernel to 2.6.29-02062903-generic and the drivers to 2.7.1-0ubuntu1~xup~1.

Affected apps include Firefox, OOo, Lotus Notes 8.5, gnome-terminal.

Section "Device"
        Identifier "Configured Video Device"
        Option "AccelMethod" "uxa"
        Option "EXAOptimizeMigration" "true"
        Option "MigrationHeuristic" "greedy"
        Option "Tiling" "false"
EndSection

In , Faibistes (faibistes) wrote :

(In reply to comment #3)
> Same thing here. Ubuntu Jaunty. Didn't happen with Ubuntu stock drivers+kernel,
> but started happening on some apps (mostly, but not only, with fonts) after
> upgrading kernel to 2.6.29-02062903-generic and drivers to
> 2.7.1-0ubuntu1~xup~1.
>
> Affected apps include Firefox, Ooo, Lotus Notes 8.5., gnome-terminal.
>
> Section "Device"
> Identifier "Configured Video Device"
> Option "AccelMethod" "uxa"
> Option "EXAOptimizeMigration" "true"
> Option "MigrationHeuristic" "greedy"
> Option "Tiling" "false"
> EndSection
>

Edit: When I experienced the issue, the original xorg.conf had Tiling=true. I've changed it to see if it's a valid workaround. It hasn't happened (yet) with Tiling=false, but it may happen anyway. It takes some time.

In , Faibistes (faibistes) wrote :

(In reply to comment #4)
> (In reply to comment #3)
> > Same thing here. Ubuntu Jaunty. Didn't happen with Ubuntu stock drivers+kernel,
> > but started happening on some apps (mostly, but not only, with fonts) after
> > upgrading kernel to 2.6.29-02062903-generic and drivers to
> > 2.7.1-0ubuntu1~xup~1.
> >
> > Affected apps include Firefox, Ooo, Lotus Notes 8.5., gnome-terminal.
> >
> > Section "Device"
> > Identifier "Configured Video Device"
> > Option "AccelMethod" "uxa"
> > Option "EXAOptimizeMigration" "true"
> > Option "MigrationHeuristic" "greedy"
> > Option "Tiling" "false"
> > EndSection
> >
>
> Edit: When I experienced the issue, the original xorg.conf had Tiling=true,
> I've changed it to see if it's a valid workaround. It hasn't happened (yet)
> with Tiling=false, but it may happen anyway. It takes some time.
>
Edit2: The bug is reproducible with Tiling=false, too.

In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Can anyone reproduce the problem after disabling swapping (doing swapoff on their swap partitions/files)?
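For anyone running this test, it helps to confirm that no swap area is still active after `swapoff -a` before trying to reproduce the corruption. A minimal Python sketch, assuming the usual /proc/swaps layout (a header line, then one line per active swap area with sizes in KiB). The function names and the sample text are made up for illustration and are not taken from any reporter's machine.

```python
from pathlib import Path


def parse_proc_swaps(text):
    """Return the active swap areas listed in /proc/swaps-style text.

    The first line is the column header (Filename, Type, Size, Used,
    Priority); every later line describes one active swap area.
    """
    areas = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        areas.append({
            "device": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
        })
    return areas


def swap_is_active(path="/proc/swaps"):
    """True if the kernel currently reports at least one swap area."""
    return bool(parse_proc_swaps(Path(path).read_text()))


# Illustrative sample only (hypothetical device and sizes).
SAMPLE = (
    "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"
    "/dev/sda5                               partition\t2000052\t13500\t-1\n"
)

if __name__ == "__main__":
    for area in parse_proc_swaps(SAMPLE):
        print(area["device"], area["used_kib"], "KiB used")
```

On a live system, `swap_is_active()` returning False after `swapoff -a` means the test is running with swapping really disabled.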

In , Vytas (vytautas1987) wrote :

I ran swapoff -a and still reproduced the white-stripes version of the bug instantly with the horizontal scrollbar. It may be even easier to reproduce now.

In , Kjb (kjb) wrote :

Vytautas: this bug is about font glyph rendering errors and not about scrollbars. I suppose you're looking for an answer to a different bug.

I've turned off swap and have not seen the font problem for about a day now. (Before, I usually noticed some odd glyphs within a few hours.) It will take a few days before I can be really sure, but it's looking good right now.
Of course, I'd like the option to swap back ;)

Jesse: I'm very curious about the relation between the glyph cache and whether or not swap is enabled.

In , Dark-shadow (dark-shadow) wrote :

Created an attachment (id=26061)
Screenshot showing corruption in Mozilla Firefox

Hi, I guess I have the same problem (VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)). It occurs in Firefox and Emacs-23 after some time. There is nothing in dmesg; apart from this, everything works fine.

Using current git versions of drm, mesa, xf86-video-intel and linux-2.6.29 (patched with tuxonice).

I will check if it happens without swap too.

In , Vytas (vytautas1987) wrote :

Without swap, I no longer get those crazy letters and numbers.
It looks like a good workaround.

BUT I still get white stripes and colorful stripes. Should I file a separate bug?
Check my images.

In , Raul Sanchez Siles (rasasi78) wrote :

Created an attachment (id=26139)
Severe font corruption.

Hello all:

This is a screenshot of what I found after having my laptop unattended all night. This is a severe case, but I usually had minor issues on certain glyphs, similar to the other screenshot in the bug.

GM965GM, intel driver 2.7.99.1, linux 2.6.29.3 + TuxOnIce, no KMS, libdrm 2.4.11, mesa 7.4.1

If you need xorg conf or log, please let me know.

I also had this starting with 2.7.0, already using UXA. When I upgraded to 2.7.99.1 things improved a little, but the problem is still there. I noticed then that it seemed related to memory management, and I went to the IRC channel with that suspicion, but I didn't get much information there. The problem got worse under high memory usage, and memory rotation, i.e. reusing an application that had been idle for a while, affected the font rendering.

After reading this bug I ran swapoff -a and things did improve. I rarely see any of this corruption now, but I can still notice some glitches; for instance, the '[]' chars in this form are rendered as noise.

I'm also very curious how swapping affects font rendering, so I'd appreciate some note about it.

HTH,

In , Vytas (vytas) wrote :

If I disable swap, I can't reproduce this issue, but the system grinds to a halt instead. The X server's virtual memory usage (VIRT) climbs slowly but steadily to something like 700M, and then (since I have 1G of RAM) either the system becomes unresponsive (without swap), or some memory is swapped to disk and glyphs begin to deform.
I understand that a process's virtual memory may include mmap-ed regions and the like, but growing to 700M+ still seems odd. Are there any (video?) memory leaks in the pixmap management of the new intel drivers?

In , Dark-shadow (dark-shadow) wrote :

Like Vytas in comment #12, I also notice an improvement when deactivating swap, but the system becomes slower and slower to respond, and I can see heavy disk activity, especially when compiling things. Keyboard input and responses are delayed by about half a minute (and it gets worse over time).

In , Hubert Figuiere (hub) wrote :

As I said on the Red Hat bug report, it happened faster when I only had 768MB than with 1.5GB, with the same amount of swap on the same hardware.

And since I disabled KMS at boot, it no longer happens.

In , Vytas (vytautas1987) wrote :

I reproduced the bug at full effect without swap under heavy load, when compiling things and working with OOo at the same time.

In , Vytas (vytautas1987) wrote :

Created an attachment (id=26213)
same bug or other here?

I just selected many cells many times, and here are 100% reproducible colorful stripes (the blue ones).

In , Eric Anholt (eric-anholt) wrote :

Vytautas: Does the following patch queued up to for-linus in the kernel help you?

commit 07f4f3e8a24138ca2f3650723d670df25687cd05
Author: Kristian Høgsberg <email address hidden>
Date: Wed May 27 14:37:28 2009 -0400

    i915: Set object to gtt domain when faulting it back in

    When a GEM object is evicted from the GTT we set it to the CPU domain,
    as it might get swapped in and out or ever mmapped regularly. If the
    object is mmapped through the GTT it can still get evicted in this way
    by other objects requiring GTT space. When the GTT mapping is touched
    again we fault it back into the GTT, but fail to set it back to the
    GTT domain. This means we fail to flush any cached CPU writes to the
    pages backing the object which will then happen "eventually", typically
    after we write to the page through the uncached GTT mapping.

    [anholt: Note that userland does do a set_domain(GTT, GTT) when starting
    to access the GTT mapping. That covers getting the existing mapping of the
    object synchronized if it's bound to the GTT. But set_domain(GTT, GTT)
    doesn't do anything if the object is currently unbound. This fix covers the
    transition to being bound for GTT mapping.]

    Fixes glyph and other pixmap corruption during swapping. fd.o bug #21790

    Signed-off-by: Kristian Høgsberg <email address hidden>
    Signed-off-by: Eric Anholt <email address hidden>

(Swapping isn't the only case that this patch can fix, but it's the most common one, as the CPU cache of the object will be hot with writes at the time we don't want it to be.)
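To make the sequence in the commit message easier to follow, here is a toy Python model of it. This is only an illustration under simplifying assumptions, not the kernel code: the class, its methods and the `set_gtt_domain` flag are invented names, the cache is modelled per index rather than per cache line, and the "eventual" write-back is triggered by hand.

```python
class ToyGemObject:
    """Toy stand-in for a GEM buffer object (hypothetical, not i915 code)."""

    def __init__(self, size):
        self.pages = ["-"] * size   # memory backing the object
        self.dirty_cache = {}       # cached CPU writes not yet in memory
        self.domain = "GTT"

    def evict(self):
        # Eviction from the GTT moves the object to the CPU domain.
        self.domain = "CPU"

    def cpu_write(self, index, value):
        # Cached CPU writes (e.g. while swapping) sit in the cache for now.
        self.dirty_cache[index] = value

    def write_back(self):
        # Dirty cached data reaches memory, overwriting what is there.
        for index, value in self.dirty_cache.items():
            self.pages[index] = value
        self.dirty_cache.clear()

    def fault_into_gtt(self, set_gtt_domain):
        # With the fix, faulting the object back in also moves it to the
        # GTT domain, which flushes the cached CPU writes first.
        if set_gtt_domain and self.domain == "CPU":
            self.write_back()
            self.domain = "GTT"

    def gtt_write(self, index, value):
        # Writes through the uncached GTT mapping go straight to memory.
        self.pages[index] = value


def render_glyph(set_gtt_domain):
    obj = ToyGemObject(4)
    obj.evict()
    obj.cpu_write(0, "stale")            # leaves a dirty cached write
    obj.fault_into_gtt(set_gtt_domain)   # GTT mapping touched again
    obj.gtt_write(0, "glyph")            # new glyph data via the GTT
    obj.write_back()                     # the "eventual" late flush
    return obj.pages[0]


if __name__ == "__main__":
    print("without fix:", render_glyph(False))
    print("with fix:   ", render_glyph(True))
```

Without the domain change, the late write-back lands after the GTT write and clobbers the new glyph with stale data. With it, the cache is flushed before the GTT write, so the glyph survives.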

In , Vytas (vytautas1987) wrote :

Sorry, I do not know how to test it. If you give detailed instructions I will test it in about a week. I do know how to compile a kernel, though.

In , Raul Sanchez Siles (rasasi78) wrote :

Vytautas:

You'd need to clone Linus's latest tree [0] once the commit is applied, build the kernel and try it.

Or, alternatively, try the drm-intel [1] kernel branch, where I see it applied.

[0] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary

[1] http://git.kernel.org/?p=linux/kernel/git/anholt/drm-intel.git;a=summary

In , Vytas (vytautas1987) wrote :

Can I use Gentoo git-sources? (http://gentoo-portage.com/sys-kernel/git-sources)
Can you post the rc number here once the patch has been applied?

In , Remi (remi) wrote :

(In reply to comment #20)
> Can I use Gentoo git-sources?
> (http://gentoo-portage.com/sys-kernel/git-sources).

Not yet. But you can just "git clone" Eric's repo from /usr/src to try it out and then remove it when you're done. You can even use "kernel-config" to make it the default kernel source directory.

(In reply to comment #17)
> Vytautas: Does the following patch queued up to for-linus in the kernel help
> you?

Eric, this patch works for me. I've tried thrashing my laptop's memory and I couldn't reproduce the bug. Looks really good.

Thanks for solving this

In , Dark-shadow (dark-shadow) wrote :

The patch solved it for me too. Thanks!

In , Dark-shadow (dark-shadow) wrote :

While the above patch indeed fixed the fonts problem,
my system also seems to suffer from the problem described
in bug #20766. Just in case anyone else has similar
issues...

In , Fut-gmx (fut-gmx) wrote :

Eric, your patch seems to fix this problem for me as well. Thanks a lot!

Dark Shadow, I also had the memory leak problem with 2.6.29. I got the impression that it's much better with 2.6.30-rc7. The number of objects (/proc/dri/0/gem_objects) is still high, but the "object bytes" aren't as high.

In , Raul Sanchez Siles (rasasi78) wrote :

I managed to apply the patch on 2.6.29.4, and it solves the problem there too. I hope it doesn't have any side effects.

Thanks.

Michael Terry (mterry) wrote :

Bryce Harrington (bryce) wrote :

Strange, I saw this too last night, but I was running the 2:2.7.99.1+git20090526.r1.8e942b70-0ubuntu0sarvatt driver, which I'd been running for several weeks without seeing this behavior. So I wonder if it might be due to something other than the driver?

Here are the things I installed/upgraded yesterday prior to noticing the regression:

2009-06-03 19:06:43 install gnuplot-nox <none> 4.2.5-2
2009-06-03 19:06:45 install libwxbase2.8-0 <none> 2.8.9.1-0ubuntu6
2009-06-03 19:06:45 install libwxgtk2.8-0 <none> 2.8.9.1-0ubuntu6
2009-06-03 19:06:47 install gnuplot-x11 <none> 4.2.5-2
2009-06-03 19:06:48 install gnuplot <none> 4.2.5-2
2009-06-03 19:06:49 install groff <none> 1.18.1.1-22build1
2009-06-03 19:06:50 install psutils <none> 1.17-26
2009-06-03 19:06:52 trigproc man-db 2.5.5-1build1 2.5.5-1build1
2009-06-03 19:07:00 startup packages configure
2009-06-03 19:07:00 configure gnuplot-nox 4.2.5-2 4.2.5-2
2009-06-03 19:07:00 configure libwxbase2.8-0 2.8.9.1-0ubuntu6 2.8.9.1-0ubuntu6
2009-06-03 19:07:01 configure libwxgtk2.8-0 2.8.9.1-0ubuntu6 2.8.9.1-0ubuntu6
2009-06-03 19:07:01 configure gnuplot-x11 4.2.5-2 4.2.5-2
2009-06-03 19:07:01 configure gnuplot 4.2.5-2 4.2.5-2
2009-06-03 19:07:01 configure groff 1.18.1.1-22build1 1.18.1.1-22build1
2009-06-03 19:07:01 configure psutils 1.17-26 1.17-26
2009-06-03 19:07:01 trigproc libc6 2.9-9ubuntu1 2.9-9ubuntu1

None of those leap out at me as an obvious candidate. psutils is the only thing close, but that should only affect PostScript docs, whereas I saw the corruption in regular HTML pages (Launchpad, in fact).

Are you able to make the issue go away by downgrading -intel to 2.7.1? (Available in x-updates PPA) If that does not make the issue go away, can you review your /var/log/dpkg.log to see if there are other packages updated in the timeframe that might be candidates?
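Reviewing /var/log/dpkg.log for a given timeframe can be scripted. The Python sketch below assumes the six-field line layout seen in the log excerpt above (date, time, action, package, old version, new version). The function name `changed_packages` and the sample constant are illustrative only.

```python
from datetime import datetime


def changed_packages(log_text, start, end, actions=("install", "upgrade")):
    """List (time, action, package, old, new) entries within [start, end].

    Only lines with six whitespace-separated fields and a matching action
    are kept, so 'startup', 'configure' and 'trigproc' lines are ignored
    by default.
    """
    changes = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[2] not in actions:
            continue
        when = datetime.strptime(fields[0] + " " + fields[1],
                                 "%Y-%m-%d %H:%M:%S")
        if start <= when <= end:
            changes.append((when, fields[2], fields[3], fields[4], fields[5]))
    return changes


# Three lines copied from the excerpt above.
SAMPLE_LOG = """\
2009-06-03 19:06:43 install gnuplot-nox <none> 4.2.5-2
2009-06-03 19:06:52 trigproc man-db 2.5.5-1build1 2.5.5-1build1
2009-06-03 19:07:00 startup packages configure
"""

if __name__ == "__main__":
    window = (datetime(2009, 6, 3, 0, 0, 0), datetime(2009, 6, 3, 23, 59, 59))
    for when, action, package, old, new in changed_packages(SAMPLE_LOG, *window):
        print(when, action, package, old, "->", new)
```

To run it against a real log, pass the contents of /var/log/dpkg.log as `log_text` and set the window to the hours before the corruption first appeared.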

Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → Medium
status: New → Triaged

Michael Terry (mterry) wrote :

A further note. Once I started seeing this, Firefox quickly got worse and worse. Almost all text became corrupted. A reboot fixed it. That machine had been running for quite some time. Maybe it's 'time based' in the sense that you're more likely to hit it the longer you are running?

I'll try downgrading.

Michael Terry (mterry) wrote :

After downgrading, I haven't seen the issue yet.

Changed in xserver-xorg-video-intel:
status: Unknown → Confirmed

Bryce Harrington (bryce) wrote :

Thanks for testing that, Michael.

I've forwarded the bug upstream, to https://bugs.freedesktop.org/show_bug.cgi?id=22111 - please subscribe to that bug in case upstream needs further information or wishes you to test something.

FWIW, the system on which I'm seeing the corruption also happens to be a 1420.

Milan Bouchet-Valat (nalimilan) wrote :

I'm seeing the same kind of symptoms too, and your observations fit mine. I can note that I've seen that behavior for at least two weeks. Jaunty, i915 GM, driver 2.7.99 from two days ago. Upstream asked me to try with kernel 2.6.30rc8, which I'm doing now.

In , Éric Piel (pieleric) wrote :

Created an attachment (id=26526)
Example of font corruption

Strangely, I'm still seeing this bug, although I'm using kernel 2.6.30-rc8 (which contains commit 07f4f3e8a24138ca2f3650723d670df25687cd05). Similarly, doing a "swapoff -a" fixes the problem.

It's with the intel driver 2.7.1, and a chipset "965GM", using KMS. Is there something else that I should update to fix the bug?

In , Carl Worth (cworth) wrote :

*** Bug 22111 has been marked as a duplicate of this bug. ***

Changed in xserver-xorg-video-intel:
status: Confirmed → Invalid

Milan Bouchet-Valat (nalimilan) wrote : Re: Text corruption with latest UXA-default Intel driver

For the record, upstream's 'invalid' comes from the fact that this bug is a duplicate of an earlier one that seems to have been fixed in kernel 2.6.30-rc8.
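For mainline kernels, checking whether a version is at or past 2.6.30-rc8 can be sketched in Python as below. The helper names `kernel_key` and `has_fix` are hypothetical. The sketch only understands mainline-style strings such as "2.6.30-rc8" or "2.6.29.4" and deliberately rejects distribution strings such as "2.6.30-7-generic", since those do not say which rc they were built from. It also cannot see patches applied by hand to an older kernel.

```python
import re

# major.minor.patch, optional stable number, optional -rcN suffix
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?$")

# A final release sorts after all of its release candidates.
_FINAL = float("inf")


def kernel_key(version):
    """Turn a mainline-style version string into a sortable tuple."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("not a mainline-style version: %r" % version)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    stable = int(match.group(4) or 0)
    rc = int(match.group(5)) if match.group(5) else _FINAL
    return (major, minor, patch, rc, stable)


def has_fix(version, first_fixed="2.6.30-rc8"):
    """True if `version` is the same as or newer than `first_fixed`."""
    return kernel_key(version) >= kernel_key(first_fixed)


if __name__ == "__main__":
    for candidate in ("2.6.29.4", "2.6.30-rc7", "2.6.30-rc8", "2.6.30"):
        print(candidate, has_fix(candidate))
```

A True result says only that the version is new enough to include the commit, not that the corruption is gone on a given machine.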

In , Carl Worth (cworth) wrote :

*** Bug 22118 has been marked as a duplicate of this bug. ***

Bryce Harrington (bryce) wrote : Re: Text corruption with latest UXA-default Intel driver (UXA bug)

Michael, any chance you could confirm the issue is gone with the -rc8 kernel?

summary: - Text corruption with latest UXA-default Intel driver
+ Text corruption with latest UXA-default Intel driver (UXA bug)
Changed in xserver-xorg-video-intel:
status: Invalid → Unknown
summary: - Text corruption with latest UXA-default Intel driver (UXA bug)
+ (Needs kernel 2.6.30-rc8) Text corruption with latest UXA-default Intel
+ driver (UXA bug)
Changed in xserver-xorg-video-intel:
status: Unknown → Fix Released

Michael Terry (mterry) wrote :

Yeah, I haven't seen it in a while. I upgraded to test and have been meaning to report back. I'd vote to close.

Bryce Harrington (bryce) wrote :

Great, thanks, will do.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Fix Released

In , Byron Clark (byron-theclarkfamily) wrote :

I'm still seeing this bug with linux 2.6.30 and intel driver 2.7.1. It does seem harder to trigger, but it still happens.

In , Byron Clark (byron-theclarkfamily) wrote :

(In reply to comment #29)
> I'm still seeing this bug with linux 2.6.30 and intel driver 2.7.1. It does
> seem harder to trigger, but it still happens.
>

I'm only seeing the corruption in firefox, but it appears that focusing a different window and then returning the focus to firefox corrects the corrupted glyphs.

In , Remi (remi) wrote :

(In reply to comment #30)
> I'm only seeing the corruption in firefox, but it appears that focusing a
> different window and then returning the focus to firefox corrects the corrupted
> glyphs.

Looks like a different bug, please file a new one so your issue gets looked at.

Thanks

Changed in xserver-xorg-video-intel:
importance: Unknown → High
Changed in xserver-xorg-video-intel:
importance: High → Unknown
Changed in xserver-xorg-video-intel:
importance: Unknown → High