
[KMS] gem objects not deallocated

Reported by Tormod Volden on 2010-04-18
This bug affects 91 people
Affects               Importance  Assigned to
xorg-server (Ubuntu)  Critical    Unassigned
Lucid                 Critical    Unassigned

Bug Description

[Problem]
Memory leak. Fix to glx 1.4 backport did not deallocate gem objects properly.

[Background]
Red Hat backported glx 1.3 and 1.4 support from xserver 1.8. These patches were taken by Debian as patches 03_fedora_glx_versioning.diff and 04_fedora_glx14-swrast.diff, and so Ubuntu took them in order to remain in sync with Debian. Other distros using xserver 1.7 have likewise adopted these backports.

Subsequent testing by Ubuntu identified an xserver crash that occurs with these patches enabled when closing Clutter apps. A partial fix based on upstream work was implemented in Ubuntu and the issue was believed solved, but further testing has shown that a slow memory leak is present. The leak causes issues such as those described below and can result in system instability after a day or two of uptime (depending on memory quantity and usage). Distros that don't include support for Clutter obviously won't see the bugs.

Following these findings, Debian has dropped the glx patches. Ubuntu is evaluating fixing the patches vs. following Debian's approach, being mindful of any userspace apps that may have come to depend on glx 1.3/1.4 functionality.

[Original Report]
There has been some buzz over the last few days about excessive swapping and OOM conditions. It seems that kernel memory use is increasing, since the user processes do not appear to grow unusually.

/sys/kernel/debug/dri/0/gem_objects shows that the GEM object bytes number is increasing. One way to reproduce this:

$ for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps --noheaders ocomm,vsz,rss $(pidof X)`; done
142376960 object bytes Xorg 25812 15372
145907712 object bytes Xorg 25812 15372
150458368 object bytes Xorg 25812 15372
154816512 object bytes Xorg 25812 15372
159244288 object bytes Xorg 25812 15372
163721216 object bytes Xorg 25812 15372
168148992 object bytes Xorg 25812 15372
172699648 object bytes Xorg 25812 15372
177152000 object bytes Xorg 25812 15372
181530624 object bytes Xorg 25812 15372

It shows that the Xorg process is not growing, but the gem objects are. Similarly, counting and summing the objects shows that gem objects add up (with refcount 2) and do not disappear again when the application closes:
 awk '/name/{ i++; s+= $4 } END{print i " " s}' /sys/kernel/debug/dri/0/gem_names
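
The reproduction loop above can be unpacked into small helpers for readability. This is only a sketch: the function names gem_object_bytes and gem_growth are illustrative and not part of any tool, the optional file argument exists so a saved copy can be inspected, and reading debugfs normally requires root.

```shell
#!/bin/sh
# Print the "object bytes" counter from a gem_objects-style file.
# Defaults to the debugfs path used in this report; the optional
# argument (an assumption for illustration) points at a saved copy.
gem_object_bytes() {
    awk '/object bytes/ { print $1; exit }' "${1:-/sys/kernel/debug/dri/0/gem_objects}"
}

# Print the difference between two readings (second minus first).
gem_growth() {
    echo $(( $2 - $1 ))
}

# Only read the live counter when debugfs is actually readable.
if [ -r /sys/kernel/debug/dri/0/gem_objects ]; then
    echo "current object bytes: $(gem_object_bytes)"
fi
```

Taking the first and last readings from the output above, `gem_growth 142376960 181530624` gives 39153664 bytes over nine eog runs, so a little over 4 MB per run.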

These issues have been seen on intel and ati, with the lucid kernel as well as the mainline 2.6.34 snapshot.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xorg 1:7.5+5ubuntu1
Uname: Linux 2.6.34-999-generic i686
Architecture: i386
Date: Sun Apr 18 15:59:21 2010
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: xorg
system:
 distro: Ubuntu
 codename: lucid
 architecture: i686
 kernel: 2.6.34-999-generic

Tormod Volden (tormodvolden) wrote :
affects: xorg (Ubuntu) → linux (Ubuntu)
summary: - gem objects not deallocated
+ [KMS] gem objects not deallocated
Tormod Volden (tormodvolden) wrote :

Restarting Xorg brings the gem object bytes down again, so I am not sure xorg-server (or driver) is not to blame.
$ cat /sys/kernel/debug/dri/0/gem_objects
156 objects
54063104 object bytes

marcinq (marcinq) wrote :

I can confirm this on my rs690 (radeon x1200).

With Kubuntu and normal use, 5 hours are enough for RAM usage to skyrocket from an initial 300-400 MB to 1.2 GB. It all ends in heavy swapping or a lockup.

No such problems with UMS.

Bryce Harrington (bryce) wrote :

Anyone happen to know when this issue first began?

Can someone try booting earlier kernels to find which are affected? (Assuming it to be a regression in the kernel)

If it sounds more like xorg-server, try booting earlier versions of it. There were a number of changes between April 15th and now which looked safe when we pulled them but would be worth ruling in or out as possibilities.

Changed in linux (Ubuntu Lucid):
importance: Undecided → Critical
William Grant (wgrant) wrote :

xorg-server -2ubuntu1 is good, -2ubuntu2 is bad. 114_dri2_make_sure_x_drawable_exists.patch is probably to blame.

Tormod Volden (tormodvolden) wrote :

In that case I think the discussion here is relevant: https://bugs.freedesktop.org/show_bug.cgi?id=26394 and the "Track DRI2 drawables as resources, not privates" thread on xorg-devel ML, with four glx commits on xserver master 2010-04-16.

William Grant (wgrant) on 2010-04-20
affects: linux (Ubuntu Lucid) → xorg-server (Ubuntu Lucid)
Vish (vish) wrote :

Downgrading to xorg-server -2ubuntu1 does prevent the memory problems.

Changed in xorg-server (Ubuntu Lucid):
status: New → Confirmed
Sebastian Martinez (tychocity) wrote :

Perhaps this is my problem too:

cat /sys/kernel/debug/dri/0/gem_objects
3640 objects
1269358592 object bytes
4 pinned
13766656 pin bytes
111054848 gtt bytes
234881024 gtt total

Tormod Volden (tormodvolden) wrote :

Running xorg-edgers xserver 1.8 now with 114_dri2_make_sure_x_drawable_exists.patch dropped (thanks Sarvatt!) and the problem cannot be reproduced.

Robert Hooker (sarvatt) wrote :

Well, there are two options here: backport the changes from xserver master that fix this instead of using the 114 patch, or drop the 2 glx 1.4 enablement patches and 114 completely. Just dropping 114 is not an option because it will regress things horribly, to the point where closing clutter apps crashes the server.

I tried my hand at backporting the 2 commits here (xorg-server - 2:1.7.6-2ubuntu7.5) -
https://edge.launchpad.net/~sarvatt/+archive/bugs/+packages

but the patches need some *serious* review and it is only compile tested at the moment. This only affects people using the glx 1.4 enablement backports to xserver 1.7.x so it's not really upstream material.

The two patches:
http://sarvatt.com/downloads/patches/119_dri2_drawables.patch
http://sarvatt.com/downloads/patches/120_glx_drop_destroywindow.patch

The slightly more sane option I see at the moment is to revert the 2 glx 1.4 enablement patches as well as the 114 patch that only mattered for things using that. I have uploaded that combination to x-updates here -

https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-updates

Robert Hooker (sarvatt) wrote :

By the way, this is the upstream bug where the 114 patch originated from

https://bugs.freedesktop.org/show_bug.cgi?id=26394

Robert Hooker (sarvatt) wrote :

And this is the bug that the 114 patch fixed, which will regress if 114 is just dropped without also dropping 03_fedora_glx_versioning.diff and 04_fedora_glx14-swrast.diff

https://bugs.edge.launchpad.net/ubuntu/+bug/550218

Vish (vish) wrote :

I tried both PPAs:
https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-updates (xorg-server - 2:1.7.6-2ubuntu7~xup)
https://edge.launchpad.net/~sarvatt/+archive/bugs/+packages (xorg-server - 2:1.7.6-2ubuntu7.5)

Both caused severe problems for me and I couldn't use the system for more than 5 minutes.
Everything would lock up at some compiz use and I could not do anything; the screen just froze, I couldn't return to a VT, and Alt+SysRq+K did not help.
Had to hard shutdown the system.
Probably not of much use, but from /var/log/messages: http://paste.ubuntu.com/419425/ http://paste.ubuntu.com/419432/
[I had to hard shutdown nearly 10 times; it seems the SAK worked a couple of times in the background, but the screen was frozen.]

Robert Hooker (sarvatt) wrote :

Yeah sorry about that, the one I uploaded to ubuntu-x-swat had local changes in it by mistake that would have made it make no difference. The fixed one is in there now (2ubuntu7~xup2)

Adam Lyall (magicmyth) wrote :

Just confirming that Robert's xorg packages (2ubuntu7~xup2) have fixed the issue I mentioned in #564636. So far no stability issues. I will report back on whether it fixes the ATI GPU issue when I get the chance to test that.

Petar Velkovski (pvelkovski) wrote :

Preliminary testing shows that Robert's xorg packages from https://launchpad.net/~ubuntu-x-swat/+archive/x-updates fix this bug for me too. (Intel graphic)

Martin Pitt (pitti) wrote :

Robert, from the current discussion it seems that it's quite safe to roll back the two glx 1.4 and the 114 patch. Personally I would rather like to see this fixed in final, since it's such a notable regression and the 114 patch was just introduced a few days ago.

I heard that the rdepends were tested how they behave wrt. rolling back GLX from 1.4 to 1.2. From a more theoretical standpoint, what does that change entail? Does it drop a few GLX features which would help performance improvements in some cases? What do client apps do if those functions are suddenly not available any more?

Martin Pitt (pitti) wrote :

I started a testing wiki page at https://wiki.ubuntu.com/X/Testing/GEMLeak

I'll send a call for testing to ubuntu-devel@.

I added the PPA, did the upgrade and rebooted, but glxinfo | grep "GLX version" still says 1.4.
Is that right? I ask because the wiki says "Please verify that glxinfo | grep "GLX version" says "1.2", not "1.4"."
I have xserver-common and xserver-xorg-core 2:1.7.6-2ubuntu7~xup2 installed.
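
The check the wiki asks for can be scripted. A minimal sketch follows; the helper names glx_version and glx_version_is are hypothetical, and the only assumption about glxinfo is the "GLX version:" line format quoted in this thread.

```shell
#!/bin/sh
# Extract the value of the "GLX version:" line from glxinfo output on stdin.
# The lowercase "server glx version string" lines are deliberately not matched.
glx_version() {
    awk -F': *' '/^GLX version:/ { print $2; exit }'
}

# Succeed only if the version read from stdin equals the expected one
# (1.2 by default, the value the testing wiki asks for).
glx_version_is() {
    [ "$(glx_version)" = "${1:-1.2}" ]
}

# Query the running server only when glxinfo is installed.
if command -v glxinfo >/dev/null 2>&1; then
    if glxinfo 2>/dev/null | glx_version_is 1.2; then
        echo "GLX version matches 1.2"
    else
        echo "GLX version differs from 1.2"
    fi
fi
```

Feeding saved glxinfo output through the same helpers, for example `glx_version < saved-glxinfo.txt`, makes it easy to compare reports from different machines.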

Timo Jyrinki (timo-jyrinki) wrote :

As a small note, applications requiring GLX 1.4 generally do not start if the 1.4-specific extensions are not available. I never got to studying which are such applications, even though it crossed my mind. Probably games, some professional proprietary applications etc. Quite a few applications depend on those, since GLX 1.4 is ca. 10 years old, even though the free software 3D stack hasn't supported it.

Erick Brunzell (lbsolost) wrote :

I caught wind of this at the forums:

http://ubuntuforums.org/showthread.php?p=9154355&posted=1#post9154355

And as I said there, "Please excuse me for being a pain, but being visually impaired I can sometimes overlook the obvious. What my blind old self can't see is how to add myself and my machine to 'testing' here":

https://wiki.ubuntu.com/X/Testing/GEMLeak

So I'll just be using this bug report in the interim.

I can tell you that I'm using Intel 82945G/GZ Integrated Graphics and <glxinfo | grep "GLX version"> produces
"GLX version: 1.4" in an UNAFFECTED fresh install from 04/01/2010, but it DOES affect an upgrade from Karmic to Lucid that was performed less than 48 hours ago!

I'll leave this well working install alone and start testing the other.

Uptime on the non-affected install previously mentioned:

lance@lance-desktop:~$ uptime
 13:02:16 up 4:34, 2 users, load average: 0.34, 0.60, 0.49

13+ hours UP and no problem with "GLX version: 1.4"!

The other had been UP not nearly as long!

I'll stay available!

Erick Brunzell (lbsolost) wrote :

Something possibly helpful, or not: since I'm visually impaired, one of the first things I do is right-click the desktop, adjust fonts, etc., and in particular DISABLE the 3D stuff. That is, even if enabled by default, if Visual Effects shows "Normal", I change it to "None".

That's how the unaffected Lucid desktop was prior to booting into the affected one. I tried to change it back to "Normal" and it wouldn't!

I booted into the affected one and checked. It is also set to "None" and I think I'll leave it alone and boot into one of the installs I did during iso testing to try the "patch/revert". That way I'll still be able to gather info from the one that slowed to a crawl and the one that seems unaffected.

Someone much smarter than I am would have to tell me what info to gather, and how to gather it.

Vish (vish) wrote :

Using 2ubuntu7~xup2, I'm not having any of the memory problems I had reported earlier in bug #563400 (thanks Sarvatt!)
This is on an ATI X1400 Mobility radeon [RV515]

Luka (luka-mrovlje-gmail) wrote :

I'd like to report that I do not notice this regression. My packages are up to date. I do not use the proposed PPA to downgrade GLX to 1.2.

My video adapter:
VGA compatible controller: Intel Corporation Mobile GME965/GLE960 Integrated Graphics Controller (rev 0c)

os: lucid lynx 10.04 32bit Desktop which was upgraded from karmic.

important parts of glxinfo report:
direct rendering: Yes
server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4

Running compiz, my uptime is about 4 days and I use suspend, which works great. There are no speed regressions to report. In fact this is the first time that I am able to play OpenGL games like Nexuiz and Penumbra rather smoothly. I do experience an occasional xorg restart, but only when playing Penumbra, and that is probably not related to this bug.

I am sure that doesn't help much those affected, but I hope someone will have some use from a good report as well.

Erick Brunzell (lbsolost) wrote :

OK, while installing packages on my "third" Lucid so I can just use it, I browsed the apt history files of the others to see if that might shed some light. It did to me, but it may not be helpful to you.

On my "main" Lucid (the one NOT affected) I'd let update mangler remove "compiz" and "compiz-gnome" on 04/09/2010 (obviously because I don't use them anyway) so that's why it was NOT affected.

Of course we don't want everyone to remove compiz so I'll keep running Lucid #3 with the reversion/patch + desktop effects enabled and report back tomorrow.

Andy (andy-xillean) wrote :

I am running Lucid in Virtualbox with Compiz enabled. Host Machine is Nvidia. I just ran all the updates
then rebooted and checked my version

ubuntu@lucid-test:~$ glxinfo|grep "version"
server glx version string: 1.2 Chromium
client glx version string: 1.2 Chromium
GLX version: 1.3
OpenGL version string: 2.0 Chromium 1.9

I ran the update manager again and it said the system is up to date. I am not noticing any problems
But shouldn't it be showing 1.4 instead of 1.3 ?

W00ster (svein-brostigen) wrote :

Running Lenovo ThinkPad T400, model 6475ZN2 with the Xserver from https://launchpad.net/~ubuntu-x-swat/+archive/x-updates.

After rebooting and starting the new X server, running some 1080p video in full screen and some other applications that normally tax the X server, GLX and memory quite hard, the number of bytes used by GEM objects has dropped to a more normal level:
1198 objects
126042112 object bytes
6 pinned
16838656 pin bytes
79060992 gtt bytes
234881024 gtt total

The number of object bytes used to be larger by a factor of 10.

Bryce Harrington (bryce) on 2010-04-21
description: updated
Bryce Harrington (bryce) on 2010-04-21
description: updated
Bryce Harrington (bryce) on 2010-04-21
description: updated
Kees Cook (kees) on 2010-04-21
description: updated

On Wed, Apr 21, 2010 at 09:19:49PM -0000, Andy wrote:
> I am running Lucid in Virtualbox with Compiz enabled. Host Machine is Nvidia. I just ran all the updates
> then rebooted and checked my version

The -nvidia binary driver includes its own GLX library, so this bug is
completely irrelevant in that case. Virtualbox also includes its own
video driver, although dunno what it does for glx. In any case, your
configuration does not sound like one that requires being tested, but
thanks for the feedback.

discord (colin.williams) wrote :

I've been running the beta for a couple of months,

00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)

and I haven't been affected either. I tried the script above to test, but the script didn't work. Seems there's no bug on the 945 and 965 chipsets.

discord (colin.williams) wrote :

whoops, thought I wasn't affected, but after running a clip through vlc for half an hour, my memory usage started to increase sharply. I'm not sure this has always been a problem since I've been running testing for quite awhile and watched dvds without issue a couple of times..

Conn O Griofa (psyke83) wrote :

Unfortunately, this proposed X server update has not resolved the problem on my system. I'm using a stock Ubuntu Lucid installation, compiz enabled, and with the proposed X server update.

conn@nx9010:~$ lspci | grep VGA
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon IGP 330M/340M/350M

conn@nx9010:~$ glxinfo | grep "GLX version"
GLX version: 1.2

After approximately one hour uptime (watching a Flash video [1]), I noticed the system becoming sluggish, especially scrolling pages in Firefox. This is the output which I captured shortly after noticing the sluggishness had begun:

conn@nx9010:~$ pid=`pidof X` ; for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps ocomm,vsz,rss $pid |grep X`; done
290811904 object bytes Xorg 32120 24264
292790272 object bytes Xorg 32120 24264
294481920 object bytes Xorg 32120 24264
295972864 object bytes Xorg 32120 24264
297984000 object bytes Xorg 32120 24264
302874624 object bytes Xorg 32120 24264
304979968 object bytes Xorg 32888 24840
306962432 object bytes Xorg 32120 24264
308514816 object bytes Xorg 32120 24264
310161408 object bytes Xorg 32120 24264

There also seems to be something strange with the GEM pinned/gtt counts:

conn@nx9010:~$ cat /sys/kernel/debug/dri/0/gem_objects
1212 objects
308838400 object bytes
0 pinned
0 pin bytes
0 gtt bytes
0 gtt total

I was aware of the memory leak in KMS for some time, but only discovered this bug report today. Aside from disabling KMS, the only way in which I was able to stop this memory leak was to use a combination of the xorg-edgers packages and the mainline kernel 2.6.34-rc5.

I am unsure which specific component eliminated the problem, but the problem disappeared only after installing kernel 2.6.34-rc5 [2] (which may or may not be coincidence, as an update in one of the xorg-edgers packages may have really solved the issue). I will now test the xorg-edgers packages against the official 2.6.32-21-generic kernel to try to isolate the problem.

If you require any more information or a separate bug report filed, please let me know.

[1] http://www.southparkstudios.com/episodes/103922/
[2] http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-rc5-lucid/

Conn O Griofa (psyke83) wrote :

Update on comment #33:

I managed to have a quick word with David Airlie, and he told me that the radeon driver does not report the pinned/pin/gtt values, so that's not an issue.

Here are some results from kernel 2.6.34-rc5 with the latest xorg-edgers packages [1]:

conn@nx9010:~$ glxinfo | grep "GLX version"
GLX version: 1.4

conn@nx9010:~$ pid=`pidof X` ; for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps ocomm,vsz,rss $pid |grep X`; done
317861888 object bytes Xorg 23524 15108
319795200 object bytes Xorg 23524 15120
321794048 object bytes Xorg 23524 15120
323530752 object bytes Xorg 24292 15696
328589312 object bytes Xorg 24292 15696
329527296 object bytes Xorg 24292 15696
329826304 object bytes Xorg 23524 15120
331714560 object bytes Xorg 23524 15128
333737984 object bytes Xorg 23524 15168
335695872 object bytes Xorg 23524 15168

The results are almost identical to comment #33 - however, I am experiencing absolutely no slowdown or sluggishness despite the same high object count. Perhaps the bug is still present, but the driver or X server is more resistant to the GEM leak, somehow.

[1] current xorg-edgers X server version: 2:1.7.6.901+git20100413+server-1.7-nominations.e7ab6537-0ubuntu0sarvatt3. Upon inspecting the source, this build uses the same patches as mentioned in this bug's description, which probably means that the memory leak is to be expected. Why the slowdown does not occur is still a mystery, though.

Conn O Griofa (psyke83) wrote :

Using kernel 2.6.32-21-generic and xorg-edgers packages:

After just 10 minutes uptime the slowdown has occurred, with a lower object bytes count than before. Here is the output at the point in which the slowdown became noticeable:

conn@nx9010:~$ pid=`pidof X` ; for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps ocomm,vsz,rss $pid |grep X`; done
156647424 object bytes Xorg 26912 18948
158511104 object bytes Xorg 26912 18948
163356672 object bytes Xorg 27680 19524
162840576 object bytes Xorg 27680 19524
164474880 object bytes Xorg 26912 18948
166895616 object bytes Xorg 21808 13844
168878080 object bytes Xorg 21808 13844
170754048 object bytes Xorg 21808 13844
172756992 object bytes Xorg 21808 13844
174747648 object bytes Xorg 21808 13844

Conclusion: in the case of my particular system, the memory leak is occurring in all cases (with or without the X server update or xorg-edgers packages), but the slowdown can be avoided by using the 2.6.34-rc5 kernel.

I have just one more combination to test: kernel 2.6.34-rc5 + stock drivers (i.e., no xorg-edgers packages) + the proposed X server update. After that, I promise to stop spamming this bug ;).

Conn O Griofa (psyke83) wrote :

Final test: kernel 2.6.34-rc5 + stock drivers + proposed X server update.

X server uptime: approximately 1 hour.

conn@nx9010:~$ glxinfo | grep "GLX version"
GLX version: 1.2

conn@nx9010:~$ pid=`pidof X` ; for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps ocomm,vsz,rss $pid |grep X`; done
372658176 object bytes Xorg 32400 19844
374898688 object bytes Xorg 32400 19900
376623104 object bytes Xorg 32400 19840
378945536 object bytes Xorg 32400 19800
380846080 object bytes Xorg 32400 19712
382578688 object bytes Xorg 32400 19712
384847872 object bytes Xorg 32400 19712
386646016 object bytes Xorg 32400 19712
388882432 object bytes Xorg 32400 19712
390582272 object bytes Xorg 32400 18480

As you can see, the leak is still occurring. After reaching ~300MB in object bytes, the computer began to slow down due to excessive I/O and disk swapping, so applications were slower to load and respond; however, in periods of calm I/O activity, graphics performance is absolutely fine. I see no graphical slowdown whatsoever!

To summarise:
Kernel 2.6.32-21-generic + proposed X server from this bug report = memory leak, graphical slowdown
Kernel 2.6.32-21-generic + xorg-edgers packages = memory leak, graphical slowdown
Kernel 2.6.34-rc5 + proposed X server from this bug report = memory leak, no discernible graphical slowdown
Kernel 2.6.34-rc5 + xorg-edgers package = memory leak, no discernible graphical slowdown

In other words:
a) I get a memory leak in all cases, with or without the patched X server;
b) the newer kernel eliminates any graphical performance impact from the memory leak.

I'm not sure in what direction to proceed; since it seems that most people are seeing the bug fixed, perhaps this is a separate leak that's not related to 03_fedora_glx_versioning.diff and 04_fedora_glx14-swrast.diff. Shall I file a new bug, and if so, against what component?

Petar Velkovski (pvelkovski) wrote :

Conn O Griofa, I believe you are misinterpreting the results (or the test is not a reliable indication of the memory leak). This is how I know that the patch works. I have a System Monitor active in my top panel. Right-click the System Monitor and select Memory in Monitored Resources (if not already selected). Below, in the Colors option, choose "Memory" and set 5 different colours for User, Shared, Buffers, Cached and Free. What interests you is the colour for "Cached" memory. Start using your computer. Open Firefox, Office, play a movie and you'll notice that the "Cached" memory increases. Once your Cached memory takes up a significant amount of RAM, try executing these commands:

sync
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"

This should clear the cached memory (not entirely, but I suppose this is normal behaviour). For instance, after executing these commands on my system, the System Monitor says that 27% of my computer's memory is used by programs and 18% is used as cache (it was 31% used by programs and 63% used as cache before the execution). And my system uses the xorg package from https://launchpad.net/~ubuntu-x-swat/+archive/x-updates, which solves the memory leak problem for me. If there is a memory leak, the previous commands will be able to release only a small portion of the cached memory. So even if they manage to release half of the cached memory on the first try, this is still an indication that there is a memory leak.
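
For anyone without the System Monitor applet, the same "Cached" figure can be read from /proc/meminfo. This is a sketch of an alternative, not the method described above; the helper name cached_kb is made up for illustration.

```shell
#!/bin/sh
# Print the "Cached:" value in kB from a meminfo-style file.
# Defaults to /proc/meminfo; "SwapCached:" is a different field and is
# not matched because the first column is compared exactly.
cached_kb() {
    awk '$1 == "Cached:" { print $2; exit }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "Cached: $(cached_kb) kB"
fi
```

Running it once before and once after the sync and drop_caches commands shows how much of the cache was actually released.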

Also keep in mind that the system automatically drops the cache memory from time to time. So try this:
1. Open one or more applications using a lot of memory (OpenOffice Writer, Firefox, Gimp etc.)
2. Execute:
cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
3. Wait for a few seconds (5, 10, 15 sec)
4. Repeat steps 2 and 3 a few times
5. Close the application(s) using a lot of memory
6. Do steps 2 and 3 again (one or two times)

If the "object bytes" number NEVER decreases while following the procedure above, then you do have a memory leak. If the number oscillates (for example goes up, down, up, up, down) then you are fine and there should be no memory leak.
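
The "never decreases" rule can be applied mechanically to a list of readings. A rough sketch, assuming one reading per line; the function name ever_decreases is hypothetical, and a short series that only rises is a hint rather than proof of a leak.

```shell
#!/bin/sh
# Read one "object bytes" value per line on stdin.
# Exit status 0 if any reading is lower than the one before it, 1 otherwise.
ever_decreases() {
    awk 'NR > 1 && ($1 + 0) < prev { found = 1 }
         { prev = $1 + 0 }
         END { exit (found ? 0 : 1) }'
}

# The ten readings from the original report only ever go up.
if printf '%s\n' 142376960 145907712 150458368 154816512 159244288 \
                 163721216 168148992 172699648 177152000 181530624 | ever_decreases
then
    echo "object bytes decreased at least once"
else
    echo "object bytes never decreased"
fi
```

With the readings from the original report this prints "object bytes never decreased", which matches the leak described there.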

Conn O Griofa (psyke83) wrote :

Petar Velkovski,

Thanks for the suggestion, but on my system - even if I close all applications on the running X server - the GEM "object bytes" value *never* decreases. Freeing the pagecache, dentries and inodes (as you suggest) makes no difference at any point.

The only way to reduce the ever-increasing GEM allocation is to restart the running X server. At the conclusion of all my tests (comments #33-36), I always stopped GDM, switched to a VT, and checked the values in /sys/kernel/debug/dri/0/gem_objects. With no X server running, the object bytes allocation is usually reduced to ~30MB.

Let me also clarify something: In comments #33-36, I neglected to mention that by "patched X server", I was referring to version 2:1.7.6-2ubuntu7~xup2 from the X-Updates PPA.

I have subsequently tried to test version 2:1.7.6-2ubuntu7.5 from https://edge.launchpad.net/~sarvatt/+archive/bugs and noted the following:

GDM starts up and I can enter my login credentials, but I get stuck at the purple wallpaper with no other graphical elements loading. The pointer is responsive and I can switch to a VT (which I do). At this point I check the kernel and Xorg.0.log, but see no errors or useful output from the tail sections. When I try to switch from the VT back to the running server, my system freezes and I am forced to power cycle the system.

It seems to me that the 2ubuntu7.5 package with the backported fixes has the bigger potential for regressions. As for my particular system, I guess that my last recourse is to test a vanilla 1.8 server to see if that fixes the memory leak.

Timo Jyrinki (timo-jyrinki) wrote :

I think there is a slight possibility of mixing multiple slowdown bugs here. If you experience slowdown and suspect it might not all be because of GEM objects, please glance at bug #555595 which was reported before xorg-server ubuntu2 with the 114 patch went in, and claims that 2.6.32-19 kernel introduced a slowdown that was not there in 2.6.32-18, on intel graphics. And I seem to have even something else which feels like sluggish X.org, namely that ondemand on my computer seems to set the CPU speed to 800MHz on lucid. So if you experience slowdowns of X, you may be interested in not only checking this bug report but checking your CPU clock speed when CPU load is there, and maybe pondering about that other bug which relates to a potential kernel regression with intel.

Please report these kernel regression possibilities and possible CPU speed setting failures to the bug #555595 instead of here. The CPU thing probably needs another place as well, but better in that bug report where I've already messed about than this GLX / GEM bug report.

Loïc Minier (lool) wrote :

So I have 2:1.7.6-2ubuntu7~xup2 packages from ppa:ubuntu-x-swat/x-updates and it worked fine in the beginning and seemed to help with the problem, but now I'm seeing 60%+ CPU (of one of the two cores) being used in the Xorg process; laptop gets hot.

Nothing in xorg.log, nothing in .xsession-errors.

Loïc Minier (lool) wrote :

I stupidly tried stracing Xorg from a xterm on top of Xorg tsss :)

What's the proper way to look into the CPU consumption issue next time I hit it? (just froze my laptop and had to reboot because of the strace, so CPU is back to normal ATM)

Loïc Minier (lool) wrote :

BTW otherwise things seemed to be stable WRT memory consumption, but only ran it for some hours, might not be enough.

Just as feedback: with the xup2 packages I have no memory leak. Compiz performance is OK. Before using the packages from the PPA I had choppy performance while watching videos in fullscreen. This has gone with the updated packages and it runs smoothly now.

Jcink (jcink2k) wrote :

For whatever it's worth, even if nothing at all, I've had the test setup in VMware Workstation with 3D acceleration turned on, host card is an ATI Radeon HD 4850. It seems okay, I let it go for around the past 15 hours and memory usage is fine. I realize VMs do their own things sometimes though. I'll keep testing it anyway to see if I can make anything happen.

Sebastian Martinez (tychocity) wrote :

I am testing this package now.

Linuxexperte (andrea-koeth) wrote :

hello people,
I just heard of this today. So I just want to know if I am also affected or not.
I put this command into the Terminal: cat /sys/kernel/debug/dri/0/gem_objects

The output is this:
cat /sys/kernel/debug/dri/0/gem_objects
1688 objects
241090560 object bytes
3 pinned
13705216 pin bytes
118263808 gtt bytes
260308992 gtt total
andrea@andrea-desktop:~$

So my question would be: am I affected too or not??
My graphic-chip is: Intel Corporation 82945G/GZ Integrated Graphics Controller (rev 02)

Can anybody give me an answer, and tell me whether it would be necessary for me to install these new packages??

Greetings
Karmicbastler

Martin Pitt (pitti) wrote :

Linuxexperte [2010-04-22 15:38 -0000]:
> 241090560 object bytes

This is quite a lot, but in the end it's really noticeable if you are
affected -- after a few hours your system becomes totally sluggish and
feels like a tar pit. On my system, shutdown took about a minute, too.

> Can anybody give me an answer and if it would be necessary for me to
> install these new packages??

If for nothing else, it's interesting to know that they do not cause
regressions.

Conn O Griofa (psyke83) wrote :

Martin,

Although the initial feedback from most people seems to indicate that the proposed update fixes the issue, I haven't seen anybody post output that would positively confirm the leak as being fixed (in other words, that the GEM object byte allocation is actually reducing when applications are closed).

Is it possible that the leak is still present with the patched server? The reason why I'm asking is that my system has just 768MB of RAM, and depending on my usage pattern it can take several hours for me to recognize the sluggishness (typically when the GEM allocation reaches ~300MB or so). Users with larger amounts of RAM may not notice any problems until much later, perhaps.

Robert Hooker (sarvatt) wrote :

Linuxexperte [2010-04-22 15:38 -0000]:
> 241090560 object bytes

229MB is a perfectly reasonable amount to expect there and not indicative of a leak. If, on the other hand, you see it at 1GB+ after a few more hours of uptime, then you know you have problems.
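
For reference, the conversion from the raw counter to megabytes is a plain integer division. A small sketch, with bytes_to_mib as a made-up helper name:

```shell
#!/bin/sh
# Convert a byte count to whole MiB (integer division, rounds down).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

bytes_to_mib 241090560     # the reading quoted above: prints 229
bytes_to_mib 1269358592    # the reading Sebastian Martinez posted: prints 1210
```

The second value is the kind of 1GB+ figure that points to a problem.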

As a side note, the x-updates packages also fix another major issue, where clutter apps fail to load under swrast [1]. It affects a large number of people, and to me it reinforces the assertion that dropping the GLX 1.4 backports is the correct thing to do at this point. Note that all KMS drivers are already able to have a client GLX version of 1.4 regardless of this; the change just lowers the server's reported GLX version back to 1.2. Proprietary drivers are unaffected because they don't use the server's glx anyway.

[1] https://bugs.edge.launchpad.net/ubuntu/+source/gnome-games/+bug/561734

Petar Velkovski (pvelkovski) wrote :

Conn O Griofa,

petar@aurora:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
160043008 object bytes
petar@aurora:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
160456704 object bytes
petar@aurora:~$ oowriter &
[1] 21973
petar@aurora:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
144465920 object bytes
petar@aurora:~$ oo
oobase oocalc oodraw ooffice oofromtemplate ooimpress oomath ooweb oowriter
petar@aurora:~$ oocalc &
[1] 22013
petar@aurora:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
179482624 object bytes
[1]+ Done oocalc
petar@aurora:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
144240640 object bytes

Does this positively confirm the leak as being fixed?

Petar Velkovski (pvelkovski) wrote :

Here is another one that might be clearer:

petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:08:31 CEST 2010
164847616 object bytes
petar@aurora:~$ oowriter &
[1] 2189
petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:08:44 CEST 2010
165933056 object bytes
petar@aurora:~$ oocalc &
[2] 2216
[1] Done oowriter
petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:09:06 CEST 2010
[2]+ Done oocalc
201986048 object bytes
petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:09:17 CEST 2010
187908096 object bytes
petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:09:30 CEST 2010
192741376 object bytes
petar@aurora:~$ date ; cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
Thu Apr 22 20:09:36 CEST 2010
174059520 object bytes

So the values are:
164847616
165933056 (increase)
201986048 (increase)
187908096 (decrease)
192741376 (increase)
174059520 (decrease)

Good enough?
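
The manual date-and-cat routine above could be scripted roughly as follows. This is only a sketch: the function names are invented for illustration, and it assumes the older gem_objects format shown in this thread, where one line reads "<number> object bytes". Reading the debugfs file may require root on some systems.

```shell
# Hypothetical helper: print only the "object bytes" number from a
# gem_objects-style file (older format: "<number> object bytes").
object_bytes() {
    awk '/object bytes/ { print $1 }' "$1"
}

# Hypothetical helper: print COUNT timestamped samples, INTERVAL seconds apart.
# Arguments: file, count, interval.
sample_object_bytes() {
    file=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date '+%H:%M:%S') $(object_bytes "$file")"
        i=$((i + 1))
        # Do not sleep after the last sample.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, sample_object_bytes /sys/kernel/debug/dri/0/gem_objects 6 10 would take six samples ten seconds apart, which is enough time to open and close an application in between.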

Bryce Harrington (bryce) wrote :

Bug https://bugs.launchpad.net/bugs/565903 likewise suggests a downgrade to glx 1.2 would resolve some corruption issues.

amano (jyaku) wrote :

Well, downgrading to glx 1.2 seems to be the best idea then. Only very few apps seem to be affected and people that use binary blobs are not affected at all. On the other hand the downgrade fixes some corruption issues.

Fabio Albieri (chareos) wrote :

I see the same behaviour in Kubuntu with KWin effects enabled.

Question:
In the event of a downgrade to glx 1.2, how much performance loss should we expect on lower-end Intel GMA netbooks?

Robert Hooker (sarvatt) wrote :

>Question:
>In the event of a downgrade to glx 1.2, how much performance loss should we expect on lower-end Intel GMA netbooks?

None

Michał Gołębiowski (mgol) wrote :

I observe the same behaviour in Karmic:

$ for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps --noheaders ocomm,vsz,rss $(pidof X)`; done
374431744 object bytes Xorg 250080 96968 Xorg 163408 15644
393375744 object bytes Xorg 250192 97144 Xorg 163408 15644
398163968 object bytes Xorg 250208 97160 Xorg 163408 15644
402931712 object bytes Xorg 250228 97180 Xorg 163408 15644
407707648 object bytes Xorg 250264 97288 Xorg 163408 15644
412442624 object bytes Xorg 250264 97288 Xorg 163408 15644
417333248 object bytes Xorg 250280 97296 Xorg 163408 15644
421969920 object bytes Xorg 250308 97364 Xorg 163408 15644
426786816 object bytes Xorg 250720 97576 Xorg 163408 15644
431591424 object bytes Xorg 250720 97576 Xorg 163408 15644

Michał Gołębiowski (mgol) wrote :

I forgot to mention: I have Compiz enabled.

description: updated
Petar Velkovski (pvelkovski) wrote :

Michał Gołębiowski, the test you've done shows nothing by itself. Please read my previous posts!

Conn O Griofa (psyke83) wrote :

Petar Velkovski,

What you did in comment #51 is just another variation of the testcase in the bug description that others have been performing (and that you insist does not represent a memory leak).

Your output differs from mine in that the GEM allocation increases *and* decreases. I have never seen the GEM object bytes allocation decrease on my system (as I said, even when I close every application and leave nothing but an empty GNOME desktop and panel running); only stopping/restarting the server will reduce the allocation.

I can understand that the object bytes may level out at a certain size, but on my system it grows out of control. Once it hits ~300MB (of a system with the integrated graphics set to 128MB, and with just 768MB ram in total), the system becomes unusable due to I/O and swapping. It is clearly a memory leak.
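
The distinction between the two patterns discussed here can be checked mechanically. The sketch below uses an invented helper name and only tells you whether a series of readings ever goes down. A series that never drops is consistent with a leak but does not prove one by itself, since the samples may simply have been taken while applications were still open.

```shell
# Hypothetical helper: report whether a series of samples ever goes down.
# Prints "drops" if any sample is smaller than the one before it,
# otherwise "never drops".
trend() {
    prev=""
    for value in "$@"; do
        if [ -n "$prev" ] && [ "$value" -lt "$prev" ]; then
            echo "drops"
            return 0
        fi
        prev=$value
    done
    echo "never drops"
}

# Values from Petar's timestamped test earlier in this thread
trend 164847616 165933056 201986048 187908096 192741376 174059520   # prints "drops"
# First five values from the Karmic run earlier in this thread
trend 374431744 393375744 398163968 402931712 407707648             # prints "never drops"
```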

Robert Hooker (sarvatt) on 2010-04-23
Changed in xorg-server (Ubuntu Lucid):
status: Confirmed → Fix Committed
Bilal Akhtar (bilalakhtar) wrote :

Is this bug fixed? I am waiting for this package to come in the repos.

Tomáš Myšík (gapon) on 2010-04-23
Changed in xorg-server (Ubuntu Lucid):
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed

Works for me(tm)

00:02.0 VGA compatible controller: Intel Corporation Mobile 945GME Express Integrated Graphics Controller (rev 03)

Also fixes netbook-launcher startup

Petar Velkovski (pvelkovski) wrote :

Conn O Griofa, I never said that the patch works at your computer.

Read what you've written: "Although the initial feedback from most people seems to indicate that the proposed update fixes the issue, I haven't seen anybody post output that would positively confirm the leak as being fixed"

I just posted what you demanded.

fgh (ghfan) on 2010-04-23
Changed in xorg-server (Ubuntu Lucid):
status: Fix Committed → Fix Released
status: Fix Released → Fix Committed
alex (alex-monika) wrote :

I updated a couple of hours ago. This bug is gone, but now Xorg sometimes crashes and drops me back to the kdm login screen. That did not happen before.

Petar Velkovski (pvelkovski) wrote :

alex, you should open a new bug report (even if you believe that the patch mentioned here caused your problem).
Open a console and type this:
ubuntu-bug xserver-xorg

There are certain attachments you should send when reporting problems like this (this list is taken from http://fedoraproject.org/wiki/Xorg/Debugging):

    * All of the X server log file(s): /var/log/Xorg.*.log
    * If you use a xorg.conf, please include it in the bug report, otherwise, please specify in the bug report that it does not exist. Usually this would be located at /etc/X11/xorg.conf, but see the xorg.conf manpage - man xorg.conf - for other standard locations.
    * /var/log/Xorg.0.log from a trial run where you move your xorg.conf aside and let Xorg autodetect your hardware (if you have such a file).
    * /var/log/dmesg (please add drm.debug=15 as a boot parameter and reboot), especially in case of crashes and when using KMS.
    * Contents of /var/log/gdm/, only if there is nothing interesting in /var/log/Xorg.*.log and the dmesg output

but I believe that the command above will do that for you automatically.

You can also give some additional information, such as whether the crashes happen when you use certain applications. I hope this information is of some use to you.

Petar Velkovski (pvelkovski) wrote :

alex, I forgot to mention this in my previous post, but you should probably mention this bug report in your new bug report, and also come back here and tell us the number of the new report in case someone else following this bug has the same problem as you.

I have a dual-core processor (CPU1 and CPU2). One of the two cores is always at 100% usage, but never both at the same time. When CPU1 goes to 100%, CPU2 drops to about 44%; then when CPU2 rises to 100%, CPU1 drops to about 47%.

Is this also related to this Xorg bug?

The NVIDIA non-free driver is enabled on this PC. Visual effects are set to None, but one of the two CPU cores is still at 100% usage.

Conn O Griofa (psyke83) wrote :

Petar,

You clearly implied that I was misinterpreting my results in the first sentence of comment #37. In comment #59, I acknowledged that your system appears not to have the same memory leak issue; as I said, your allocation increases /and/ decreases, unlike my own. I asked for evidence simply because I was concerned that some people may have prematurely reported the bug as fixed (thankfully however, it seems that my issue is a corner-case). I am not interested in an argument, so don't try to create a mountain out of a molehill.

Regardless, in light of the 1.8 server being uploaded to xorg-edgers (version 2:1.8.0+git20100422+server-1.8-branch.5455df65-0ubuntu0sarvatt2), I gave the newest packages another try.

Unfortunately, compiz still freezes on my system with the 1.8 server. Although I'm not sure if it matters, considering that the leak in this report is related to GLX/OpenGL, I did enable metacity compositing (which uses XRender), and after 5 hours uptime I can see the "object bytes" allocation remaining stable at ~43MB. The system is very responsive, and closing applications has the expected effect of decreasing the GEM allocation.

Petar Velkovski (pvelkovski) wrote :

Is there any reason for xserver-xorg not being updated in the official repository, or is it that the mirror server I'm getting updates from hasn't been updated yet?

Linuxexperte (andrea-koeth) wrote :

Hi people,
I wanted to give you an update with my terminal output from a few minutes ago.

Here is the output:
andrea@andrea-desktop:~$ cat /sys/kernel/debug/dri/0/gem_objects | grep 'object bytes'
458080256 object bytes
andrea@andrea-desktop:~$ cat /sys/kernel/debug/dri/0/gem_objects
2037 objects
458878976 object bytes
3 pinned
13705216 pin bytes
165658624 gtt bytes
260308992 gtt total
andrea@andrea-desktop:~$

This shows that the figures are still rising somehow. I have already updated my system to the latest packages.
But what surprises me is that my system does not freeze even after about three or four hours of uptime. System speed only drops noticeably when I use, for example, Opera, Gimp and a gaming application at the same time.
RAM on my system is 1015MB and the system uses at the moment 5844MB
CPU is 2x Genuine Intel(R) CPU@1.6GHz
Greetings
Linuxexperte

Petar Velkovski (pvelkovski) wrote :

Linuxexperte, is this with the xserver-xorg packages from the official Lucid repository or with the xserver-xorg packages from https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-updates?

Alexander Bürger (acfb) wrote :

While reading this bug report I looked at /sys/kernel/debug/dri/0/gem_objects on my own computer. I got:

7681 objects
-1325842432 object bytes
6 pinned
12771328 pin bytes
108036096 gtt bytes
201326592 gtt total

I do not believe that the negative number is correct, but I am unable to tell if this is only a display problem.

Some info about my computer:
uname -a
Linux xenon 2.6.32-21-generic #32-Ubuntu SMP Fri Apr 16 08:09:38 UTC 2010 x86_64 GNU/Linux

dpkg -l xserver-xorg* | grep ^.[ih]
ii xserver-xorg 1:7.5+5ubuntu1 the X.Org X server
ii xserver-xorg-core 2:1.7.6-2ubuntu5 Xorg X server - core server
ii xserver-xorg-input-evdev 1:2.3.2-5ubuntu1 X.Org X server -- evdev input driver
ii xserver-xorg-input-mouse 1:1.5.0-1 X.Org X server -- mouse input driver
ii xserver-xorg-input-synaptics 1.2.2-1ubuntu4 Synaptics TouchPad driver for X.Org server
ii xserver-xorg-video-intel 2:2.9.1-3ubuntu4 X.Org X server -- Intel i8xx, i9xx display d
ii xserver-xorg-video-vesa 1:2.3.0-1ubuntu1 X.Org X server -- VESA display driver

lspci | grep Graph
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)

uptime (includes 3-4 times hibernating)
23:16:58 up 1 day, 13:55, 4 users, load average: 0.36, 0.44, 0.36

system was installed from the beta2 iso and updated afterwards

Petar Velkovski (pvelkovski) wrote :

Alexander Bürger, you are hit by the bug. The number is negative because "object bytes" apparently uses a signed integer to store its value (I came to this conclusion on my own, so sorry if that is not correct). Because there is a leak and the number never decreases, it eventually overflows (the maximum signed 32-bit integer value is +2,147,483,647). It seems that the patch proposed here is not present in the official Ubuntu repositories yet. You should add the PPA from https://edge.launchpad.net/~ubuntu-x-swat/+archive/x-updates and test if it works for you.
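
If the signed 32-bit guess above is right, and if the counter has wrapped around only once, the real allocation can be estimated by adding 2^32 to the negative reading. The sketch below does that; the helper name is invented, and both conditions are assumptions rather than something confirmed in this report.

```shell
# Hypothetical helper: reinterpret a negative signed 32-bit reading as unsigned.
# 4294967296 is 2^32. Assumes the counter wrapped exactly once and that the
# shell has 64-bit arithmetic (bash and dash on common platforms do).
to_unsigned32() {
    if [ "$1" -lt 0 ]; then
        echo $(( $1 + 4294967296 ))
    else
        echo "$1"
    fi
}

to_unsigned32 -1325842432   # prints 2969124864
```

Under those assumptions, the reading of -1325842432 would correspond to roughly 2.8GB of GEM objects.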

Robert Hooker (sarvatt) wrote :

The fix is released now. Thanks for all of the testing, everyone. If you still have problems, please file a new bug about it with "ubuntu-bug xorg" so your logs can be examined more closely.

xorg-server (2:1.7.6-2ubuntu7) lucid; urgency=low

  * Drop 117_fix_crash_with_createglyphset.patch
    - Dupe of patch 110
  * Drop 03_fedora_glx_versioning.diff, 04_fedora_glx14-swrast.diff
    - These patches were brought in by Debian to provide glx 1.4 support
      which Fedora backported from xserver 1.8, however testing in
      Ubuntu showed they caused a crash when closing Clutter apps (#550218),
      and graphics corruption when opening windows. Dropping these patches
      returns us to GLX 1.2, which has been found to be stable; Debian has
      also dropped these two patches.
      (Fixes #565903).
  * Drop 114_dri2_make_sure_x_drawable_exists.patch
    - This was an early attempt by upstream which fixed the aforementioned
      Clutter crash, but which introduced a memory leak.
      (Fixes #565981)

Date: Thu, 22 Apr 2010 17:24:38 -0700

Changed in xorg-server (Ubuntu Lucid):
status: Fix Committed → Fix Released
fgh (ghfan) on 2010-04-24
Changed in xorg-server (Ubuntu Lucid):
status: Fix Released → Fix Committed
Robert Hooker (sarvatt) on 2010-04-24
Changed in xorg-server (Ubuntu Lucid):
status: Fix Committed → Fix Released
VasiaUVI (vasiauvi) wrote :

Hello,
I think I am also affected by this bug although I have GLX ver 1.2. Here are some outputs:

vasia@vasia-laptop:~$ grep "object bytes" /sys/kernel/debug/dri/0/gem_objects
78479360 object bytes
vasia@vasia-laptop:~$ glxinfo | grep "GLX version"
GLX version: 1.2
vasia@vasia-laptop:~$ uname -a
Linux vasia-laptop 2.6.32-21-generic-pae #32-Ubuntu SMP Fri Apr 16 09:39:35 UTC 2010 i686 GNU/Linux
vasia@vasia-laptop:~$ cat /sys/kernel/debug/dri/0/gem_objects
984 objects
79396864 object bytes
0 pinned
0 pin bytes
0 gtt bytes
0 gtt total
vasia@vasia-laptop:~$ lspci | grep VGA
01:05.0 VGA compatible controller: ATI Technologies Inc RC410 [Radeon Xpress 200M]

The thing is that I don't have to wait an hour or so for my system to become very slow; the system is slow right after logging in to Ubuntu Lucid!
What could be the problem? My Ubuntu system is now, I can say, useless!
Thanks!

Philip Muškovac (yofel) wrote :

VasiaUVI: if you have glx 1.2 then you are NOT affected by this bug, please file a new bug with 'ubuntu-bug xorg'

yoda2031 (yoda2031-hotmail) wrote :

chris@chris-desktop:~$ glxinfo | grep "glx version"
server glx version string: 1.4
client glx version string: 1.4
chris@chris-desktop:~$ X -version

X.Org X Server 1.7.6
Release Date: 2010-03-17
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.24-25-server i686 Ubuntu
Current Operating System: Linux chris-desktop 2.6.32-21-generic #32-Ubuntu SMP Fri Apr 16 08:10:02 UTC 2010 i686
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32-21-generic root=UUID=31bc6c69-2094-4837-89bd-fc9c6210c794 ro quiet splash
Build Date: 23 April 2010 05:11:50PM
xorg-server 2:1.7.6-2ubuntu7 (Bryce Harrington <email address hidden>)
Current version of pixman: 0.16.4
 Before reporting problems, check http://wiki.x.org
 to make sure that you have the latest version.
chris@chris-desktop:~$ free
             total used free shared buffers cached
Mem: 1801340 1759156 42184 0 8848 1337520
-/+ buffers/cache: 412788 1388552
Swap: 4883720 4268 4879452

Is it just me, or does something not add up there? (using the 'fixed' package, yet still using GLX 1.4, and RAM usage is still unearthly high)

Using nvidia non-free driver version 195

Changed in xorg-server (Ubuntu Lucid):
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
James King (jlking3) on 2010-04-26
Changed in xorg-server (Ubuntu Lucid):
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
madbiologist (me-again) wrote :

Now that this bug has been tracked down and corrected, can we consider reinstating the performance enhancement patches that were dropped in xserver-xorg-video-ati 1:6.13.0-1ubuntu5 as an attempt to fix this bug? Or do they depend on GLX 1.4?

See https://launchpad.net/ubuntu/+source/xserver-xorg-video-ati and bug #564181 and bug #563400 for details.

Sorry! Newb misclick!

Changed in xorg-server (Ubuntu Lucid):
status: Fix Released → In Progress
status: In Progress → Fix Released
Cat (bower820) on 2010-04-28
Changed in xorg-server (Ubuntu Lucid):
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released
eMcE (emce) wrote :

Is this bug now officially fixed, or not yet?

Bortnyák Roland (antivirtel) wrote :

I think not yet :S:S

David Hardstone (dhardstone) wrote :

Yes, this bug has been fixed.

Rakosi Alpar (ralpyka) wrote :

I installed the final version of Lucid. I have 1GB of RAM. Normal memory usage is about 30%, but after 2 hours it is now at 60%. Everything is closed, and when I look at the top applications there is nothing that could account for this memory usage.

direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.2
GLX version: 1.2

cat /sys/kernel/debug/dri/0/gem_objects
1754 objects
572997632 object bytes
0 pinned
0 pin bytes
0 gtt bytes
0 gtt total

My graphics card is an ATI9250, and I use the open source ati driver.
So is my problem related to this bug? Is this bug fixed?

Haggai Eran (haggai-eran) wrote :

Hi,
I'm using the final version of Lucid on an Asus Eee PC 1005HA with intel 945 graphics card. I don't usually use compiz, but I think I've encountered a similar situation to what this bug describes with vinagre. After using vinagre to access a vnc server for a couple of hours, the system really slows down. I get the following from /sys/kernel/debug/dri/0/gem_objects:
867 objects
2117353472 object bytes
3 pinned
36143104 pin bytes
185061376 gtt bytes
260308992 gtt total

If I close vinagre, it immediately drops down to:
572 objects
136286208 object bytes
3 pinned
36143104 pin bytes
87183360 gtt bytes
260308992 gtt total

$ glxinfo | grep "GLX version"
GLX version: 1.2

This particular bug is *definitely* fixed. It's entirely possible that
there are other, unrelated, memory issues in the server, or specific
drivers. These should be filed as separate bugs - although memory
issues often end up being problems in the apps using X :).

> If I close vinagre, it immediately drops down to:
...
If the “object bytes” count drops after closing applications then you
are not seeing this bug.

In fact, if you're using the Ubuntu 10.04 final release, you're not
seeing this bug. We know the cause, and the code responsible is no
longer in our X server.

Roberto Gordo Saez (rgs) wrote :

The "object bytes" count does not drop for me; it keeps growing and growing on a fully updated Ubuntu 10.04. Closing applications never lowers the count. Maybe it is not exactly the same bug, but the symptoms look exactly like the original.

The 2GB of RAM and most of the swap fill up after around 5-7 days of uptime; swapping becomes extreme and the machine must be rebooted. I have an ATI Radeon RV280 with the free driver and KMS enabled (Xorg crashes when KMS is disabled, which is a different bug).
GLX version: 1.2

Can I do something to debug this?

Please file a new bug!

Roberto Gordo Saez (rgs) wrote :

It is already reported and marked as a duplicate of this bug

rafael.pino (rafael-pino) wrote :

Greetings to everyone. I still have the memory leak problem, but only when compiz is on. I have an ATI Radeon HD 3200 in an HP dv5-1022la. If compiz is off, memory usage is around 300MB, but when compiz is on, memory usage rises to 900MB.

David Tombs (dgtombs) wrote :

Rafael: As you can see above, this bug has been fixed. Please report a new bug.

Matt (mhhennig) wrote :

This still seems to be a problem in recent kernel versions (KMS enabled)!

Oddly, with kernel 2.6.33, everything seems fine (lucid + ppa kde and all xorg variants, including xorg-edgers on a Thinkpad T500, i915).

But with every version I have tried from 2.6.34 up to 2.6.36-rc4, there is clearly a bad memory leak that makes the machine unusable within several hours (watching movies in particular makes it worse).

This slabtop output shows what happens after less than two hours or so of normal activity (switching desktops, moving windows etc.):

 OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
211200 211200 100% 0.02K 825 256 3300K kmalloc-16

It'd be great if anyone who knows more about the state of this problem could comment or post a solution if available...

FAJALOU (fajalou) wrote :

I second Matt...but with a few differences.
I am running Linux lrc-laptop 2.6.32-26-generic #48-Ubuntu SMP Wed Nov 24 09:00:03 UTC 2010 i686 GNU/Linux
Lenovo thinkpad sl410, with an intel graphics card.

After a few suspends, /usr/bin/X :0 vt7 -nr -nolisten tcp -auth /var/run/xauth/A:0-26HBac will begin to just eat up my memory, bringing my computer to a standstill in the end and forcing a reboot.

FAJALOU (fajalou) wrote :

Any thoughts? Help?

Matt (mhhennig) wrote :

As far as I can see, this happens only for certain kernel configurations. I am currently running a custom 2.6.36, and everything seems fine. But a while ago, when I did some experiments with kernel options, I found cases where the leak was there. I did not report this because I could not find out what caused the problem.

madbiologist (me-again) wrote :

@Rakosi, Roberto, and FAJALOU - if you are using a laptop you may be encountering bug #569273.

Kaushik (c-a-subramaniam) wrote :

Should I be seeing this issue now in Oneiric?
I have filed a bug here and want to check if this is a duplicate.
https://bugs.launchpad.net/ubuntu/+bug/985169

BR
Subbu

Kaushik (c-a-subramaniam) wrote :

I have rebooted my desktop since I posted the last log.
The latest values are:

glxinfo | grep "GLX version"
GLX version: 1.4

cat /sys/kernel/debug/dri/0/i915_gem_objects
655 objects, 657907712 bytes
326 [306] objects, 96661504 [73461760] bytes in gtt
  15 [12] active objects, 17940480 [6397952] bytes
  8 [8] pinned objects, 5607424 [5607424] bytes
  303 [286] inactive objects, 73113600 [61456384] bytes
  0 [0] freed objects, 0 [0] bytes
9 pinned mappable objects, 11898880 bytes
176 fault mappable objects, 7675904 bytes
2147479552 [268435456] gtt total

BR
Subbu
