[Problem]
Memory leak: the fix applied to the glx 1.4 backport does not deallocate GEM objects properly.
[Background]
Red Hat backported glx 1.3 and 1.4 support from xserver 1.8. These patches were taken by Debian as patches 03_fedora_glx_versioning.diff and 04_fedora_glx14-swrast.diff, and so Ubuntu took them in order to remain in sync with Debian. Other distros using xserver 1.7 have likewise adopted these backports.
Subsequent testing by Ubuntu identified an xserver crash that occurs when closing Clutter apps with these patches enabled. A partial fix based on upstream work was implemented in Ubuntu, and the issue was believed solved. Further testing has shown that a slow memory leak is still present. It causes the symptoms described below and can lead to system instability after a day or two of uptime, depending on the amount of memory and how it is used. Distros that don't include support for Clutter won't see these bugs.
Following these findings, Debian has dropped the glx patches. Ubuntu is evaluating fixing the patches vs. following Debian's approach, being mindful of any userspace apps that may have come to depend on glx 1.3/1.4 functionality.
[Original Report]
There has been some buzz over the last few days about excessive swapping and OOM conditions. Kernel memory use appears to be increasing, since the user processes do not seem to grow unusually.
/sys/kernel/debug/dri/0/gem_objects shows that the GEM object bytes count is increasing. One way to reproduce this:
$ for t in `seq 1 10`; do eog /usr/share/backgrounds ; echo `grep "object bytes" /sys/kernel/debug/dri/0/gem_objects` `ps --noheaders ocomm,vsz,rss $(pidof X)`; done
142376960 object bytes Xorg 25812 15372
145907712 object bytes Xorg 25812 15372
150458368 object bytes Xorg 25812 15372
154816512 object bytes Xorg 25812 15372
159244288 object bytes Xorg 25812 15372
163721216 object bytes Xorg 25812 15372
168148992 object bytes Xorg 25812 15372
172699648 object bytes Xorg 25812 15372
177152000 object bytes Xorg 25812 15372
181530624 object bytes Xorg 25812 15372
The output shows that the Xorg process is not growing, but the GEM object bytes are, by roughly 4 MB per eog launch. Similarly, counting and summing the named objects shows that GEM objects (with refcount 2) accumulate and do not disappear when the application closes:
awk '/name/{ i++; s+= $4 } END{print i " " s}' /sys/kernel/debug/dri/0/gem_names
These issues have been seen on intel and ati, with the lucid kernel as well as the mainline 2.6.34 snapshot.
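To put a number on the growth seen in the reproduction loop above, the samples can be piped through a small helper. This is only a sketch: the function name gem_deltas is hypothetical, and it assumes each input line starts with the GEM object bytes value, as in the loop output.

```shell
# gem_deltas (hypothetical helper): read samples on stdin, one per line,
# where the first field is the GEM object bytes count. Print the growth
# between consecutive samples, then the mean growth in bytes.
gem_deltas() {
    awk '
        NR > 1 { d = $1 - prev; print d; total += d; n++ }
        { prev = $1 }
        END { if (n > 0) printf "mean %d\n", total / n }
    '
}
```

Fed the ten samples listed above, this reports a mean growth of 4350407 bytes per eog launch, with the Xorg vsz and rss columns unchanged throughout.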
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xorg 1:7.5+5ubuntu1
Uname: Linux 2.6.34-999-generic i686
Architecture: i386
Date: Sun Apr 18 15:59:21 2010
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: xorg
system:
distro: Ubuntu
codename: lucid
architecture: i686
kernel: 2.6.34-999-generic
Restarting Xorg brings the GEM object bytes down again, so I cannot rule out xorg-server (or the driver) as the cause.
$ cat /sys/kernel/debug/dri/0/gem_objects
156 objects
54063104 object bytes
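For comparing the counter before and after an Xorg restart, something like the following could be used. It is a rough sketch: the function names are hypothetical, the file layout is assumed to match the two lines shown above, and reading debugfs normally requires root.

```shell
# gem_object_bytes [FILE] (hypothetical helper): print the "object bytes"
# count from a gem_objects-style file. Defaults to the path used in this
# report. Assumes a line of the form "<number> object bytes".
gem_object_bytes() {
    awk '/object bytes/ { print $1; exit }' "${1:-/sys/kernel/debug/dri/0/gem_objects}"
}

# gem_growth BEFORE AFTER (hypothetical helper): print AFTER minus BEFORE.
gem_growth() {
    echo $(( $2 - $1 ))
}
```

For example, take before=$(gem_object_bytes) right after restarting Xorg, run the reproduction loop, take after=$(gem_object_bytes), and print gem_growth "$before" "$after".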