I've done some testing with OpenSUSE 11.1 Beta5 (the KDE4 LiveCD) and identified some performance issues. I haven't done comparisons with XAA yet, but instead compared performance against my standard development build. This currently consists of:
Linux 2.6.28-rc4 (from git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel for-review branch)
X server 1.5.99.1 (recent master)
xf86-video-intel 2.4.97 (recent master)
For testing I took x11perf with a selected set of tests (eliminating all of the uninteresting core rendering tests such as stippled fills, wide lines, ellipses, and arcs). I ran these tests against my "master" builds and then the same tests on the same hardware after booting the OpenSUSE live CD.
Finally, I manually scanned the results looking for cases where the performance differed by 2x or more. Below is a sorted list of the differences, showing the "master" performance followed by the OpenSUSE performance for each test. Also, for each test the relative performance is quantified (where "slowdown" means that the OpenSUSE performance is slower than the "master" performance---note that in two cases there is actually a speedup instead).
The next step would be to do profiling of some of the slowest tests, or perhaps to switch out one or more components to see what's contributing to the performance difference. Any contribution to those efforts from anyone would be most appreciated---as would any verification of these test results, or similar testing with XAA.
The copywinwin test is likely the most fundamental, and it is perhaps at the root of several of the other slowdowns.
Here's an x11perf command line that quickly obtains results for just the tests that seem interesting:
x11perf -repeat 2 \
-aatrap1 -aatrap10 -aatrap2x1 -aatrap2x10 \
-aa10text -aa24text -rgb10text -rgb24text \
-scroll10 -scroll100 \
-copywinwin10 -copywinwin100 \
-copypixwin10 -copypixwin100 \
-putimage10 -putimage100 \
-shmput10 -shmput100 \
-getimage100 -getimage500 \
-compwinwin10 -compwinwin100 \
-comppixwin10 -comppixwin100
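For anyone wanting to repeat the comparison, the rate has to be pulled out of each line of x11perf output first. Below is a minimal Python sketch of that step. It assumes report lines of the form `<count> reps @ <time> msec (<rate>/sec): <description>`, with a `trep` summary line per test when `-repeat` is used. The helper name `parse_rates` and the sample numbers are hypothetical, chosen only for illustration.

```python
import re

# Assumed x11perf report line shape, e.g.:
#   " 700000 reps @   0.0073 msec (137000.0/sec): Copy 10x10 from window to window"
# With -repeat N, a summary line using "trep" instead of "reps" follows the runs.
LINE_RE = re.compile(
    r"^\s*\d+\s+(reps|trep)\s+@\s+[\d.]+\s+msec\s+\(\s*([\d.]+)/sec\):\s+(.*)$"
)


def parse_rates(text):
    """Return {test description: rate per second}.

    A "trep" summary line wins over the individual "reps" lines for a test;
    without a summary, the last "reps" line seen is kept.
    """
    rates = {}
    summarized = set()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip banners, blank lines, and anything unrecognized
        kind, rate, name = match.group(1), float(match.group(2)), match.group(3).strip()
        if kind == "trep":
            rates[name] = rate
            summarized.add(name)
        elif name not in summarized:
            rates[name] = rate
    return rates


# Hypothetical sample output for one test run with -repeat 2.
sample = """\
 700000 reps @   0.0074 msec (136000.0/sec): Copy 10x10 from window to window
 700000 reps @   0.0072 msec (138000.0/sec): Copy 10x10 from window to window
1400000 trep @   0.0073 msec (137000.0/sec): Copy 10x10 from window to window
"""

print(parse_rates(sample))
```

Running this once on the output from each system gives two dictionaries keyed by the same test descriptions, which can then be compared entry by entry.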
And here are the results I obtained:
Test              master            OpenSUSE          Relative performance
-aatrap2x1        316000.0/sec      70900.0/sec       44.5x slowdown
-putimage10       126000.0/sec       6280.0/sec       20.1x slowdown
-copywinwin10     137000.0/sec       7570.0/sec       18.1x slowdown
-compwinwin10     125000.0/sec       7520.0/sec       16.6x slowdown
-comppixwin10     124000.0/sec       8360.0/sec       14.8x slowdown
-copypixwin10     125000.0/sec      10000.0/sec       12.5x slowdown
-scroll10         139000.0/sec      11500.0/sec       12.1x slowdown
-shmput10         112000.0/sec      10700.0/sec       10.5x slowdown
-getimage100        1350.0/sec       6240.0/sec        4.6x speedup (!)
-putimage100        9420.0/sec       2260.0/sec        4.2x slowdown
-aatrap1          325000.0/sec      79700.0/sec        4.1x slowdown
-aa24text          57700.0/sec      14600.0/sec        4.0x slowdown
-shmput100         14900.0/sec       4270.0/sec        3.5x slowdown
-aatrap10          89800.0/sec      25400.0/sec        3.5x slowdown
-copywinwin100     19100.0/sec       5750.0/sec        3.3x slowdown
-compwinwin100     19100.0/sec       5730.0/sec        3.3x slowdown
-aa10text          85100.0/sec      26200.0/sec        3.2x slowdown
-rgb10text         76200.0/sec      24900.0/sec        3.0x slowdown
-comppixwin100     18200.0/sec       6280.0/sec        2.9x slowdown
-scroll100         19000.0/sec       6850.0/sec        2.8x slowdown
-aatrap2x10        97000.0/sec      35200.0/sec        2.8x slowdown
-rgb24text         45200.0/sec      16800.0/sec        2.7x slowdown
-copypixwin100     18400.0/sec       6750.0/sec        2.7x slowdown
-shmput500          1070.0/sec        420.0/sec        2.5x slowdown
-putimage500         394.0/sec        161.0/sec        2.4x slowdown
-getimage500          55.8/sec        106.0/sec        1.9x speedup (!)
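The relative-performance figures are just the ratio of the two rates, with the larger rate on top so the factor is always at least 1. The following Python sketch shows one way the scan-and-sort step could be automated rather than done by hand. The function names and the `threshold` parameter are my own, not part of x11perf, and the four sample entries are copied from the results in this post.

```python
def relative_performance(master, distro):
    """Return (factor, label) comparing the distro rate against master.

    factor is always >= 1; label says which direction the difference goes.
    """
    if master >= distro:
        return master / distro, "slowdown"
    return distro / master, "speedup"


def significant_differences(results, threshold=2.0):
    """Keep tests whose rates differ by at least `threshold`, largest first."""
    rows = []
    for test, (master, distro) in results.items():
        factor, label = relative_performance(master, distro)
        if factor >= threshold:
            rows.append((test, factor, label))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# (master rate, OpenSUSE rate) in operations per second.
results = {
    "copywinwin10": (137000.0, 7570.0),
    "getimage100": (1350.0, 6240.0),
    "putimage10": (126000.0, 6280.0),
    "getimage500": (55.8, 106.0),
}

for test, factor, label in significant_differences(results):
    print(f"-{test}: {factor:.1f}x {label}")
```

With the default cutoff of 2.0 this prints putimage10, copywinwin10, and getimage100 in that order. The getimage500 ratio comes to just under 2, so it only appears if `threshold` is lowered slightly.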