[fglrx] Using fglrx causes excessive hardware interrupts and an extremely slow system

Bug #206337 reported by Jeff Balderson
Affects: fglrx-installer (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

On an up-to-date Hardy install, when I enable the FGLRX driver using the restricted drivers manager and reboot, either X or FGLRX cause an excessive number of hardware interrupts on my system whenever it's the active console:

jb2@loki:~$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in   cs  us sy id wa
 4  0      0 572108  10032 227532    0    0   155    30  22797  126  95  3  1  1
 1  0      0 572100  10032 227532    0    0     0     0      2   81 100  0  0  0
 1  0      0 572100  10032 227532    0    0     0     0     24  146 100  0  0  0
 1  0      0 572100  10040 227532    0    0     0    16      7   92 100  0  0  0
 1  0      0 572100  10040 227532    0    0     0     0     25  145 100  0  0  0
 1  0      0 572100  10040 227532    0    0     0     0  55805  247  98  2  0  0
 2  0      0 572100  10040 227532    0    0     0     0 152106   90 100  0  0  0
 3  0      0 572100  10040 227532    0    0     0     0 147587  122 100  0  0  0
 1  0      0 572100  10040 227532    0    0     0     0 151499  176  99  1  0  0
 1  0      0 572100  10040 227532    0    0     0     0 152357   25 100  0  0  0
 1  0      0 572100  10040 227532    0    0     0     0 152022   81 100  0  0  0
 1  0      0 572100  10052 227532    0    0     0    40 152170  107 100  0  0  0
 1  0      0 572100  10052 227532    0    0     0     0  80430  199 100  0  0  0
 1  0      0 572100  10052 227532    0    0     0     0      1   79 100  0  0  0
 1  0      0 572100  10052 227532    0    0     0     0     24  131 100  0  0  0
 1  0      0 572100  10052 227532    0    0     0     0      1   77 100  0  0  0

Around second 6, I switched from tty1 to tty7 (X server), and back to tty1 around second 13.

You can see the number of hardware interrupts jump from well under 100/s to roughly 150,000/s whenever X is on the active console. Performance is also very noticeably degraded. I've compared this against another system with a slightly older, but still respectable, 3D card, and its "in" field never exceeds 1000.
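
For anyone who wants to check their own machine against the numbers above, here is a minimal sketch (not part of the original report) that pulls the "in" column out of captured `vmstat 1` output and flags the samples above a threshold. The function names are made up for illustration, and the default threshold of 1000 is simply the ceiling quoted for the comparison system.

```python
def interrupt_rates(vmstat_text):
    """Return the per-second interrupt counts (the "in" column) from `vmstat 1` output."""
    rates = []
    in_col = None
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "in" in fields and "cs" in fields:
            # Column-name row: remember where the "in" column sits.
            in_col = fields.index("in")
            continue
        if in_col is not None and len(fields) > in_col and fields[in_col].isdigit():
            rates.append(int(fields[in_col]))
    return rates


def storm_samples(rates, threshold=1000):
    """Return (sample_index, rate) pairs whose rate exceeds the threshold."""
    return [(i, r) for i, r in enumerate(rates) if r > threshold]


# A few rows copied from the vmstat output in this report.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 1 0 0 572100 10040 227532 0 0 0 0 25 145 100 0 0 0
 1 0 0 572100 10040 227532 0 0 0 0 55805 247 98 2 0 0
 2 0 0 572100 10040 227532 0 0 0 0 152106 90 100 0 0 0
 1 0 0 572100 10052 227532 0 0 0 0 1 79 100 0 0 0
"""

if __name__ == "__main__":
    for index, rate in storm_samples(interrupt_rates(SAMPLE)):
        print(f"sample {index}: {rate} interrupts/s")
```

Locating the column from the header row rather than hard-coding its position keeps the sketch tolerant of vmstat versions that print extra columns.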

The VESA driver doesn't experience this problem, but has other significant issues (like the color map gets all messed up when you switch to a text console and back).

I've tried various boot options (noapic, pci=routeirq, irqpoll and others) without any improvement. With some of them, the system locks up hard and needs a hard reset.

I've also seen the same problem with this exact same system using Gutsy and Envy-installed FGLRX drivers.

Revision history for this message
Jeff Balderson (jbalders) wrote :

I should add that the system referenced above appears to be otherwise functional. Top reports that the system is constantly 60% busy in "%hi" when X is on the active console. The rest of the system feels like it's 60% busy or worse, as if it were running a 500 MHz Celeron instead of the actual 2.8 GHz Celeron D. Compiz works and 3D performance is fair, but not as fast as I'd expect, and definitely slower than the Nvidia TI4400 I have in a similarly equipped machine.

Bryce Harrington (bryce)
Changed in linux-restricted-modules-2.6.24:
importance: Undecided → High
status: New → Triaged
Revision history for this message
Bryce Harrington (bryce) wrote :

Hmm, it sounds like it's doing software rendering. Please try adding this to your xorg.conf:

Section "ServerFlags"
Option "AIGLX" "off"
EndSection

Revision history for this message
Jeff Balderson (jbalders) wrote :

There's no change. If anything, my system seems less responsive than before. The excessive interrupts are still occurring. AIGLX was disabled, as /var/log/Xorg.?.log went from:

$ fgrep AIGLX /var/log/Xorg.1.log
(==) AIGLX enabled
(WW) AIGLX: 3D driver claims to not support visual 0x23
(WW) AIGLX: 3D driver claims to not support visual 0x24
(WW) AIGLX: 3D driver claims to not support visual 0x25
(WW) AIGLX: 3D driver claims to not support visual 0x26
(WW) AIGLX: 3D driver claims to not support visual 0x27
(WW) AIGLX: 3D driver claims to not support visual 0x28
(WW) AIGLX: 3D driver claims to not support visual 0x29
(WW) AIGLX: 3D driver claims to not support visual 0x2a
(WW) AIGLX: 3D driver claims to not support visual 0x2b
(WW) AIGLX: 3D driver claims to not support visual 0x2c
(WW) AIGLX: 3D driver claims to not support visual 0x2d
(WW) AIGLX: 3D driver claims to not support visual 0x2e
(WW) AIGLX: 3D driver claims to not support visual 0x2f
(WW) AIGLX: 3D driver claims to not support visual 0x30
(WW) AIGLX: 3D driver claims to not support visual 0x31
(WW) AIGLX: 3D driver claims to not support visual 0x32
(WW) AIGLX: 3D driver claims to not support visual 0x33
(WW) AIGLX: 3D driver claims to not support visual 0x34
(WW) AIGLX: 3D driver claims to not support visual 0x35
(WW) AIGLX: 3D driver claims to not support visual 0x36
(WW) AIGLX: 3D driver claims to not support visual 0x37
(WW) AIGLX: 3D driver claims to not support visual 0x38
(WW) AIGLX: 3D driver claims to not support visual 0x39
(WW) AIGLX: 3D driver claims to not support visual 0x3a
(WW) AIGLX: 3D driver claims to not support visual 0x3b
(WW) AIGLX: 3D driver claims to not support visual 0x3c
(WW) AIGLX: 3D driver claims to not support visual 0x3d
(WW) AIGLX: 3D driver claims to not support visual 0x3e
(WW) AIGLX: 3D driver claims to not support visual 0x3f
(WW) AIGLX: 3D driver claims to not support visual 0x40
(WW) AIGLX: 3D driver claims to not support visual 0x41
(WW) AIGLX: 3D driver claims to not support visual 0x42
(WW) AIGLX: 3D driver claims to not support visual 0x43
(WW) AIGLX: 3D driver claims to not support visual 0x44
(WW) AIGLX: 3D driver claims to not support visual 0x45
(WW) AIGLX: 3D driver claims to not support visual 0x46
(WW) AIGLX: 3D driver claims to not support visual 0x47
(WW) AIGLX: 3D driver claims to not support visual 0x48
(WW) AIGLX: 3D driver claims to not support visual 0x49
(WW) AIGLX: 3D driver claims to not support visual 0x4a
(WW) AIGLX: 3D driver claims to not support visual 0x4b
(WW) AIGLX: 3D driver claims to not support visual 0x4c
(WW) AIGLX: 3D driver claims to not support visual 0x4d
(WW) AIGLX: 3D driver claims to not support visual 0x4e
(WW) AIGLX: 3D driver claims to not support visual 0x4f
(WW) AIGLX: 3D driver claims to not support visual 0x50
(WW) AIGLX: 3D driver claims to not support visual 0x51
(WW) AIGLX: 3D driver claims to not support visual 0x52
(WW) AIGLX: 3D driver claims to not support visual 0x53
(WW) AIGLX: 3D driver claims to not support visual 0x54
(WW) AIGLX: 3D driver claims to not support visual 0x55
(WW) AIGLX: 3D driver claims to not support visual 0x56
(WW) AIGLX: 3D driver claims to not support visual 0x5...


Revision history for this message
The Fiddler (stapostol) wrote :

Can't confirm on an X1950 Pro (PCI Express, 256MB), Core 2 E6750 running in x86_64 mode.

Interrupt counts look normal:

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free    buff   cache   si   so    bi    bo   in   cs us sy  id wa
 0  0     16 906716 1680108 2579516    0    0    14    36   14  306  1  1  97  0
 0  0     16 906488 1680108 2579512    0    0     0     0   19  325  1  0  99  0
 0  0     16 906504 1680108 2579512    0    0     0     0    0  257  0  1  99  0
 0  0     16 906504 1680108 2579512    0    0     0     0   15  295  0  0 100  0
 0  0     16 906504 1680108 2579512    0    0     0     0    0  282  0  0 100  0
 0  0     16 906504 1680120 2579500    0    0     0   216   31  312  0  0 100  0
 0  0     16 906504 1680120 2579512    0    0     0     0    0  253  0  1  99  0
 0  0     16 906504 1680120 2579512    0    0     0     0   15  295  0  0 100  0
 0  0     16 906504 1680120 2579512    0    0     0     0    0  259  1  0  98  0
 1  0     16 906504 1680120 2579512    0    0     0     0   16  320  2  0  98  0
 1  0     16 906504 1680120 2579512    0    0     0     0    0  268  2  0  98  0
 0  0     16 906504 1680120 2579512    0    0     0     0   16  307  2  0  99  0
 0  0     16 906504 1680120 2579512    0    0     0     0    0  263  0  0 100  0
 0  0     16 906504 1680120 2579512    0    0     0     0   17  316  0  0 100  0
 0  0     16 906504 1680120 2579512    0    0     0     0    6  238  1  1  99  0
 0  0     16 906504 1680120 2579512    0    0     0     0   20  313  0  0 100  0
 0  0     16 906504 1680120 2579512    0    0     0     0    1  274  1  1  99  0
 0  0     16 906504 1680128 2579504    0    0     0    12   23  290  0  0 100  0
 0  0     16 906504 1680128 2579512    0    0     0     0    6  264  0  0 100  0
 0  0     16 906504 1680128 2579512    0    0     0     0   23  328  1  0  99  0
 0  0     16 906504 1680128 2579512    0    0     0     0    0  254  1  0  99  0

Revision history for this message
Jeff Balderson (jbalders) wrote :

It looks like that might not be a good comparison:

1) AGP vs PCIe
2) 32-bit vs 64-bit

Unless those two things matter less than they appear to.

Revision history for this message
allartk (allartk) wrote :

I have a similar, possibly identical, issue on an X1300 in a PCIe slot. I see the following repeating block in Xorg.0.log:

Received Interrupt event message:
... dwIRQSource: 20008000
... dwIRQCounter: 5
... dwIRQEnableId: 0000000000000008
... pvKernelEvent: (nil)
... pvEventHandle: (nil)
... dwContextData: 00000000
... ullStartTime: 3578707608l
... ullEndTime: 582l
Received Interrupt event message:
... dwIRQSource: 20008000
... dwIRQCounter: 6
... dwIRQEnableId: 0000000000000008
... pvKernelEvent: (nil)
... pvEventHandle: (nil)
... dwContextData: 00000000
... ullStartTime: 4131610152l
... ullEndTime: 2110022l

Each time this message appears, I notice a short freeze of the screen and a CPU peak. The message repeats very often. The CPU peak only occurs if atieventsd is running.
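
To judge how often these blocks actually appear, the log can be tallied rather than eyeballed. The following is a rough sketch, assuming the blocks always look like the excerpt above (a "Received Interrupt event message:" line followed by "... name: value" lines); `irq_events` is a made-up helper name, not anything provided by fglrx or X.

```python
import re
from collections import Counter

FIELD = re.compile(r"\.\.\.\s*(\w+):\s*(\S+)")


def irq_events(log_text):
    """Collect each 'Received Interrupt event message' block as a dict of its fields."""
    events = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Received Interrupt event message"):
            current = {}
            events.append(current)
        elif current is not None:
            match = FIELD.match(line)
            if match:
                current[match.group(1)] = match.group(2)
            else:
                # Any other line ends the current block.
                current = None
    return events


# Shortened copy of the excerpt quoted in this comment.
SAMPLE_LOG = """\
Received Interrupt event message:
... dwIRQSource: 20008000
... dwIRQCounter: 5
... dwIRQEnableId: 0000000000000008
Received Interrupt event message:
... dwIRQSource: 20008000
... dwIRQCounter: 6
... dwIRQEnableId: 0000000000000008
"""

if __name__ == "__main__":
    events = irq_events(SAMPLE_LOG)
    print(len(events), "events")
    print(Counter(e.get("dwIRQSource") for e in events))
```

Grouping by dwIRQSource shows whether a single source accounts for all of the messages, as it does in the excerpt.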

Revision history for this message
allartk (allartk) wrote :

After disabling the internal video card in my BIOS, it seems to be solved. Although that device didn't have any output connected, and the PCIe card was my first device, they were somehow interfering.

Revision history for this message
allartk (allartk) wrote :

Hmm, too soon...

Revision history for this message
Jeff Balderson (jbalders) wrote :

My original system was a 32-bit Hardy running on 64-bit capable hardware, upgraded from Gutsy.

I did a fresh 32-bit install of Hardy, fully updated -- no change.
I did a fresh 64-bit install of Hardy, fully updated -- no change.

I read something recently about people running a 32-bit OS on 64-bit hardware and running into issues, which is why I decided to do the fresh installs.

The problem remains without any notable change: something is still generating about 150K interrupts per second (based on output from 'vmstat 1'), normal system performance and 2D performance are excruciatingly slow, and 3D performance is fair but slower than expected. If I disable the FGLRX driver, interrupts, 2D performance and system performance go back to their expected levels, but obviously I lose the benefits of direct rendering.

All this confirms is that I can replicate the results with both 32-bit and 64-bit Hardy, and that there are no legacy issues from my Gutsy->Hardy upgrade triggering it.
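
Since 'vmstat 1' only reports the total, one way to narrow down which IRQ line is producing the interrupts is to compare two snapshots of /proc/interrupts taken a known interval apart. The sketch below is only an illustration: the snapshot text, the IRQ numbers and the "example-gpu" device name are invented placeholders, not data from this system.

```python
def parse_interrupts(text):
    """Map each IRQ label to (count summed over CPUs, description) from /proc/interrupts text."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row lists one column per CPU
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = [int(f) for f in fields[:ncpu] if f.isdigit()]
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def irq_rates(before, after, seconds):
    """Return (interrupts_per_second, label, description) tuples, busiest first."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    rates = []
    for label, (count, description) in new.items():
        delta = count - old.get(label, (0, ""))[0]
        rates.append((delta / seconds, label, description))
    return sorted(rates, reverse=True)


# Hypothetical snapshots taken 2 seconds apart (illustrative values only).
BEFORE = """\
           CPU0
  0:       1000   IO-APIC-edge      timer
 16:       5000   IO-APIC-fasteoi   example-gpu
"""
AFTER = """\
           CPU0
  0:       1250   IO-APIC-edge      timer
 16:     305000   IO-APIC-fasteoi   example-gpu
"""

if __name__ == "__main__":
    for rate, label, description in irq_rates(BEFORE, AFTER, 2.0):
        print(f"IRQ {label:>3}: {rate:10.1f}/s  {description}")
```

On a live system the two snapshots would come from reading /proc/interrupts twice with a sleep in between; the parsing is kept separate from the file access so it can be checked against saved text.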

Revision history for this message
Alberto Milone (albertomilone) wrote :

You might try the latest version of the fglrx driver and see if it fixes the problem.

Can you enable the hardy-proposed and hardy-updates repositories and use EnvyNG to install the latest release of the driver?

If you can still reproduce the problem then you should report the problem to ATI:
http://ati.cchtml.com/

Changed in linux-restricted-modules-2.6.24:
status: Triaged → Incomplete
Revision history for this message
Jeff Balderson (jbalders) wrote :

I tried this back around 7/15.

Unfortunately, the EnvyNG installer results in a broken installation. Based on my limited diagnosis, the kernel module won't load because there's no device stanza in xorg.conf which contains "fglrx". Because the kernel module isn't loaded, I wind up with a white screen when Compiz loads.

Once I rectified that situation (manually installing the module using insmod instead of modprobe), the original problem seems to still be unchanged. Since you have to specify the full path of the module when you use insmod, I'm fairly certain that I was using the correct envy-installed kernel module and not the normal "restricted" module.

Thanks for trying though.

Revision history for this message
Alberto Milone (albertomilone) wrote :

Jeff: what do you mean by "broken installation"?

Also, if you can reproduce the problem, can you attach your /var/log/Xorg.0.log and /var/log/Xorg.0.log.old?

Revision history for this message
Jeff Balderson (jbalders) wrote :

Alberto,

In this case EnvyNG yields a broken installation because when Compiz loads (i.e., it uses 3D effects), you get a white screen instead of the expected 3D effects.

After some digging, I found this to be because the fglrx kernel module isn't loading. The reason the kernel module isn't loading is because there's no device stanza in xorg.conf which contains "fglrx". If I manually "insmod /lib/modules/2.6.24-19-generic/IForgetTheExactPpathToTheEnvyNGVersionOf/fglrx.ko", I get 3D effects as expected. Note that I was careful to make sure that I wasn't loading a remnant of the older "Hardware Drivers" installed which is normally included in volatile/fglrx.ko.

I don't have this same problem with the "Hardware Drivers" method of getting 3D effects.

Unfortunately, my original problem is experienced with either the EnvyNG installed driver (I forget the exact version that ultimately got installed) or the "Hardware Drivers" installed driver. If I can find some time soon, I'll report it on the ATI bugtracker.
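
The missing Device stanza described above can be checked for mechanically. This is a minimal sketch, assuming a conventionally formatted xorg.conf; `device_drivers` is an illustrative helper and the sample configurations are invented, not copied from the affected machine.

```python
import re

SECTION_DEVICE = re.compile(r'^Section\s+"Device"', re.IGNORECASE)
END_SECTION = re.compile(r"^EndSection\b", re.IGNORECASE)
DRIVER = re.compile(r'^Driver\s+"([^"]+)"', re.IGNORECASE)


def device_drivers(xorg_conf_text):
    """Return the Driver value of every Section "Device" block in xorg.conf text."""
    drivers = []
    in_device = False
    for raw in xorg_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if SECTION_DEVICE.match(line):
            in_device = True
        elif END_SECTION.match(line):
            in_device = False
        elif in_device:
            match = DRIVER.match(line)
            if match:
                drivers.append(match.group(1))
    return drivers


# Hypothetical configs: one without a Driver line, one naming fglrx.
WITHOUT_DRIVER = """\
Section "Device"
    Identifier "Configured Video Device"
EndSection
"""
WITH_FGLRX = """\
Section "Device"
    Identifier "Configured Video Device"
    Driver "fglrx"   # proprietary ATI driver
EndSection
"""

if __name__ == "__main__":
    for name, conf in (("without", WITHOUT_DRIVER), ("with", WITH_FGLRX)):
        print(name, "fglrx" in device_drivers(conf))
```

A result without "fglrx" in the list would match the situation described here, where X never asks for the driver and the kernel module is not loaded.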

Revision history for this message
Bryce Harrington (bryce) wrote :

Good day Jeff,

I've just uploaded a new 8.543 version of -fglrx to Intrepid which now should work with xserver 1.5. Would you mind testing this new version and reporting back whether this issue is still present? If it is, it would be helpful if you could (re-)post your Xorg.0.log from running with this version. Thanks ahead of time.

If you don't have Intrepid installed on your system, you can test this by booting an Intrepid LiveCD (available from cdimage.ubuntu.com), using either the -vesa or -ati driver, then update to the latest version of Ubuntu, install fglrx, and then logout and back in. Your /var/log/Xorg.0.log will confirm whether you've loaded FGLRX successfully.

If you find any new issues, please report them as separate bugs. You can use the tool `ubuntu-bug fglrx-installer` which will gather the necessary files and create the launchpad report for you to fill in more easily.

Revision history for this message
Jeff Balderson (jbalders) wrote :

Bryce,

Interestingly, I just upgraded to intrepid last night to get around the annoying intrepid->nx->hardy keyboard mapping bug (reported by someone else already), so I was in a good place to test this tonight.

The updates installed fine, the fglrx-installer (via System->Administration->Hardware Drivers) worked correctly and my original problem (150k interrupts per sec) appears to be resolved with these latest updates. Interrupts are now down in a range I consider to be quite sane when using the FGLRX driver.

X.org.log confirms that it's using the 8.54.3 driver, and glxgears went from 190fps to 1200fps (even though glxgears is not to be used as a benchmark ;-).

Thanks for working on/fixing this!

Revision history for this message
Mario Limonciello (superm1) wrote :

Closing bug per comments by Jeff.

Changed in fglrx-installer:
status: Incomplete → Fix Released