spam of change events from drm/card0

Bug #440411 reported by Lars Ljung
This bug affects 24 people
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   Medium
linux (Ubuntu)   Won't Fix      Medium       Unassigned
Nominated for Karmic by Nicholas J Kreucher
Nominated for Lucid by Lars Ljung

Bug Description

Binary package hint: udev

One udevd process is constantly using 5-10% of the CPU. There are also 64 other udevd processes running.

I ran strace on the process and it seems to be related to the graphics card (Intel G41).

ProblemType: Bug
Architecture: amd64
Date: Fri Oct 2 09:16:24 2009
DistroRelease: Ubuntu 9.10
MachineType: System manufacturer System Product Name
Package: udev 147~-5
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-11-generic root=UUID=61039ef4-e057-4b72-ac4d-9038fc6aa0cc ro crashkernel=384M-2G:64M,2G-:128M quiet splash
ProcEnviron:
 LANGUAGE=sv_SE.UTF-8
 PATH=(custom, user)
 LANG=sv_SE.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-11.36-generic
SourcePackage: udev
Uname: Linux 2.6.31-11-generic x86_64
dmi.bios.date: 09/08/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0310
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P5QPL-VM
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0310:bd09/08/2008:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5QPL-VM:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=29047)
Init outputs before IRQ status

Does this patch also work? In order to avoid the spurious interrupts we're supposed to initialize the PEG band gap to the correct voltage...

Revision history for this message
In , gneman (luis6674) wrote :

I just tested from latest linus' git and with this latest patch and I do see the problem still.

I had nothing in dmesg when the problem triggered since I didn't apply the debug patch provided some time ago. However, I did have an error that might or might not be related to the patch (I can just say that git from about a week ago didn't show this error):

[ 23.764976] kbuildsycoca4 used greatest stack depth: 6028 bytes left
[ 75.535185] [drm:i915_gem_execbuffer] *ERROR* Object f638f3c0 appears more than once in object list
[ 75.602287] [drm:i915_gem_execbuffer] *ERROR* Object f6160060 appears more than once in object list
[ 75.617678] [drm:i915_gem_execbuffer] *ERROR* Object f6160060 appears more than once in object list
[ 75.671562] [drm:i915_gem_execbuffer] *ERROR* Object f638f420 appears more than once in object list
[ 75.818337] [drm:i915_gem_execbuffer] *ERROR* Object f638f480 appears more than once in object list
[ 75.881663] [drm:i915_gem_execbuffer] *ERROR* Object f638f4e0 appears more than once in object list
[ 75.901713] [drm:i915_gem_execbuffer] *ERROR* Object f638f4e0 appears more than once in object list
[ 76.150701] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 76.170042] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 76.242343] [drm:i915_gem_execbuffer] *ERROR* Object f638f540 appears more than once in object list
[ 76.306192] [drm:i915_gem_execbuffer] *ERROR* Object f638f420 appears more than once in object list
[ 76.312373] [drm:i915_gem_execbuffer] *ERROR* Object f638f420 appears more than once in object list
[ 76.317394] [drm:i915_gem_execbuffer] *ERROR* Object f638f420 appears more than once in object list
[ 76.451448] [drm:i915_gem_execbuffer] *ERROR* Object f638f540 appears more than once in object list
[ 76.455672] [drm:i915_gem_execbuffer] *ERROR* Object f638f540 appears more than once in object list
[ 76.471548] [drm:i915_gem_execbuffer] *ERROR* Object f638f540 appears more than once in object list
[ 76.552370] [drm:i915_gem_execbuffer] *ERROR* Object f638f3c0 appears more than once in object list
[ 76.567279] [drm:i915_gem_execbuffer] *ERROR* Object f638f3c0 appears more than once in object list
[ 76.614682] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 76.627649] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 76.697832] [drm:i915_gem_execbuffer] *ERROR* Object f638f660 appears more than once in object list
[ 76.711030] [drm:i915_gem_execbuffer] *ERROR* Object f638f660 appears more than once in object list
[ 76.845556] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 76.862304] [drm:i915_gem_execbuffer] *ERROR* Object f638f300 appears more than once in object list
[ 77.228811] [drm:i915_gem_execbuffer] *ERROR* Object f638f600 appears more than once in object list
[ 77.241402] [drm:i915_gem_execbuffer] *ERROR* Object f638f600 appears more than once in object list
[ 77.298431] [drm:i915_gem_execbuffer] *ERROR* Object f638f3c0 appears more than once in o...


Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Oh well, I guess it's not the PEG band voltage bug then... I'll ping the display guys and see what I can come up with.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=29419)
another debug patch

Hopefully this one allows your monitor to come back?

Revision history for this message
In , gneman (luis6674) wrote :

Yes, this patch solves the problem. In fact, the second part of the patch is the same as the test patch that already proved to solve it some time ago, but it was not considered the right fix. I guess the first part of this last patch is what was missing to make the first one a real fix?

Thanks.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

(In reply to comment #5)
> Yes, this patch solves the problem. In fact, the second part of the patch is
> the same as the test patch that already proved to solve it some time ago, but
> it was not considered the right fix. I guess the first part of this last patch
> is what was missing to make the first one a real fix?

No, this one still isn't right (chipset guys will get back to me soon I hope). It was a test patch for what sounds like a related issue; I've had one report that although the hack patch solves the stuck interrupt issue it also prevents monitors from syncing again when they're turned off and back on again (all the while attached). Pretty weird, but possibly related to the hotplug quirks on G45.

Revision history for this message
In , gneman (luis6674) wrote :

Ah, ok, I didn't know about that monitor problem and I really can't confirm if it solves that problem. I was just talking about the uevents thing in my previous reply.

Revision history for this message
In , gneman (luis6674) wrote :

I pulled from git today (the soon-to-be 2.6.32-rc1) and I can't reproduce the problem anymore. I'll check to make sure I haven't done anything wrong maybe with the .config (though DRI and everything is working fine), but it does look like it is fixed. I do see the [drm:i915_gem_execbuffer] errors posted above, but that seems unrelated to this issue.

Any idea what could have fixed it? All upcoming distros will ship with kernel 2.6.31, so it would still be nice to know what fixed it and be able to backport it, if possible.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Interesting... no I'm not sure what may have fixed it offhand. Would it be too much trouble to bisect it? There have been some fixes to somewhat related areas, but nothing that should directly affect the stuck hotplug interrupt afaik...

Revision history for this message
In , gneman (luis6674) wrote :

I've just pulled from git again and bad news: I can see the problem again. I don't know why it seemed to be fixed some days ago, but I probably did something wrong. Sorry about the false report.

I'm not sure I'll have the time and knowledge to perform a bisect, but if I can, would it be worth bisecting between .29 and .30 to find the commit that introduced the problem? Or is that not very relevant at this point?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

No, you don't need to bisect for the bad commit; I think I know what it's related to. I was hoping you could bisect to find the *good* commit, but it sounds like there isn't one. :p

When I get back from travelling I'll dig through all the hotplug errata (apparently there were many) and see if I can come up with a real patch for this.

Revision history for this message
Lars Ljung (larslj) wrote : udevd using 5-10% CPU

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Please run "udevadm monitor -e" and attach the output

Changed in udev (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Lars Ljung (larslj) wrote :

OK, 5 seconds of "udevadm monitor -e" gave me this output.

The CPU load actually changes a bit from time to time. Right now it's about 15% and there are only 3 udevd processes. Sometimes the CPU load goes down to close to zero.

Revision history for this message
Lars Ljung (larslj) wrote :

This bug has also been reported in http://bugs.freedesktop.org/show_bug.cgi?id=23183

Changed in udev (Ubuntu):
status: Incomplete → New
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Thanks,

Your log shows that it's not udev that's the issue, but that the kernel is sending an extreme number of change events for one of your devices - udev needs the CPU to keep up!

I think these are graphics card crash/faults?
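
One way to put a number on the event flood is to count the kernel uevents per device in captured "udevadm monitor" output. The following is a minimal sketch, not part of the original report: the function name `summarize` and the `SAMPLE` list are illustrative, and the sample lines simply follow the KERNEL/UDEV line format that "udevadm monitor" prints for this device.

```python
import re
from collections import Counter

# Matches lines such as:
# KERNEL[1261390492.215156] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
LINE_RE = re.compile(r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\((\S+)\)")


def summarize(lines):
    """Count kernel uevents per (action, devpath) and estimate events per second."""
    counts = Counter()
    stamps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Only count what the kernel emitted; UDEV lines are udev's echo after rule processing.
        if not m or m.group(1) != "KERNEL":
            continue
        counts[(m.group(3), m.group(4))] += 1
        stamps.append(float(m.group(2)))
    span = max(stamps) - min(stamps) if len(stamps) > 1 else 0.0
    rate = len(stamps) / span if span > 0 else 0.0
    return counts, rate


# Illustrative sample in the udevadm monitor line format.
SAMPLE = [
    "KERNEL[1261390492.215156] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)",
    "UDEV [1261390492.223522] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)",
    "KERNEL[1261390492.244037] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)",
    "KERNEL[1261390492.244347] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)",
    "UDEV [1261390492.247234] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)",
]

counts, rate = summarize(SAMPLE)
for (action, devpath), n in counts.most_common():
    print(f"{n:5d}  {action}  {devpath}")
print(f"about {rate:.0f} kernel events per second")
```

A single device dominating the counts with a high rate points at the kernel side rather than at udev itself, which only reacts to each event.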

affects: udev (Ubuntu) → linux (Ubuntu)
summary: - udevd using 5-10% CPU
+ spam of change events from drm/card0
Revision history for this message
In , gneman (luis6674) wrote :

Small update: I have connected my monitor using the HDMI connector on my card (with a HDMI to DVI adapter) and I don't get interrupts anymore. Not sure if this was expected, but thought I should report it just in case.

Previously I was connecting my monitor through VGA.

Let me know if you need any further info.

Revision history for this message
In , gneman (luis6674) wrote :

Well, after _days_ of using the computer for many hours with the monitor connected via HDMI, the bug has shown its face again. So using HDMI doesn't completely solve the problem, but it makes it much more difficult to trigger (before, it took just 5-10 minutes of usage).

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=30988)
Check hotplug status bits

Looks like we were checking the wrong bits in the interrupt handler. Can you give this patch a try?

Revision history for this message
In , gneman (luis6674) wrote :

Unfortunately it doesn't seem to help. First I tested on current git but DRI was not working for some reason, so while I couldn't reproduce the bug I thought it was not a good test. So then I tested the patch on 2.6.31.5 (it applied with a trivial change) and there I could reproduce the bug in a few minutes (using the VGA connector).

I'll try to retest on current git again to be sure (if I find the reason why DRI didn't work).

Revision history for this message
In , gneman (luis6674) wrote :

Ok, the DRI problem was a stupid typo in the boot parameters, so now it booted fine and just playing a Tux Racer game made the problem show up (even on HDMI).

So the patch really doesn't help :(

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=31000)
Handle spurious interrupts

Ok maybe we need to use both sets; if we get an interrupt on a port but the live bit isn't set we should disable the port.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=31001)
Handle spurious interrupts #2

Oops, last one had live vs. hotplug interrupts in the wrong order.

Revision history for this message
Timo Wiren (timo-wiren) wrote :

I'm affected by this bug on Kubuntu 9.10 64-bit, Intel G45 (X4500HD). udevd eats over 20% of my 2.5 GHz Core 2 Duo.

Revision history for this message
In , gneman (luis6674) wrote :

Sorry for the bad news, this one didn't help either. I could trigger the interrupt storm within a few minutes of usage via VGA.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=31133)
IRQ debug patch

Can you guys reproduce the problem with this patch applied and attach the output to this bug? I had some similar data awhile back but I lost it, and I need new theories so I want to see the initial problem data again. Thanks.

Revision history for this message
In , gneman (luis6674) wrote :

Created an attachment (id=31146)
dmesg with HDMI stuck interrupts

Here is a full dmesg with the patch applied. Let me know if you need further information.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

I wonder if DP_D is supposed to be enabled on your system at all... Can you try the patchset at https://bugs.freedesktop.org/show_bug.cgi?id=22785? It may need a refresh, I'll ping the author.

Revision history for this message
AmenophisIII (amenophisiii) wrote :

i agree with scott. this is a kernel bug in the intel graphics driver.
i think it's related to power management; it seems to be correlated with putting the monitor into standby.

and the linked freedesktop bug is not related imho (at least the output is completely different).

quite annoying bug.. i have udev disabled now most of the time, because that load outweighs the benefits for me.

Revision history for this message
In , gneman (luis6674) wrote :

Ok, I'll try those patches when the author posts the refreshed ones.

One thing I noticed is that xrandr -q reports this:

VGA1 disconnected (normal left inverted right x axis y axis)
DVI1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 474mm x 296mm
   1680x1050 60.0*+
   1280x1024 75.0
   1024x768 75.1 60.0
   800x600 75.0 60.3
   640x480 75.0 60.0
   720x400 70.1
DP1 disconnected (normal left inverted right x axis y axis)

But the DVI1 that appears connected is in fact an HDMI one. When I connect through VGA it also reports that I have a DVI output (disconnected in that case), but this computer does not have DVI at all.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Hm ok, well maybe the child device patchset will help after all...

Revision history for this message
In , gneman (luis6674) wrote :

I've tested the child device patchset and it worked correctly: it detected my HDMI output as HDMI and did not detect a nonexistent DP (I posted about it on the bug report).

However, that didn't change the situation regarding the interrupt storm. I could easily reproduce the problem by playing tuxracer :(

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Do you know which outputs were detected? I'm thinking if DP_D wasn't created, we should also disable interrupts from that source rather than enabling all of them...

Revision history for this message
In , gneman (luis6674) wrote :

I didn't apply the debug patch in my last test, but from my Xorg.0.log:

(II) intel(0): Integrated Graphics Chipset: Intel(R) G45/G43
(--) intel(0): Chipset: "G45/G43"
(II) intel(0): Output VGA1 has no monitor section
(II) intel(0): Output HDMI1 has no monitor section
(II) intel(0): Output VGA1 disconnected
(II) intel(0): Output HDMI1 connected
(II) intel(0): Using exact sizes for initial modes
(II) intel(0): Output HDMI1 using initial mode 1680x1050

Also xrandr reports only a VGA1 disconnected and a HDMI1 connected.

Should I apply the last debug patch and send the logs?

I was also going to try those two previous patches you posted here with the child device ones applied, since I thought that maybe they didn't work just because HDMI was being detected as DVI.

Revision history for this message
In , gneman (luis6674) wrote :

I tested the previous patches with the child device ones and it didn't work either.

Revision history for this message
Timo Wiren (timo-wiren) wrote :

I can confirm that 2.6.32 did not fix this bug.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=31712)
debug output init

Can you apply this patch and attach the output from when you load with drm.debug=6? I'm hoping the DP output causing problems is ignored; if so I can fix up the hotplug code to handle that case.

Revision history for this message
In , gneman (luis6674) wrote :

This patch doesn't apply on top of 2.6.32 and i can't seem to find anything similar in the source code to apply it manually. What should I do to test it?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

It should apply to Eric's drm-intel-next branch.

Revision history for this message
In , gneman (luis6674) wrote :

I'm trying to get something useful but even assuming I built the kernel correctly with the drm-intel-next branch (at least the patch applied and the kernel does work), when I boot with drm.debug=6 and try to get the dmesg I just get this line repeated all the time:

[ 40.881495] [drm:i915_add_request], 2242
[ 40.886500] [drm:i915_add_request], 2243
...

I tried to get dmesg without starting X, but again it is flooded by this:

[ 32.375745] [drm:i915_driver_irq_handler], hotplug event received, stat 0x38200000
[ 32.376571] [drm:i915_driver_irq_handler], hotplug event received, stat 0x30200000

Any idea how to keep these messages from flooding the log so I can capture it from the start?
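
The two status words in the flood above differ in a single bit, which a few lines of arithmetic make visible. This is only a sketch of that arithmetic: the helper name `set_bits` is illustrative, and the log does not say what each bit stands for, so neither does this example.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# The two "stat" values reported by i915_driver_irq_handler in the log.
first = 0x38200000
second = 0x30200000

print(set_bits(first))       # bit positions set in the first status word
print(set_bits(second))      # bit positions set in the second status word
print(hex(first ^ second))   # XOR keeps only the bits that differ
```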

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

You could try drm.debug=4 instead, I think that'll dump fewer messages.

Revision history for this message
In , gneman (luis6674) wrote :

Created an attachment (id=31761)
DRM debug log

Yes, that worked. Here is the log with drm.debug=4.

Revision history for this message
Mikko Rantalainen (mira) wrote :

Confirming for Ubuntu 9.10 64-bit, intel G45/X4500, Acer Veriton M670G, 2.6.31-9-rt #152-Ubuntu SMP PREEMPT RT

udevd randomly starts to eat CPU (sometimes takes about one core of E8400, right now udevd takes about 1-2% even though I constantly see drm change events in the udevadm monitor)
The udevadm monitor output is exactly the same as Lars Ljung reported above (time stamp and SEQNUM differ, of course)

I have following in dmesg output (not sure if this is related to this issue):
[180704.824413] [drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
[190136.427589] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[190136.427594] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -28
[190136.449042] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
[190923.547112] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[190923.547116] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -28
[190923.556727] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
[192597.505866] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[192597.505871] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -28
[192597.512386] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
[192672.278308] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[192672.278312] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -28
[192672.285969] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1
[192848.889928] [drm:i915_gem_object_bind_to_gtt] *ERROR* GTT full, but LRU list empty
[192848.889932] [drm:i915_gem_object_pin] *ERROR* Failure to bind: -28
[192848.897609] [drm:i915_gem_evict_something] *ERROR* inactive empty 1 request empty 1 flushing empty 1

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created an attachment (id=31953)
enable hotplug only for detected outputs

I didn't include all the output I wanted, but I'm hoping this is what you were running into.

This patch only enables hotplug detection for outputs we actually initialize, so should minimize the chance of getting interrupts for outputs that don't exist. I also found a note about DP_D in some recent that I'll check out, it could also be what you're hitting.
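
The idea of enabling hotplug only for initialized outputs can be illustrated with a small bit-mask sketch. This is not the driver code: the bit assignments in `HOTPLUG_BITS` and the function name `supported_mask` are hypothetical and chosen only for illustration; the real register layout is different.

```python
# Hypothetical bit per output, for illustration only.
HOTPLUG_BITS = {"VGA": 1 << 0, "HDMI": 1 << 1, "DP": 1 << 2}


def supported_mask(initialized_outputs):
    """OR together the hotplug bits of the outputs that were actually initialized."""
    mask = 0
    for name in initialized_outputs:
        mask |= HOTPLUG_BITS[name]
    return mask


# A machine with only VGA and HDMI: the DP bit is never enabled.
enabled = supported_mask(["VGA", "HDMI"])
print(bin(enabled))

# A stray event on the DP bit does not overlap the enabled set.
stray = HOTPLUG_BITS["DP"]
print(bool(stray & enabled))
```

Under this scheme the result depends entirely on output detection being right: if a DP output that does not exist is detected and initialized, its bit ends up in the mask anyway.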

Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

looks like the folks upstream are actively working on this...lets hope for a solution soon!

Revision history for this message
In , gneman (luis6674) wrote :

This one looks REALLY good. I've been trying for an hour to reproduce the problem by all means and I've been unable. The problem is 95% reproducible within 5-10 minutes, so I'm almost certain that this patch fixed it. Thanks! :)

Anyway I'll keep testing tomorrow (it's late here) and report back with a 100% definitive answer.

Revision history for this message
In , gneman (luis6674) wrote :

Ok, so I've built the same kernel from drm-intel-next without the patch and there I can easily reproduce the problem by simply running glxgears. On the patched kernel there is no way to reproduce it, so now I'm certain that this patch fixes the problem here.

Thank you for all your effort into solving this issue!

Side note: In case this patch is a candidate for being backported, I wonder if it depends on the other patches that make my outputs be detected correctly. Up to (and including) 2.6.32, 3 outputs are detected here: VGA, DVI and DP, but I only have VGA and HDMI outputs. In drm-intel-next (therefore 2.6.33, I assume), the outputs are detected correctly as VGA and HDMI. Just in case it matters.

If you'd like me to test any backport or if you want me to send any logs from the patched latest kernel, please let me know.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Thanks a lot for testing and confirming.

Yeah, it does depend on correct output detection, which is only present in git (so it'll land in 2.6.33). I'll post it for review now.

Revision history for this message
Lars Ljung (larslj) wrote :

The patch attached to comment #35 in the upstream bug (i915-hotplug-per-output-fix.patch) does resolve this issue on my system. Udevd mostly sleeps and udevadm monitor is silent. I'm attaching a patch that works with the Ubuntu kernel (karmic, 2.6.31-16).

Changed in linux:
status: Unknown → Confirmed
Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

applied the patch from Lars, but unfortunately the problem still remains for me :( I only recompiled the i915.ko module, did I miss something? Using 2.6.31-16-generic (karmic)

KERNEL[1261390492.215156] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.223522] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.244037] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.244347] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.247234] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.249508] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.505867] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.508815] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.516465] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.519207] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.524250] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.524317] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.528401] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.530405] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.534601] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.534704] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.781319] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.785480] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.800013] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.804435] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.814179] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.814261] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.814314] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.815494] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.824270] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.831478] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.832415] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.833401] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1261390492.973502] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1261390492.977126] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

commit b01f2c3a4a37d09a47ad73ccbb46d554d21cfeb0

drm/i915: only enable hotplug for detected outputs

Fix on its way upstream.

Revision history for this message
Slavius (slavomir-danas) wrote :

The patch (after some modifications) did work for me. What I did was:
1) download Ubuntu kernel
2) unpack
3) apply patch
4) make modules
5) copy module to /lib/modules/kernel-version/updates/some/location/i915.ko
6) depmod -a
7) dpkg-reconfigure initramfs-tools
8) reboot
9) verify loaded module by `modinfo i915`

However, when using patched i915.ko I got very poor performance. Single low-quality youtube video fires CPU to 100% (load average is over 2.25) and my system is unusable.

I hope this issue's going to get resolved as I'm not planning to switch to Windows (although it came preinstalled on my notebook)...
 :(

Changed in linux:
status: Confirmed → Fix Released
Changed in linux (Ubuntu):
status: New → Triaged
Revision history for this message
popinet (popinet) wrote :

Same problem here, please also see:

http://ubuntuforums.org/showthread.php?p=8774963#post8774963

When will a fix be available? This is a serious problem on a very common graphics card...

Thanks for your help

Revision history for this message
Timo Wiren (timo-wiren) wrote :

This seems to be fixed (I'm running it right now) in Linux 2.6.32.8: "drm/i915: only enable hotplug for detected outputs"

Revision history for this message
Timo Wiren (timo-wiren) wrote :

I was wrong, 2.6.32.8 did NOT fix this for my G45.

Revision history for this message
sergio (serge-simon) wrote :

I have exactly the same problem (Karmic Koala x86_64, i915), and it's pretty annoying ...

 udevadm monitor
monitor will print the received events for:
UDEV - the event which udev sends out after rule processing
KERNEL - the kernel uevent

KERNEL[1266106984.081222] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.082627] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.082664] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.083914] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.083937] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.084077] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.085499] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.086698] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.086729] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.088223] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.093065] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1266106984.094336] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1266106984.106143] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)

(... flood)

Revision history for this message
Slavius (slavomir-danas) wrote :

Personally I don't think this is going to be 100% resolved because I've noticed very similar behavior in Windows on the same box too. I've been testing this for a long time and it seems that after I boot up Windows, the remaining battery time shows around 5h. Then I do some heavy graphical processing (watching flash video, enabling and using Aero) and the battery life suddenly drops to around 2h. This state persists until the battery is completely drained. It looks like an Intel firmware bug to me and all effort to solve this on the Linux kernel side will be more or less a hack/workaround...

Revision history for this message
RickRichardson (rick-richardson-gmail) wrote :

i had this problem until I ran :

sudo service udev start

I tried all sorts of things, killing udev and related processes, nothing worked, I had constant disk access and cpu usage. I saw someone recommend starting then stopping the udev service... oddly, simply starting it was enough.

I've got an hp mini 311

Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

http://bugs.freedesktop.org/show_bug.cgi?id=25259 seems to be closer to this bug than the current remote bug watch (despite the title), and also includes a proposed patch to stop the interrupt storm: http://bugs.freedesktop.org/attachment.cgi?id=32825. I will try it out and report back.

Red Hat also has a bug for it here: https://bugzilla.redhat.com/show_bug.cgi?id=528312

Revision history for this message
In , gneman (luis6674) wrote :

I have just upgraded to 2.6.33 hoping to finally leave this bug behind, but found that it's still there. However, it's probably just because the outputs are not correctly detected.

This computer only has 2 outputs: VGA (not used) and HDMI (used). But this is what "xrandr -q" says:

VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 474mm x 296mm
   1680x1050 60.0*+
   1280x1024 75.0 60.0
   1152x864 75.0
   1024x768 75.1 60.0
   800x600 75.0 60.3
   640x480 75.0 60.0
   720x400 70.1
DP1 disconnected (normal left inverted right x axis y axis)

And it's that detected (and therefore initialized) DP output which causes the trouble (or so is my understanding).

At some point detection worked well with the drm-intel-next branch (detecting only the 2 existing outputs), but with 2.6.33 it again detects a non-existent DP.
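For anyone who wants to check their own machine for phantom outputs, here is a minimal sketch that parses `xrandr -q` text and lists outputs that are reported but not physically present. The function names and the sample (trimmed from the output above) are only illustrative, and the set of real connectors has to be supplied by hand.

```python
import re

# Sample trimmed from the xrandr output quoted above (most mode lines omitted).
SAMPLE = """\
VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 connected 1680x1050+0+0 (normal left inverted right x axis y axis) 474mm x 296mm
   1680x1050 60.0*+
DP1 disconnected (normal left inverted right x axis y axis)
"""

# Output lines start in column 0; mode lines are indented, so they never match.
OUTPUT_LINE = re.compile(r"^(\S+) (connected|disconnected)\b")


def list_outputs(xrandr_text):
    """Return {output_name: is_connected} for every output line found."""
    outputs = {}
    for line in xrandr_text.splitlines():
        match = OUTPUT_LINE.match(line)
        if match:
            outputs[match.group(1)] = match.group(2) == "connected"
    return outputs


def phantom_outputs(outputs, physically_present):
    """Outputs the driver reports although the board has no such connector."""
    return sorted(name for name in outputs if name not in physically_present)


outputs = list_outputs(SAMPLE)
# The board described above only has VGA and HDMI connectors.
print(phantom_outputs(outputs, {"VGA1", "HDMI1"}))
```

With the sample above this reports DP1 as the only output that should not exist.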

Any ideas? Should I open a new report for this thing?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Yeah, please open a new one. Would be especially good if you could bisect where things went bad.

Revision history for this message
In , gneman (luis6674) wrote :

I guess no need to bisect. I found that the child device patches were reverted by another commit (6207937d4feea000913e8ca23fe20c7744be7847) because they caused trouble for other people. I posted on the relevant report (bug #22785) so I hope that Zhao Yakui can look into another solution.

Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

After 2 days of testing, this patch seems to fix the issue for me. No more spam of change event messages in udev, and udev cpu usage is back to normal.

Note, it's unclear to me how the patch would affect folks using, say, HDMI outputs with this chip. Someone who has such hardware should test.

Attaching diff against ubuntu karmic kernel git tree.

Revision history for this message
Mikko Rantalainen (mira) wrote :

Nicholas: is that patch really correct? The proposed patch at freedesktop.org uses the "|=" operator with value zero and your patch uses the "=" operator with value zero. As a result, your patch will reset all bits in the bit mask hotplug_supported_mask, as opposed to doing practically nothing in the freedesktop.org patch (any number X binary OR with zero is always X).
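The difference between the two operators can be seen in a few lines. This is only an illustration in Python; the bit values are made up for the example and are not the real i915 hotplug bit definitions.

```python
# Hypothetical bit values, chosen only for illustration.
HDMIB_HOTPLUG = 1 << 0
HDMIC_HOTPLUG = 1 << 1


def or_with_zero(mask):
    """Mimics 'mask |= 0': every bit that was set stays set."""
    mask |= 0
    return mask


def assign_zero(mask):
    """Mimics 'mask = 0': every bit is cleared."""
    mask = 0
    return mask


hotplug_supported_mask = HDMIB_HOTPLUG | HDMIC_HOTPLUG
print(bin(or_with_zero(hotplug_supported_mask)))  # unchanged: 0b11
print(bin(assign_zero(hotplug_supported_mask)))   # cleared: 0b0
```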

Obviously, the patch will prevent any HDMI hotplugging support, but I think that's better than udevd eating too much CPU and flooding the logs.

I believe the correct fix is in the https://bugs.freedesktop.org/show_bug.cgi?id=23183 but it has been reverted from the kernel because of issues with some other (IMHO broken) hardware (https://bugs.freedesktop.org/show_bug.cgi?id=22785). The final fix would be to use current code with some selected BIOSes and the correct fix above for the rest. Unfortunately the correct fix requires non-trivial changes in the kernel and as such is not easy to backport, if I've understood correctly.

Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

Mikko -- you are correct, very embarrassing :( It didn't end up working anyway, but I'm not convinced I'm using the new modules I'm compiling--perhaps they are loaded from initrd? I wasn't making a new one each time. I've added a unique MODULE_VERSION this time to be double sure (is there another way to tell?).
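One possible way to tell, sketched below under some assumptions: compare the `srcversion` that the running kernel exposes in sysfs with the one `modinfo -F srcversion` reports for the freshly built `.ko` file. The sysfs attribute is only present when the module was built with a source version (for example because it declares MODULE_VERSION), so the sketch returns None when it is missing. The function names are made up for this example.

```python
import subprocess
from pathlib import Path


def loaded_srcversion(module, sysfs_root="/sys/module"):
    """srcversion of the module currently loaded, or None if not exposed."""
    path = Path(sysfs_root) / module / "srcversion"
    if not path.exists():
        return None
    return path.read_text().strip()


def built_srcversion(ko_path):
    """srcversion recorded in a .ko file on disk, as reported by modinfo."""
    result = subprocess.run(
        ["modinfo", "-F", "srcversion", str(ko_path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def running_new_build(module, ko_path, sysfs_root="/sys/module"):
    """True only if the loaded module matches the file that was just built."""
    loaded = loaded_srcversion(module, sysfs_root)
    return loaded is not None and loaded == built_srcversion(ko_path)
```

For example, `running_new_build("i915", "./i915.ko")` (the path is a placeholder) would return False if the old module from the initrd is still the one loaded.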

The patch you reference seems to be the same one Lars said worked way back in comment #12. It didn't work for me back then, but I wasn't generating a new initrd then either. Slavius in #14 did, and it seemed to mostly work for him.

Anyway, I've attempted to re-adapt https://bugs.freedesktop.org/show_bug.cgi?id=23183 for karmic 2.6.31-20. Assuming my T400 isn't using one of the broken BIOSes, it should work eh?

Will post results and the adapted patch if it indeed does work.

Revision history for this message
Nicholas J Kreucher (kreucher) wrote :

I've collected the following patches together and managed to apply them to ubuntu-karmic.git. Fair warning: I'm not an expert in this area so I didn't do any sanity checking beyond a clean make...

* drm/i915: parse child device from VBT
* drm/i915: Don't set up DP ports that aren't in the BIOS device table.
* drm/i915: Don't set up HDMI ports that aren't in the BIOS device table
* drm/i915: only enable hotplug for detected outputs

Compiled against 2.6.31-20-generic-pae using:
make -C /lib/modules/`uname -r`/build M=$(pwd) modules

Seems to work... udev is quiet, and xrandr no longer includes non-existent devices. Second monitor still works as well.

However, this may break things for some folks as suggested in https://bugs.freedesktop.org/show_bug.cgi?id=22785.

Revision history for this message
Mikko Rantalainen (mira) wrote :

Nicholas: your patch looks sensible to me.

I guess I should mention that I have an interesting situation because my work computer is suffering from this (change events flood) issue but my home computer has Intel DG45ID board, referenced above as an example of a board that has troubles with the "correct" detection code (http://bugzilla.kernel.org/show_bug.cgi?id=14854). The DG45ID does work correctly with current code (though xrandr has non-existing outputs, if I remember correctly). However, having dealt with other BIOS issues with DG45ID, I'd say that I'd like to get this patch in and then apply workarounds for selected BIOSes as needed. The BIOS in DG45ID can be assumed to be always incorrect in case of doubt...

Revision history for this message
Mikko Rantalainen (mira) wrote :

The patch http://bugzilla.kernel.org/attachment.cgi?id=24339 contains the code that can be removed in case the BIOS is not used for the VBT setup. This code has currently been dropped from the vanilla kernel, if I've understood correctly. I think the best fix would be to keep this code but to skip the VBT check if a known bad BIOS is detected.

If I'm reading the code correctly, then both dp_is_present_in_vbt() and hdmi_is_present_in_vbt() should be modified to always return 1 in case of a known broken BIOS. It could be wise to add a kernel flag to allow forcing the BIOS as broken, to have a quick fix for a BIOS that is later found to be broken (such a flag could be used until the kernel is taught about that BIOS).
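To make the proposal concrete, here is a rough model of the decision in Python. It is not kernel code: the real checks are C functions in the i915 driver, and the quirk table, the function name and the `force_broken_bios` flag below are all hypothetical.

```python
# Boards whose BIOS device table (VBT) is assumed to be unreliable.
# The single entry is only an example taken from this thread.
KNOWN_BROKEN_BIOS_BOARDS = {"DG45ID"}


def output_is_present(output, vbt_outputs, board_name, force_broken_bios=False):
    """Decide whether an output such as "DP1" should be set up.

    vbt_outputs is the set of outputs listed in the BIOS device table.
    """
    if force_broken_bios or board_name in KNOWN_BROKEN_BIOS_BOARDS:
        # The VBT is not trusted: behave like the current code and
        # set up the output regardless of what the table says.
        return True
    # Otherwise only set up outputs that the BIOS table actually lists.
    return output in vbt_outputs


vbt = {"VGA1", "HDMI1"}
print(output_is_present("DP1", vbt, "EXAMPLE-BOARD"))        # False: not in VBT
print(output_is_present("DP1", vbt, "DG45ID"))               # True: quirk applies
print(output_is_present("DP1", vbt, "EXAMPLE-BOARD", True))  # True: forced
```

The flag gives users of a not yet listed board a way out without rebuilding anything, which is the point of the suggestion above.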

Revision history for this message
Andre Costa (blueser) wrote :

Hi,

I just installed 10.04 x86_64 from scratch on a DG43GT, and this hit me as well. *Very* annoying, pretty much defeats the purpose of having a Core 2 Duo Quad :-(

I disabled udev and the "event storm" has ceased. Of course, this is not ideal; I hope this gets fixed soon.

tags: added: patch
Changed in linux:
importance: Unknown → Medium
Revision history for this message
Gabriel Mazetto (brodock) wrote :

I'm getting these annoying messages too, but on my computer it's consuming a lot more than 5-10% CPU (usually around 40 or 45%).

Running sudo udevadm monitor --env I get a lot of messages like these:

UDEV [1284732819.189823] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV_LOG=3
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SUBSYSTEM=drm
HOTPLUG=1
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=79961226
ACL_MANAGE=1
MAJOR=226
MINOR=0
DEVLINKS=/dev/char/226:0

UDEV [1284732819.192016] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV_LOG=3
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SUBSYSTEM=drm
HOTPLUG=1
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=79961227
ACL_MANAGE=1
MAJOR=226
MINOR=0
DEVLINKS=/dev/char/226:0

KERNEL[1284732819.193128] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV_LOG=3
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SUBSYSTEM=drm
HOTPLUG=1
DEVNAME=dri/card0
DEVTYPE=drm_minor
SEQNUM=79968062
MAJOR=226
MINOR=0

^CUDEV [1284732819.195876] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV_LOG=3
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:02.0/drm/card0
SUBSYSTEM=drm
HOTPLUG=1
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=79961228
ACL_MANAGE=1
MAJOR=226
MINOR=0
DEVLINKS=/dev/char/226:0
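To put a number on the flood, the event header lines of such a capture can be counted. The following is a small sketch that assumes the `udevadm monitor` header format seen above; the function name is made up, and the sample holds the four header lines from this comment.

```python
import re
from collections import Counter

# Header lines from the capture above (property lines omitted).
SAMPLE = """\
UDEV [1284732819.189823] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
UDEV [1284732819.192016] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
KERNEL[1284732819.193128] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
^CUDEV [1284732819.195876] change /devices/pci0000:00/0000:00:02.0/drm/card0 (drm)
"""

# Not anchored at line start, so a stray "^C" before the header is tolerated.
EVENT_LINE = re.compile(
    r"(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\w+)\s+(\S+)\s+\((\w+)\)"
)


def summarize(monitor_text):
    """Count events per (source, action, devpath) and return the time span."""
    counts = Counter()
    timestamps = []
    for line in monitor_text.splitlines():
        match = EVENT_LINE.search(line)
        if not match:
            continue  # property lines such as ACTION=change
        source, stamp, action, devpath, _subsystem = match.groups()
        counts[(source, action, devpath)] += 1
        timestamps.append(float(stamp))
    span = max(timestamps) - min(timestamps) if len(timestamps) > 1 else 0.0
    return counts, span


counts, span = summarize(SAMPLE)
for key, count in counts.items():
    print(key, count)
print("seconds covered:", round(span, 6))
```

On the sample this gives four change events for drm/card0 within roughly 6 milliseconds.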

Revision history for this message
Andre Costa (blueser) wrote : Re: [Bug 440411] Re: spam of change events from drm/card0

This bug is a major PITA. Today I had to plug a pendrive here at work
on my Ubuntu 10.04 and, since udev is turned off to avoid this bug, it
was not automatically mounted. I started udev and then plugged the
pendrive, and this time it was properly automounted, but the events
storm started all over again. I had to turn the service off right
after I copied the files.

Regards,

Andre

Revision history for this message
Timo Wiren (timo-wiren) wrote :

I installed Kubuntu 10.10 replacing my old Xubuntu 10.04 and have used it for a few days and haven't been hit by this bug anymore even after running many 3D apps. Mobo: Intel DG45ID, chipset X4500HD.

Revision history for this message
fasmide (fasthud) wrote :

Confirming Timo Wiren's report: I too have the DG45ID motherboard. Updated from Ubuntu 9.10 -> 10.10 and haven't seen this bug since, yeeeey! :)

Revision history for this message
Martin Metal (mame-792) wrote :

I have been hit by this problem since I installed 9.10 on an Asus UL80AG notebook. The reliably working solution was "sudo killall udevd". After reading the latest posts I have updated to 10.10, but I am afraid I cannot confirm that the bug is gone. Actually, the i915 is still flooding the system with interrupts (one can easily see that in "sudo powertop"). It is true that udevd does not register that signal and "udevadm monitor --env" does not flood the terminal with change events from card0.

The symptom after upgrading to 10.10 is that the whole system is very, very slow. Actually, moving the cursor using the mouse/touchpad is a pain; it moves very jerkily. Cairo-Dock is virtually unusable, and powertop is reporting a flood of interrupts from i915.
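The interrupt rate can also be measured without powertop by sampling /proc/interrupts twice. Below is a minimal sketch; the function names are made up, the sample text is invented for illustration (it is not from any machine in this report), and the resulting figure is an interrupt rate, which is not necessarily the same number powertop shows as wakeups.

```python
import time

# Invented sample in the usual /proc/interrupts layout.
SAMPLE = """\
           CPU0       CPU1
  0:        120          0   IO-APIC-edge      timer
 16:       1000       2000   IO-APIC-fasteoi   i915, uhci_hcd:usb3
ERR:          0
"""


def irq_total(interrupts_text, name):
    """Sum the per-CPU counts of every interrupt line whose label has name."""
    lines = interrupts_text.splitlines()
    if not lines:
        return 0
    n_cpus = len(lines[0].split())  # header row: one column per CPU
    total = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= 1 + n_cpus:
            continue  # summary rows such as ERR carry no label
        label = " ".join(fields[1 + n_cpus:])
        if name in label:
            total += sum(int(count) for count in fields[1:1 + n_cpus])
    return total


def irq_rate(name, seconds=5.0, path="/proc/interrupts"):
    """Interrupts per second for name, measured over the given interval."""
    with open(path) as handle:
        before = irq_total(handle.read(), name)
    time.sleep(seconds)
    with open(path) as handle:
        after = irq_total(handle.read(), name)
    return (after - before) / seconds


print(irq_total(SAMPLE, "i915"))
```

Calling `irq_rate("i915")` on an affected machine should show whether the storm is still there after an upgrade, even when udevd stays quiet.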

Well, in 10.04 it helped to kill udevd, but what now? What shall I do to get the performance back to a usable level? Any help?

Revision history for this message
Patrick Bartels (p4ddy-b) wrote :

Recently I've bought a Lenovo SL510 and I can confirm that the bug still exists.
I'm running Arch Linux and have tried various combinations of different kernels (2.6.36, 2.6.37 and even 2.6.38-rc2) and xorg intel drivers. Thus it is not related to Ubuntu but seems to be a serious issue among various Linux distributions.
It is VERY annoying since the system becomes almost unusable after a few minutes.
Interesting side note: Even after stopping X, i915 still causes ca. 1500 to 1800 wakeups per second.
Any chance to get rid of that bug?

Changed in linux:
importance: Medium → Unknown
Changed in linux:
importance: Unknown → Medium
Revision history for this message
Fuujuhi (fuujuhi) wrote :

Could someone tell me if she/he has a fix/workaround for this bug on *Lucid* ?
Can I kill the thing that generates the event? Can I tell some driver to completely ignore it?
I don't care if that's a manual workaround, as long as it remains on Lucid /

I tried to install kernel 2.6.35, but this was even more awful (the keyboard jammed so much that there was a flood of events). Also, I don't want to use another kernel, because that just trades this for another set of issues.

Also, does somebody know what the bug is really about? OK, it's an event from the graphics card (i915) that occurs after suspend. But what is this event about?

Revision history for this message
Martin Metal (mame-792) wrote :

This annoying problem still persists. Is there a fix or workaround available by now?

Thanks!

Revision history for this message
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Triaged → Won't Fix