nvidia: Xorg hangs at 100% CPU when playing video: "fallen off the bus"

Reported by Adam Porter on 2012-05-27
This bug report is a duplicate of: Bug #973096: Nvidia driver causes xorg crash.
This bug affects 4 people
Affects: nvidia-graphics-drivers (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

This is likely a dupe of bug #1001007, but apport-collect says it's better to file a new bug, so here I go. If they are dupes, it may not be just video playing that causes it, or just 3D usage, but maybe VDPAU or a combination thereof. I do use 3D compositing with KDM in KDE 4.8.3, and I have VDPAU installed.

I have been running Kubuntu on this Dell XPS M1330 laptop with NVIDIA 8400M GS since 2008 with no problems. Suddenly, after upgrading from 11.10 Oneiric to 12.04 Precise, I am regularly experiencing X hanging with 100% CPU. So far, every time it has happened, it's been while watching YouTube videos in Flash in Firefox. It may be minutes, hours, or days between crashes.

When it happens, sometimes the cursor is movable, and sometimes the cursor disappears. The rest of the screen freezes, but sound continues playing. I can SSH in and see X at 100% CPU. I cannot change VTs. Sometimes I can press Alt+SysRq+K (SAK) several times to kill X, after which KDM and X restart--other times I have to power off.

I just downgraded to 295.20 to see if the bug exists in this version--I've read in some places that people find 295.20 to be stable. Note that even though I am filing this bug and ran apport under 295.20, I have experienced the bug in 295.40 and 295.49, the versions in precise and precise-updates.

When the hang happens, I see this in dmesg every time:

[ 2140.551099] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[ 2140.551107] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[ 2140.551133] NVRM: os_pci_init_handle: invalid context!
[ 2140.551139] NVRM: os_pci_init_handle: invalid context!
[ 2140.551206] NVRM: os_pci_init_handle: invalid context!
[ 2140.551224] NVRM: os_pci_init_handle: invalid context!
[ 2140.551229] NVRM: os_pci_init_handle: invalid context!
[ 2142.079045] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 2142.079051] Pid: 0, comm: BFS/0 Tainted: P C O 3.3.6-pf-adp+ #6
[ 2142.079054] Call Trace:
[ 2142.079061] [<c154ddf6>] ? printk+0x2d/0x2f
[ 2142.079067] [<c10b4139>] __report_bad_irq+0x29/0xd0
[ 2142.079070] [<c10b43ae>] note_interrupt+0x11e/0x1d0
[ 2142.079174] [<f9bad083>] ? nv_kern_isr+0x33/0x70 [nvidia]
[ 2142.079178] [<c10b21ee>] handle_irq_event_percpu+0x9e/0x200
[ 2142.079182] [<c10b4cf0>] ? handle_fasteoi_irq+0xd0/0xd0
[ 2142.079186] [<c15569e0>] ? nmi_stack_correct+0x2f/0x34
[ 2142.079190] [<c10269e8>] ? default_spin_lock_flags+0x8/0x10
[ 2142.079194] [<c101df4d>] ? __io_apic_modify_irq+0x7d/0x90
[ 2142.079198] [<c10b238b>] handle_irq_event+0x3b/0x60
[ 2142.079201] [<c10b4c20>] ? unmask_irq+0x30/0x30
[ 2142.079204] [<c10b4c6e>] handle_fasteoi_irq+0x4e/0xd0
[ 2142.079206] <IRQ> [<c155d3b2>] ? do_IRQ+0x42/0xc0
[ 2142.079213] [<c1008638>] ? sched_clock+0x8/0x10
[ 2142.079217] [<c1060ffb>] ? sched_clock_local+0xcb/0x1c0
[ 2142.079221] [<c155d2f0>] ? common_interrupt+0x30/0x38
[ 2142.079225] [<c10600d8>] ? build_sched_domains+0x168/0x7d0
[ 2142.079230] [<c1315fef>] ? arch_local_irq_enable+0x5/0xb
[ 2142.079233] [<c13169cb>] ? acpi_idle_enter_simple+0xf3/0x133
[ 2142.079237] [<c1448d9d>] ? cpuidle_idle_call+0xad/0x250
[ 2142.079241] [<c100174c>] ? cpu_idle+0x9c/0xe0
[ 2142.079244] [<c1531825>] ? rest_init+0x5d/0x68
[ 2142.079249] [<c17f5745>] ? start_kernel+0x357/0x35d
[ 2142.079252] [<c17f517f>] ? loglevel+0x2b/0x2b
[ 2142.079255] [<c17f5078>] ? i386_start_kernel+0x78/0x7d
[ 2142.079257] handlers:
[ 2142.079340] [<f9bad050>] nv_kern_isr
[ 2142.079342] Disabling IRQ #16
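
For anyone comparing their own logs against this one, here is a minimal sketch that scans saved dmesg text for the two markers quoted above: the NVRM "fallen off the bus" message and the kernel's "Disabling IRQ" line. The function name summarize_hang and the returned dictionary layout are made up for illustration and are not part of any NVIDIA or Ubuntu tool. The patterns are copied from this log only, so other driver versions may word the messages differently.

```python
import re

# Patterns copied from the log lines quoted in this report (assumed stable
# only for this driver series; other versions may differ).
GPU_LOST = re.compile(r"NVRM: GPU at (\S+) has fallen off the bus")
IRQ_DISABLED = re.compile(r"Disabling IRQ #(\d+)")
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def summarize_hang(log_text):
    """Collect GPU bus-loss and IRQ-disable events from dmesg-style text.

    Returns a dict with two lists of (timestamp, value) tuples. The
    timestamp is None when a line has no leading "[ seconds ]" prefix.
    """
    summary = {"gpu_lost": [], "irq_disabled": []}
    for line in log_text.splitlines():
        ts_match = TIMESTAMP.match(line)
        ts = float(ts_match.group(1)) if ts_match else None

        match = GPU_LOST.search(line)
        if match:
            # Value is the PCI address of the GPU, e.g. "0000:01:00.0".
            summary["gpu_lost"].append((ts, match.group(1)))
            continue

        match = IRQ_DISABLED.search(line)
        if match:
            # Value is the IRQ number the kernel gave up on.
            summary["irq_disabled"].append((ts, int(match.group(1))))
    return summary
```

To use it, save the kernel log to a file (for example with dmesg > dmesg.txt over SSH) and pass the file's contents to summarize_hang. A non-empty gpu_lost list followed shortly by an irq_disabled entry would match the sequence seen in this report.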

Again, this never, ever happened in all these years until I "upgraded" from Oneiric to Precise. Now my laptop is completely unreliable--or, at least, I risk a hang and having to kill all processes whenever I watch a video. Not much of an upgrade. :(

I don't know what to do now. If 295.20 is not stable, should I keep downgrading? I assume I'll eventually run into a version that won't install on Precise, whether due to kernel incompatibilities or other issues. I don't think downgrading the entire system to Oneiric (reinstalling, at that) is a good option, either, as I'll be stuck with older versions of other software, including all of KDE.

Here are some other links that may be relevant:

Thread on nvnews that I think is this same bug:
http://www.nvnews.net/vbulletin/showthread.php?t=178362

Probably not the same bug, but perhaps related in that it may be a regression in 295.33+. The performance regressions may be fixed, but this hang is not:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers/+bug/982710

I will do whatever I can to help debug this. This is very frustrating!

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: nvidia-current 295.20-0ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-24.39-generic 3.2.16
Uname: Linux 3.2.0-24-generic i686
NonfreeKernelModules: nvidia
ApportVersion: 2.0.1-0ubuntu7
Architecture: i386
Date: Sat May 26 19:01:42 2012
ProcEnviron:
 LANGUAGE=
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: nvidia-graphics-drivers
UpgradeStatus: Upgraded to precise on 2012-05-06 (20 days ago)

Adam Porter (alphapapa) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers (Ubuntu):
status: New → Confirmed
Andrew M. (ender-neo) wrote :

If you're desperate to get stability back (like I was), try installing the 295.53-0ubuntu1~precise~xup1 series drivers from https://launchpad.net/~ubuntu-x-swat/+archive/x-updates .
I've only had them running for a couple of days, and only watched 2-3 hours of HD video since then, but so far no crashes.

What perplexes me is that I have two virtually identical HTPC machines in the house, and only one of them hit this bug.

Same mobo, same proc, same ram, same video card, same ubuntu.

One plays primarily through XBMC with VDPAU, other through mplayer+VDPAU.

mplayer+VDPAU box crashed all the time until 295.53 drivers installed.

Adam Porter (alphapapa) wrote :

Thanks for your reply, Andrew. So far 295.20 hasn't crashed, but the crashy drivers sometimes went longer without crashing, so no verdict yet. I will try the one you suggested after I decide on 295.20.


bugbot (bugbot) on 2012-05-31
tags: added: kubuntu

Adam Porter (alphapapa) wrote :

295.20 has crashed, too. Next I'll try 295.53. If that crashes, I'll see if I can try 295.10.

