[NVIDIA] Monitor(s) black out and session freezes. "NVRM: GPU at 0000:01:00.0 has fallen off the bus."

Bug #882710 reported by palewire
This bug affects 27 people
Affects: nvidia-graphics-drivers (Ubuntu)
Status: Opinion
Importance: High
Assigned to: Alberto Milone

Bug Description

I run Ubuntu 11.10 on a Dell desktop with a dual-output video card that drives two monitors. They normally work great and I have no complaints. When I first upgraded to 11.10, both monitors would occasionally black out at random, and I'd have to restart the computer to get them back. That problem seemed to go away, though, and I assumed a package upgrade had fixed the bug.

Then, after I upgraded Unity and a bunch of other packages today, it started again. It has blacked out three times this morning, and I don't know why. I often use hotkeys to slide between workspaces, though, and one of the blackouts happened during a slide.

[Next Actions]
* [tseliot] Raise issue with NVIDIA
* [Unity Engineering] Evaluate if twinview could be triggering a bug in unity?
* Attempt to reproduce the bug outside Unity to see if it can be proved to not be Unity-specific
* Identify if there's a way to reproduce the bug deliberately

---
.proc.driver.nvidia.gpus.0: Error: [Errno 21] Is a directory: '/proc/driver/nvidia/gpus/0'
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86 Kernel Module 280.13 Wed Jul 27 16:55:43 PDT 2011
 GCC version: gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
.tmp.unity.support.test.0:

ApportVersion: 1.23-0ubuntu3
Architecture: i386
CompizPlugins: [core,bailer,detection,composite,opengl,compiztoolbox,decor,regex,mousepoll,vpswitch,animation,grid,snap,place,resize,session,gnomecompat,move,imgpng,unitymtgrabhandles,wall,fade,workarounds,expo,ezoom,scale,unityshell]
CompositorRunning: compiz
DistUpgraded: Log time: 2011-10-13 15:00:25.754620
DistroCodename: oneiric
DistroRelease: Ubuntu 11.10
DistroVariant: ubuntu
DkmsStatus:
 nvidia-current, 280.13, 2.6.38-11-generic, i686: installed
 nvidia-current, 280.13, 3.0.0-12-generic, i686: installed
 nvidia-current-updates, 280.13, 3.0.0-12-generic, i686: installed
GraphicsCard:
 nVidia Corporation GT218 [GeForce 210] [10de:0a65] (rev a2) (prog-if 00 [VGA controller])
   Subsystem: eVga.com. Corp. Device [3842:1310]
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release i386 (20110427.1)
JockeyStatus:
 xorg:nvidia_current - NVIDIA accelerated graphics driver (Proprietary, Disabled, Not in use)
 xorg:nvidia_current_updates - NVIDIA accelerated graphics driver (post-release updates) (Proprietary, Enabled, In use)
MachineType: Dell Inc. OptiPlex 745
NonfreeKernelModules: nvidia
Package: unity 4.24.0-0ubuntu2b1
PackageArchitecture: i386
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-12-generic root=UUID=7130e4f2-1020-48a0-807c-882bee26d3c3 ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.0.0-12.20-generic 3.0.4
Tags: oneiric running-unity oneiric running-unity oneiric running-unity ubuntu regression-update compiz-0.9
Uname: Linux 3.0.0-12-generic i686
UpgradeStatus: Upgraded to oneiric on 2011-10-14 (13 days ago)
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
XorgLogOld:

dmi.bios.date: 08/12/2008
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.6.2
dmi.board.name: 0RF705
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 3
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.6.2:bd08/12/2008:svnDellInc.:pnOptiPlex745:pvr:rvnDellInc.:rn0RF705:rvr:cvnDellInc.:ct3:cvr:
dmi.product.name: OptiPlex 745
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.6+bzr20110929-0ubuntu5
version.libdrm2: libdrm2 2.4.26-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 7.11-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 7.11-0ubuntu3
version.nvidia-graphics-drivers: nvidia-graphics-drivers N/A
version.xserver-xorg: xserver-xorg 1:7.6+7ubuntu7
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.6.0-1ubuntu13
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20110811.g93fc084-0ubuntu1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.15.901-1ubuntu2
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110411+8378443-1

palewire (ben-welsh) wrote :

Here's what I see in my kernel log around the time of the crash. It seems very similar to what's reported in this thread: http://forums.nvidia.com/index.php?showtopic=209151

Oct 27 13:12:49 conrad kernel: [ 557.793689] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Oct 27 13:12:49 conrad kernel: [ 557.839448] hda-intel: spurious response 0x0:0x0, last cmd=0x470e00
Oct 27 13:12:49 conrad kernel: [ 557.839454] hda-intel: spurious response 0x9000094:0x0, last cmd=0x470e00
Oct 27 13:12:49 conrad kernel: [ 557.839457] hda-intel: spurious response 0x0:0x0, last cmd=0x470e00
Oct 27 13:12:49 conrad kernel: [ 557.839460] hda-intel: spurious response 0x0:0x0, last cmd=0x470e00
Oct 27 13:12:49 conrad kernel: [ 557.839463] hda-intel: spurious response 0x0:0x0, last cmd=0x470e00
Oct 27 13:12:49 conrad kernel: [ 557.839466] hda-intel: spurious response 0x0:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839469] hda-intel: spurious response 0x0:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839473] hda-intel: spurious response 0x9000094:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839476] hda-intel: spurious response 0x0:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839479] hda-intel: spurious response 0x0:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839482] hda-intel: spurious response 0x0:0x1, last cmd=0x10420000
Oct 27 13:12:49 conrad kernel: [ 557.839485] hda-intel: spurious response 0x0:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839488] hda-intel: spurious response 0x0:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839491] hda-intel: spurious response 0x9000094:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839494] hda-intel: spurious response 0x0:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839497] hda-intel: spurious response 0x0:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839500] hda-intel: spurious response 0x0:0x2, last cmd=0x20420000
Oct 27 13:12:49 conrad kernel: [ 557.839503] hda-intel: spurious response 0x0:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839506] hda-intel: spurious response 0x0:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839509] hda-intel: spurious response 0x9000094:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839512] hda-intel: spurious response 0x0:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839515] hda-intel: spurious response 0x0:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839518] hda-intel: spurious response 0x0:0x3, last cmd=0x30420000
Oct 27 13:12:49 conrad kernel: [ 557.839521] hda-intel: spurious response 0x0:0x0, last cmd=0x470e00
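
A minimal sketch, not something from the reporter, of how lines like the ones above could be pulled out of a kernel log automatically. The function name `find_gpu_drops` and the regular expression are illustrative assumptions based only on the message format shown in this excerpt.

```python
import re

# Assumed format, taken from the excerpt above:
# "... kernel: [ 557.793689] NVRM: GPU at 0000:01:00.0 has fallen off the bus."
FALLEN_OFF = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"NVRM: GPU at (?P<pci>[0-9a-fA-F:.]+) has fallen off the bus"
)


def find_gpu_drops(lines):
    """Return (uptime_seconds, pci_address) for each 'fallen off the bus' line."""
    drops = []
    for line in lines:
        match = FALLEN_OFF.search(line)
        if match:
            drops.append((float(match.group("uptime")), match.group("pci")))
    return drops
```

Passing it the lines of a saved kernel log would give one tuple per drop, which may make it easier to line the events up with the activity notes in later comments.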

Da Shroom (9n-georgm-bc) wrote :

Thank you for your bug report. Please could you run

apport-collect 882710

in a terminal to provide the developers with additional information? That said, this looks like an issue with a proprietary driver, so there may be little that can be done.

palewire (ben-welsh) wrote : .proc.driver.nvidia.params.txt

apport information

tags: added: apport-collected compiz-0.9 oneiric regression-update running-unity ubuntu
description: updated

palewire (ben-welsh) attached further apport information: BootDmesg.txt, BootLog.gz, CurrentDmesg.txt, Dependencies.txt, DpkgLog.txt, GconfCompiz.txt, Lspci.txt, Lsusb.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, UnitySupportTest.txt, XorgConf.txt, XorgLog.txt, Xrandr.txt, dmidecode.txt.txt, locale.txt, nvidia-settings.txt, peripherals.txt, setxkbmap.txt, xdpyinfo.txt, xinput.txt, xkbcomp.txt

palewire (ben-welsh) wrote : Re: Dual monitors randomly black out, might be linked to switching workspaces

I added all that, but I should note that before doing so I switched my "Additional Drivers" setting from NVIDIA accelerated graphics driver (version current) to (post-release updates).

palewire (ben-welsh) wrote :

So it was on current at the time of the error.

Da Shroom (9n-georgm-bc) wrote :

OK, thanks for the information. I will now confirm the bug because of the added information.

Changed in jockey:
status: New → Confirmed
Da Shroom (9n-georgm-bc) wrote :

I have added it to Jockey, as that seems the most likely cause of this bug (if it isn't in the proprietary drivers)
Thanks for your help with this bug.

palewire (ben-welsh) wrote :

Thanks. If there's anything I can do, let me know. And FYI: the screen blacked out again later in the day, after I had made the proprietary driver switch, so I don't think it's fixed. I don't think it's the workspaces either: this time it happened while I was running a long database loading script from my terminal.

Da Shroom (9n-georgm-bc) wrote :

Thanks for your cooperation. Whilst this bug is waiting for a developer to pick it up, would you mind keeping a log of when it happens and what you are running?

Thanks for your assistance.

Martin Pitt (pitti) wrote :

Jockey installs graphics drivers, but it's not at all involved in the actual driver code or operation.

Changed in jockey:
status: Confirmed → Invalid
palewire (ben-welsh) wrote :

Just had it black out on me minutes ago. Immediately after I pushed "SEND" on a Gmail email in Firefox, it just blacked out and went down.

palewire (ben-welsh) wrote :

Just happened again. I was writing a harmless IM in Empathy and it crapped out.

palewire (ben-welsh) wrote :

Just happened again, immediately after opening a new tab in Firefox. Both screens blank. Audio still plays briefly, and then it blacks out.

Da Shroom (9n-georgm-bc) wrote :

Thanks for keeping the log going :-)

palewire (ben-welsh) wrote :

Happened again minutes ago. I was listening to a podcast in Banshee and writing some code in gEdit. Byobu was also open. I reloaded a tab in Firefox, one of about five or six open, and as the page loaded the screens blacked out. The sound kept playing but the system was unresponsive until I restarted.

Da Shroom (9n-georgm-bc)
Changed in xorg (Ubuntu):
status: New → Confirmed
Changed in unity (Ubuntu):
status: New → Confirmed
bugbot (bugbot)
affects: xorg (Ubuntu) → nvidia-graphics-drivers (Ubuntu)
Da Shroom (9n-georgm-bc)
Changed in xorg (Ubuntu):
status: New → Confirmed
Bryce Harrington (bryce)
Changed in xorg (Ubuntu):
status: Confirmed → Invalid
Changed in nvidia-graphics-drivers (Ubuntu):
assignee: nobody → Alberto Milone (albertomilone)
importance: Undecided → High
status: Confirmed → Triaged
Bryce Harrington (bryce)
description: updated
description: updated
Martin Pitt (pitti)
no longer affects: jockey
Changed in unity:
status: New → Confirmed
Omer Akram (om26er)
no longer affects: xorg (Ubuntu)
Changed in unity:
importance: Undecided → Medium
Changed in unity (Ubuntu):
importance: Undecided → Medium
Bryce Harrington (bryce)
summary: - Dual monitors randomly black out, might be linked to switching
- workspaces
+ Dual monitors black out and session freezes, might be linked to
+ switching workspaces
Omer Akram (om26er)
tags: added: multimonitor
Omer Akram (om26er)
Changed in unity:
status: Confirmed → Triaged
Changed in unity (Ubuntu):
status: Confirmed → Triaged
Bryce Harrington (bryce)
Changed in unity (Ubuntu):
status: Triaged → Invalid
summary: - Dual monitors black out and session freezes, might be linked to
- switching workspaces
+ Monitor(s) black out and session freezes. "NVRM: GPU at 0000:01:00.0
+ has fallen off the bus."
Larry Tate (cathect) wrote : Re: Monitor(s) black out and session freezes. "NVRM: GPU at 0000:01:00.0 has fallen off the bus."

A few days ago there was an NVIDIA driver update. At that time I backed down from the version-current updates driver to the version-current [recommended] driver.

That has resolved ALL my issues.

Tye (tye3ow) wrote :

@Strange_cathect, what version of the drivers are you using? the 'version-current' in the repos is still 280.13 (unless you're using the x-updates repo)

tye@T:~$ apt-cache policy nvidia-current nvidia-current-updates
nvidia-current:
  Installed: 295.33-0ubuntu1~oneiric~xup1
  Candidate: 295.33-0ubuntu1~oneiric~xup1
  Version table:
 *** 295.33-0ubuntu1~oneiric~xup1 0
        500 http://ppa.launchpad.net/ubuntu-x-swat/x-updates/ubuntu/ oneiric/main i386 Packages
        100 /var/lib/dpkg/status
     280.13-0ubuntu6 0
        500 http://ca.archive.ubuntu.com/ubuntu/ oneiric/restricted i386 Packages
nvidia-current-updates:
  Installed: 280.13-0ubuntu5
  Candidate: 280.13-0ubuntu5
  Version table:
 *** 280.13-0ubuntu5 0
        500 http://ca.archive.ubuntu.com/ubuntu/ oneiric/restricted i386 Packages
        100 /var/lib/dpkg/status
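
For comparing driver versions across machines, output like the above could be reduced to a package-to-version map. This is a rough sketch under the assumption that the text has the layout shown in this comment; `installed_versions` and `upstream_version` are hypothetical helper names, not part of apt.

```python
def installed_versions(policy_output):
    """Map package name -> installed version from `apt-cache policy` text."""
    versions = {}
    package = None
    for line in policy_output.splitlines():
        # Package headers start in column 0 and end with a colon.
        if line and not line[0].isspace() and line.rstrip().endswith(":"):
            package = line.rstrip()[:-1]
        elif package and line.strip().startswith("Installed:"):
            versions[package] = line.split(":", 1)[1].strip()
    return versions


def upstream_version(package_version):
    """Strip the packaging suffix, e.g. '280.13-0ubuntu5' -> '280.13'.

    Assumption: no epoch prefix, as in the versions quoted in this thread.
    """
    return package_version.split("-", 1)[0]
```

Run against the output quoted above, this would report 295.33 for nvidia-current and 280.13 for nvidia-current-updates, which is the distinction being asked about.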

Larry Tate (cathect) wrote :

I have 295.33.

I'm updating on: http://ppa.launchpad.net/ubuntu-x-swat/x-updates/ubuntu

Since I started using this I've had zero issues for several days now.
-------------------

nvidia-current:
  Installed: 295.33-0ubuntu1~oneiric~xup1
  Candidate: 295.33-0ubuntu1~oneiric~xup1
  Version table:
 *** 295.33-0ubuntu1~oneiric~xup1 0
        500 http://ppa.launchpad.net/ubuntu-x-swat/x-updates/ubuntu/ oneiric/main i386 Packages
        100 /var/lib/dpkg/status
     280.13-0ubuntu6 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric/restricted i386 Packages
nvidia-current-updates:
  Installed: 280.13-0ubuntu5
  Candidate: 280.13-0ubuntu5
  Version table:
 *** 280.13-0ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric/restricted i386 Packages
        100 /var/lib/dpkg/status

Tye (tye3ow) wrote :

What happens when you use an intensive OpenGL app? XBMC in windowed mode is a good place to start, since you can minimize it and watch your heat; anything that uses windowed mode plus OpenGL at a larger scale will do. You can watch your heat with this (it's what I have in conky, lol):
nvidia-settings -query GPUCoreTemp | grep Attribute | grep -o '[0-9]\{2,3\}'
After a minute or so I'm up to 90+°C, and after a few minutes I end up over 100°C.
I'm using XBMC in windowed/maximized mode at a desktop resolution of 1920×1080@60Hz with DynamicTwinView disabled. (I use grandr to resize; with DynamicTwinView enabled, XBMC fails to get an output list and I end up with massive stutters.)
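
The grep pipeline above can also be expressed as a small parser, sketched here for anyone who wants to log the readings rather than just display them. It assumes the query prints lines of the form `Attribute 'GPUCoreTemp' (host:0.0): 91.`; that format, the names `core_temps` and `too_hot`, and the 100°C default are assumptions for illustration.

```python
import re

# Assumed output line: "  Attribute 'GPUCoreTemp' (T:0.0): 91."
ATTRIBUTE_LINE = re.compile(r"Attribute 'GPUCoreTemp' \(.*?\): (\d+)\.")


def core_temps(query_output):
    """Return every GPUCoreTemp reading (in °C) found in the query output."""
    return [int(value) for value in ATTRIBUTE_LINE.findall(query_output)]


def too_hot(temps, limit=100):
    """True when any reading reaches the (assumed) limit in °C."""
    return any(temp >= limit for temp in temps)
```

The text to parse would come from running the nvidia-settings command quoted above and capturing its standard output.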

Tye (tye3ow) wrote :

I figured I'd give it a go and see if the problem might be fixed, so I fired up Marble Arena 2 in Desura and played it for a little bit, and then hit the same problem.

-------------------
Mar 28 04:42:04 T kernel: [94085.118038] CPU0: Core temperature above threshold, cpu clock throttled (total events = 242827)
Mar 28 04:42:04 T kernel: [94085.118524] CPU0: Core temperature/speed normal
Mar 28 04:47:04 T kernel: [94385.119355] CPU0: Core temperature above threshold, cpu clock throttled (total events = 324658)
Mar 28 04:47:04 T kernel: [94385.119838] CPU0: Core temperature/speed normal
Mar 28 04:50:50 T kernel: [94610.420071] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
Mar 28 04:50:50 T kernel: [94610.420080] NVRM: GPU at 0000:04:00.0 has fallen off the bus.
-------------------
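
To check whether CPU thermal throttling tends to precede a GPU drop, as it does in the log above, the gap between the two events could be measured. The sketch below is illustrative only: the function name is hypothetical, the patterns are assumed from the lines quoted here, and a short gap is a hint about a hot case, not proof of cause.

```python
import re

UPTIME = r"\[\s*(\d+\.\d+)\]"
THROTTLE = re.compile(UPTIME + r" CPU\d+: Core temperature above threshold")
GPU_DROP = re.compile(UPTIME + r" NVRM: GPU at \S+ has fallen off the bus")


def seconds_from_last_throttle_to_drop(lines):
    """Seconds between the latest CPU throttle event and the next GPU drop.

    Returns None when no throttle event is followed by a drop.
    """
    last_throttle = None
    for line in lines:
        throttle = THROTTLE.search(line)
        if throttle:
            last_throttle = float(throttle.group(1))
            continue
        drop = GPU_DROP.search(line)
        if drop and last_throttle is not None:
            return float(drop.group(1)) - last_throttle
    return None
```

For the excerpt above, the last throttle event and the first drop are a little under four minutes apart.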

Larry Tate (cathect) wrote :

I just had it occur again. I remain "fixed" on the old issues: playing music on Clementine or watching flash videos. However, I just downloaded the Steel Storm game. A few seconds in the problem appears and I have to hard reboot.

lynxn0t (lynxn0t) wrote :

Hi Everyone,

My GPU is also falling off the bus... The funny thing about this message is that the first time I got it, I opened the case to check that the card was still correctly inserted in the bus ;-) Very disturbing issue anyway :-/ I also started to get it around October/November last year.

I also have the feeling that streaming videos, like YouTube in Chrome, tend to trigger this issue.
I was also in a multi-monitor configuration; a few days ago I disconnected the second monitor, but I still just got a black screen.

On a forum I found a proposed solution for this issue: use the command "nvidia-smi -pm 1" to enable persistence mode. I had a few days of peace and just now got it again.

I also tried disabling DPMS as a desperate measure, but it did not help.

I also still have a few options about TwinView and metamodes in my xorg.conf, which I may comment out if the GPU still fails.
For now I've just disabled Vsync in CCSM and I am waiting to see if it reappears...

As for intensive OpenGL apps, I only play Minecraft, and I can normally play for hours. But on Friday I could not play 2 minutes without crashing, and I had just booted the PC. I will keep an eye on temperatures, though.

Anyway, as you can see in my log below, my PC had been up for only about 800 seconds when it happened. A YouTube video was playing and Minecraft was at the Mojang logo screen. Nothing in the logs, and suddenly the GPU is gone!

-------------------------------
Apr 18 18:39:21 dubu rsyslogd: [origin software="rsyslogd" swVersion="5.8.6" x-pid="1000" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Apr 18 18:39:46 dubu anacron[1294]: Job `cron.daily' terminated
Apr 18 18:39:46 dubu anacron[1294]: Normal exit (1 job run)
Apr 18 18:42:38 dubu kernel: [ 837.113335] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Apr 18 18:42:38 dubu kernel: [ 837.113386] NVRM: GPU at 0000:01:00.0 has fallen off the bus.

lynxn0t (lynxn0t) wrote :

I think for me it was a temperature problem. I made a little script to trace my GPU temperature with the nvidia-smi tool, and as soon as it gets to around 110°C the GPU shuts down.
I did an extensive cleanup of my case and also set all the fans to max RPM (I have a fan controller on my case); now I can watch streaming videos full screen at around 60-62°C.
I understood that the GPU was shutting down because when I switched to "Plug&Play OS" in my BIOS, I found some new lines in the logs:
----------------------------------------------------------------------
Apr 23 20:53:33 dubu kernel: [12090.960976] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Apr 23 20:53:33 dubu kernel: [12090.960986] NVRM: os_pci_init_handle: invalid context!
Apr 23 20:53:33 dubu kernel: [12090.960988] NVRM: os_pci_init_handle: invalid context!
Apr 23 20:53:33 dubu kernel: [12090.961017] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Apr 23 20:53:33 dubu kernel: [12090.961021] NVRM: os_pci_init_handle: invalid context!
Apr 23 20:53:33 dubu kernel: [12090.961023] NVRM: os_pci_init_handle: invalid context!
----------------------------------------------------------------------

Thanks to Tye for pointing me in the right direction ;-)
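
The tracing script mentioned in this comment was not posted. A rough sketch of what such a tracer might look like follows; it is not the commenter's script. It assumes an nvidia-smi build that supports `--query-gpu` (older driver releases may not), and the names `read_gpu_temp` and `trace`, as well as the 100°C warning level, are illustrative.

```python
import subprocess
import time

# Assumption: the installed nvidia-smi understands --query-gpu.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"]


def read_gpu_temp():
    """Return the first GPU's temperature in °C as reported by nvidia-smi."""
    output = subprocess.check_output(QUERY, text=True)
    return int(output.splitlines()[0].strip())


def trace(read=read_gpu_temp, samples=5, interval=1.0, warn_at=100, log=print):
    """Poll `read` `samples` times, log each value, and return the readings."""
    readings = []
    for index in range(samples):
        temp = read()
        readings.append(temp)
        flag = " <-- hot" if temp >= warn_at else ""
        log(f"{time.strftime('%H:%M:%S')} GPU {temp} C{flag}")
        if index < samples - 1:
            time.sleep(interval)
    return readings
```

Passing the reader and logger as parameters keeps the loop usable without a GPU present, for example when replaying recorded temperatures.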

Valeriy (tverdohleb) wrote :

I ran 10×glxgears while monitoring the GPU temperature. At 64-65°C the GPU fell off the bus. That is abnormal, as my GF 8500 can operate at up to 100°C without trouble, as it did before. So overheating is not the cause in my case, and the problem seems to be deeper.

Larry Tate (cathect) wrote :

Persists in 12.04....

krab1k (racek-t) wrote :

Hi, I have the same problem using an Nvidia Go 7300 with the latest drivers in precise. The relevant part of the dmesg output follows:

[ 435.908958] NVRM: os_pci_init_handle: invalid context!
[ 435.908969] NVRM: os_pci_init_handle: invalid context!
[ 435.908988] NVRM: os_map_kernel_space: can't map 0xc0000000, invalid context!
[ 435.909017] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[ 435.909031] NVRM: os_pci_init_handle: invalid context!
[ 435.909036] NVRM: os_pci_init_handle: invalid context!
[ 435.909045] NVRM: os_map_kernel_space: can't map 0xc0000000, invalid context!
[ 436.944057] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 436.944065] Pid: 1241, comm: Xorg Tainted: P C O 3.2.0-23-generic #36-Ubuntu
[ 436.944068] Call Trace:
[ 436.944077] [<c1561d5f>] ? printk+0x2d/0x2f
[ 436.944083] [<c10b1289>] __report_bad_irq+0x29/0xd0
[ 436.944087] [<c107ad84>] ? tick_handle_oneshot_broadcast+0xf4/0x100
[ 436.944091] [<c10b14e4>] note_interrupt+0x104/0x150
[ 436.944095] [<c10af3ae>] handle_irq_event_percpu+0x9e/0x200
[ 436.944100] [<c1027378>] ? default_spin_lock_flags+0x8/0x10
[ 436.944104] [<c1576d2d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[ 436.944107] [<c10af54b>] handle_irq_event+0x3b/0x60
[ 436.944111] [<c10b1cf0>] ? unmask_irq+0x30/0x30
[ 436.944115] [<c10b1d3e>] handle_fasteoi_irq+0x4e/0xd0
[ 436.944117] <IRQ> [<c157e432>] ? do_IRQ+0x42/0xc0
[ 436.944124] [<c10b5c0d>] ? rcu_irq_exit+0xd/0x10
[ 436.944128] [<c105218c>] ? irq_exit+0x3c/0xa0
[ 436.944132] [<c157e509>] ? smp_apic_timer_interrupt+0x59/0x88
[ 436.944135] [<c157e370>] ? common_interrupt+0x30/0x38
[ 436.944139] [<c1459caf>] ? acpi_pm_read+0xf/0x20
[ 436.944144] [<c1074204>] ? getnstimeofday+0x54/0x120
[ 436.944148] [<c1074316>] ? do_gettimeofday+0x16/0x40
[ 436.944152] [<c10508e3>] ? sys_gettimeofday+0x23/0x70
[ 436.944156] [<c1576ed4>] ? syscall_call+0x7/0xb
[ 436.944158] handlers:
[ 436.944336] [<f9b4b050>] nv_kern_isr
[ 436.944338] Disabling IRQ #16

MvW (2nv2u) wrote :

This never happened to me until 12.04. Combined with this bug:
https://bugs.launchpad.net/ubuntu/+bug/980519
it makes the new LTS unusable!

Tom Robinson (terobin) wrote :

I have this problem.

I'm using Ubuntu 12.04 64bit and NVIDIA Driver Version: 295.40

I open Minecraft and am able to apply for around 5 minutes and then everything goes black. The problem doesn't seem to affect me at any other time, only when playing Minecraft.

Tom Robinson (terobin) wrote :

Sorry, that should say 'play' not 'apply'.

Changed in unity:
status: Triaged → Invalid
summary: - Monitor(s) black out and session freezes. "NVRM: GPU at 0000:01:00.0
- has fallen off the bus."
+ [NVIDIA] Monitor(s) black out and session freezes. "NVRM: GPU at
+ 0000:01:00.0 has fallen off the bus."
no longer affects: unity
no longer affects: unity (Ubuntu)
Larry Tate (cathect) wrote :

Has anyone updated from the x-swat PPA to see if that makes a difference?

Tom Robinson (terobin) wrote :

If that is these instructions:

"sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo apt-get update
sudo apt-get install nvidia-current"

(found at http://mygeekopinions.blogspot.co.uk/2011/06/how-to-install-nvidia-2750907-driver-in.html)

Then I tried that but no luck.

Bernhard (xro) wrote :

I manually downgraded to NVIDIA 275.43 and the problem is gone, so it appears to be a problem with recent Nvidia drivers.

Mikael Karon (mikael-karon) wrote :

@Bernhard, exactly how did you do that? Is there a good PPA with that version around for precise, or did you download and install it from NVIDIA's site?

Sergio Callegari (callegar) wrote :

GPU falls off the bus also on
Ubuntu 12.04 64 bit + Dell Precision T5400 (Nvidia Quadro FX 570).
Nvidia drivers are the latest 302.17.

All this seems to be very well known on the NV mailing lists and forums, which show similar bug reports also for 295.x drivers and other distros.

In my case, the screen does not get black, but merely freezes.
The machine does not hang and remains reachable via ssh.

In my case it does not seem to be temperature related (machine is in an air conditioned room, with little load, almost no graphical load and monitor reports the gpu at 57 °C).

Thomas Eschenbacher (thomas-eschenbacher) wrote :

the same here:
nVidia Corporation NV44 [GeForce 7100 GS] (rev a1)
driver 302.17
on Gentoo Linux / 686 (32 bit) / Kernel 3.4.4

syslog is flooded with "NVRM: os_pci_init_handle: invalid context!"
and after a while I get
"NVRM: GPU at 0000:01:00.0 has fallen off the bus."
with same symptoms as described above...

(This forced me to revert back to 290.10, which works rock solid over months!)

Tye (tye3ow) wrote :

still present in 12.04 (64bit) with the 295.49 (X-Updates) drivers on the GeForce 210. I don't know if it is a heat issue or if heat generation is just a symptom of whatever issue is causing the problems. I sure would like for the problems to go away though, as this is the second release with the same problems using either the default or the X-Updates drivers and both releases use the 3.x series of kernels. 11.04 is the last release that I could even think about using OpenGL in with any sort of reliability which is very disappointing.

Tye (tye3ow) wrote :

oh yes, update: the screen no longer goes black, but instead as Sergio pointed out, it freezes which I suppose might sound like progress, except that the system reboots itself after a few seconds and the problem still renders OpenGL useless.

Tye (tye3ow) wrote :

I'm not sure if this is a duplicate of this bug or not:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers/+bug/973096

the same indications of both reproduction and anecdotal fixes are present there. I might give the upgraded kernel a try but I loathe using packages so far ahead of the repos and PPAs.

Tye (tye3ow) wrote :

apparently not a duplicate as it is only for logouts, not system freezes. I guess we'll see if we garner any attention or intent to actually fix this one now that the other one is actually resolved (although, given the history on this one, I doubt it)

lately my system will not always black out and then restart; sometimes the video simply freezes. other times it will black out and after a minute or so it will reset itself. it's all very curious.

Larry Tate (cathect) wrote :

Does anyone know if this bug is resolved in 12.10???

Thomas Eschenbacher (thomas-eschenbacher) wrote :

As I am using Gentoo Linux I have no idea which version of the driver is in Ubuntu, but after having the same problems for a long time with many 30x.xx nvidia driver versions, I tried an upgrade from Nvidia driver 290.10 (which still worked fine) to 304.60 - which seems to work fine again, I am using that version for several weeks now without any problems.

nVidia Corporation NV44 [GeForce 7100 GS] (rev a1)
driver 304.60
on Gentoo Linux / i686 (32 bit) / Kernel 3.6.1

thedanyes (thedanyes) wrote :

I encountered a similar problem on my desktop system running 11.04 with the nVidia proprietary driver. Using Intel DH55TC motherboard and nVidia GTS 450 OR nVidia 9800GT (tried both). For me, the issue was resolved through a motherboard firmware update.

https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers/+bug/733374/comments/73

Tye (tye3ow) wrote :

@thedanyes, I'm intrigued, can you play full screen openGL apps for long periods without crashes? some good tests are Marble Arena 2 or XBMC in full screen playing high-res video. if so, do you have links to the two motherboard firmware versions, or at least version numbers of both?

update for me: still experience crashes in high-demand openGL apps

Tom Robinson (terobin) wrote :

Hi, I think the problem may be due to overheating. I've added an additional fan pointing directly at the graphics card and made sure my computer has space to vent heat, and I've not had any problems since. I did this around 7 hours ago and the computer has been running since, letting me play games and stream video at the same time on multiple monitors.

Before doing this I was experiencing the crashing on both Linux and Windows 8 running on this pc.

Luis Alvarado (luisalvarado) wrote :

No overheating on my part. I have tested the 560 ti with the HDMI cable, with a DVI cable and with a DVI-VGA. All cases have same problem.

I have even tried using different versions, starting from the originals that come with 12.10 to the ones in X-Swat, to the ones in Xorg Edgers. Basically from 304.xx to 313.xx. In all cases I used 64 bit. I even tested 13.04.

My hardware specs are:

Intel Core i7 2600
16 GB RAM
Intel DZ68DB Motherboard
Intel 128GB SSD
Nvidia 560 TI

This was working fine until 2 weeks ago. Then this precise problem appeared out of nowhere. I tested every driver version that came with 12.10 and 13.04 or was in the Swat or Edgers PPA. The only one that works is nouveau, but that is just throwing in the towel, since the solution should let me actually use the card's video performance.

gmhawash (gmhawash) wrote :

I have had this issue as well, with Linux Mint 14, Ubuntu 12.10 and every flavor I tried. It is definitely heat related but still a problem with the driver. The fan on my Nvidia GTX 590 had been running high the last few weeks and I've had issues with hard drives lately, so when I reinstalled, I started seeing the problem. I finally took the NVIDIA card out and vacuumed the junk out of it.
Now the fan is not on, thankfully, and the system does not crash.

However, the fact that this crash started with a reinstall tells me that the latest drivers are buggy. The system was working fine before, even when the fan was on and the GPU was getting hot. Installing the new system and drivers exposed this problem, and heat caused the system (Nvidia driver) to fail.

Hope that helps,

Revision history for this message
thedanyes (thedanyes) wrote :

@Tye Sorry I didn't see your inquiry for so long. I can't say for sure which revision I had when I was seeing the problems, but I first noticed they were fixed with the latest Version 0048. There were many entries in the firmware development changelog that indicated problems related to video and PCIe devices with different revisions.
http://downloadmirror.intel.com/20725/eng/TC_0048_ReleaseNotes.pdf

Revision history for this message
Stephane Epardaud (stef-inforealm) wrote :

I have the same issue. Sometimes my system will lock up several times per day. I have driver 304.43 with latest Ubuntu. I never had that issue before the upgrade to the latest Ubuntu.

Revision history for this message
Nukeador (nukeador) wrote :

Same problem here, Ubuntu 12.10 64b, macbook pro 6,2, nvidia GT330M

I've tested nvidia-current, nvidia-current-updates and the two nvidia-experimental drivers.

Same problem, blank screen+full system freeze.

I've tried both solutions described here with no success: http://askubuntu.com/questions/235760/unity-does-not-appear-after-installing-proprietary-nvidia-drivers-gpu-has-falle

I've increased the fan speed to see if this could be a heat problem.

piotr zimoch (ebytyes)
Changed in nvidia-graphics-drivers (Ubuntu):
status: Triaged → New
status: New → Incomplete
status: Incomplete → Opinion
Revision history for this message
cryptor (cryptor) wrote :

I started having this issue [both monitors black out, no mouse or keyboard input, still able to SSH in over the
network] after upgrading from 11.04 to 12.04. Generally, when I SSH in after the freeze/hang, Xorg is taking
100% CPU and I see "EQ overflowing" in the Xorg.0.log.
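
For anyone checking for the same symptom over SSH, here is a minimal sketch that lists where the "EQ overflowing" text appears in a saved X log. The function name is hypothetical, and it only does a plain substring match on the phrase quoted above; it makes no assumption about the rest of the log line.

```python
def find_eq_overflows(lines, marker="EQ overflowing"):
    """Return 1-based line numbers of log lines that contain the marker text."""
    return [number for number, line in enumerate(lines, start=1) if marker in line]
```

It could be called as find_eq_overflows(open("/var/log/Xorg.0.log", errors="replace")). That path is the usual default location and is an assumption here, since only the file name is given above.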

My setup looks like this.

Ubuntu 12.04
Linux box 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
NVIDIA Driver Version: 304.88 [Additional Drivers: version current-updates]
Quadro FX 2800M (GPU 0)
Two displays: ViewSonic VX2439 Series (DFP-1), LGD (DFP-0)

I believe that my issue is related to the following in /var/log/syslog (dmesg):

[99101.294734] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
[99101.294742] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
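
A minimal sketch for pulling these messages out of a saved log, in case it helps others compare timings. The function name and the regular expression are assumptions based only on the two lines quoted above; other driver versions may word the message differently.

```python
import re

# Matches kernel log lines shaped like the ones quoted above, e.g.
#   [99101.294734] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
# The pattern is an assumption derived from those two lines only.
BUS_DROP = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"NVRM: GPU at (?P<pci>[0-9a-fA-F:.]+) has fallen off the bus"
)


def find_bus_drops(lines):
    """Return a list of (seconds_since_boot, pci_address) for matching lines."""
    hits = []
    for line in lines:
        match = BUS_DROP.search(line)
        if match:
            hits.append((float(match.group("uptime")), match.group("pci")))
    return hits
```

It could be run as find_bus_drops(open("/var/log/syslog", errors="replace")), or on the output of dmesg saved to a file. Reading the log may need elevated privileges, and lines with a syslog date prefix still match because the pattern is searched for anywhere in the line.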

There seem to be lots of threads concerning this error message with NVIDIA hardware, but no universal solution.

http://www.nvnews.net/vbulletin/showthread.php?p=2571522

https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/1
https://devtalk.nvidia.com/default/topic/537302/linux/both-screens-black-xorg-at-100-cpu-overflow-errors-in-xorg-0-log-nvidia-driver-310-14-quadro-fx/

http://www.cyberciti.biz/faq/debian-ubuntu-rhel-fedora-linux-nvidia-nvrm-gpu-fallen-off-bus/

http://forums.gentoo.org/viewtopic-t-925156-postdays-0-postorder-asc-start-25.html

I have tried enabling TwinView in xorg.conf, à la palewire in #45, but that did not seem to make any difference.
I'm pretty sure the external monitor, or more likely its GPU, is involved because I do not have the problem when
traveling without it.

I have no reliable way to reproduce the issue. However, it does seem to happen often when I am scrolling a
window, such as in a browser or paging in a terminal. Several times the screens have gone blank while I
still have a finger on the mouse scrollwheel. Of course, this will not reliably produce a problem.

WBB

Revision history for this message
Tye (tye3ow) wrote :

It might be an issue with the DVI/HDMI interface; I don't think I had an issue with the VGA port.

It's running headless right now so I can't test it.

Revision history for this message
cryptor (cryptor) wrote :

@Tye

> issue with the DVI/HDMI interface

Are you referring to the physical connection? If so, I doubt that is it. Nothing about the physical
port changed when I upgraded from 11.04 to 12.04. However, just like you, I did not see the
problem on 11.04. I was definitely using the external monitor on 11.04.

BTW, I am still getting both screens black. One of your posts said that you were now getting
a frozen image, but no longer switching to black. So, that is a difference now.

Revision history for this message
cryptor (cryptor) wrote :

I seem to have stumbled on a fairly quick way to generate or reproduce the

"NVRM: GPU at 0000:01:00.0 has fallen off the bus."

error on my system.

I login to Ubuntu (either 3D/compiz or Ubuntu 2D) and then open a "Gnome Terminal".
On my system, this terminal has 50 lines.

$ echo $LINES
50

This terminal can be located either on my X screen primary display (external DFP-1) or on my
non-primary display (internal LGD, DFP-0).

Now I create some listings, usually about 1000 lines or so.

$ ls -alt ~
$ ls -alt /
$ ls -alt /usr/lib

At this point, I have a scrollbar widget that can pan back and forth through the listings.
If I scroll rapidly back and forth through these listings (by dragging the scroller widget vigorously up and down) in
the gnome terminal window for 3 or 4 minutes, I will always get the black out and frozen X session.
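
If the ls listings on another machine are too short, a sketch like the following could fill the scrollback with a known number of lines instead. The function name and the line format are hypothetical, and 1000 is taken from the "about 1000 lines" figure above. It only produces the text; the rapid scrolling still has to be done by hand.

```python
def make_listing(count=1000):
    """Build `count` numbered lines to fill the terminal scrollback."""
    return [f"line {number:04d}" for number in range(1, count + 1)]


if __name__ == "__main__":
    # Print the lines so the terminal has plenty of scrollback to drag through.
    print("\n".join(make_listing()))
```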

BTW, I have enabled persistence as recommended elsewhere and I have been using Ubuntu 2D. Still the problem
persists on my M6500 laptop with Quadro FX 2800M GPU.

Ubuntu 12.04
Linux box 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
NVIDIA Driver Version: 304.88 [Additional Drivers: version current-updates]
Quadro FX 2800M (GPU 0)
Two displays: ViewSonic VX2439 Series (DFP-1), LGD (DFP-0)
