10de:061f [Dell Precision M6500] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

Bug #1258231 reported by MvW on 2013-12-05
This bug affects 6 people
Affects: nvidia-graphics-drivers-331 (Ubuntu)
Importance: Low
Assigned to: Unassigned

Bug Description

About once an hour at least, my computer either locks up (ex. black screen), requiring me to relaunch the window manager and losing all open X applications, or allows me to move the mouse but won't let me click anything. Sound continues to play just fine, as do all other background tasks. Switching to TTY1 and back to TTY7 will mostly recover from the issue by relaunching the shell. I have had this issue since Ubuntu 12.10. Syslog entries:
Dec 5 17:44:03 Europa kernel: [ 8187.092473] NVRM: os_pci_init_handle: invalid context!
Dec 5 17:44:03 Europa kernel: [ 8187.092477] NVRM: Xid (0000:01:00): 8, Channel 00000003
Dec 5 17:44:05 Europa kernel: [ 8189.091441] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
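
For anyone comparing their own logs against this report, the Xid events can be pulled out of syslog with a few lines of Python. This is only a minimal sketch: the regular expression is an assumption based on the line formats quoted in this report, and the names XID_RE and find_xids are made up for illustration.

```python
import re

# Assumed pattern, modelled on the "NVRM: Xid (...)" lines quoted in this report.
XID_RE = re.compile(r"NVRM: Xid \((?P<pci>[0-9a-fA-F:.]+)\): (?P<code>\d+),")


def find_xids(lines):
    """Return a (pci_address, xid_code) tuple for every NVRM Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append((match.group("pci"), int(match.group("code"))))
    return events


sample = [
    "Dec 5 17:44:03 Europa kernel: [ 8187.092473] NVRM: os_pci_init_handle: invalid context!",
    "Dec 5 17:44:03 Europa kernel: [ 8187.092477] NVRM: Xid (0000:01:00): 8, Channel 00000003",
    "Dec 5 17:44:05 Europa kernel: [ 8189.091441] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
]

print(find_xids(sample))  # [('0000:01:00', 8)]
```

To run it against a real log, pass an open file object (for example open("/var/log/syslog")) instead of the sample list.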

WORKAROUND: running Xubuntu 12.04.3 with nvidia 331.20.

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image-3.11.0-14-generic 3.11.0-14.21
ProcVersionSignature: Ubuntu 3.11.0-14.21-generic 3.11.7
Uname: Linux 3.11.0-14-generic x86_64
NonfreeKernelModules: nvidia wl
ApportVersion: 2.12.5-0ubuntu2.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: mvanworkum 2639 F.... pulseaudio
 /dev/snd/controlC0: mvanworkum 2639 F.... pulseaudio
 /dev/snd/pcmC0D0p: mvanworkum 2639 F...m pulseaudio
Date: Thu Dec 5 18:14:26 2013
HibernationDevice: RESUME=UUID=d4c4f4ad-64b5-4110-b360-41660844584e
InstallationDate: Installed on 2012-07-03 (519 days ago)
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
MachineType: Dell Inc. Precision M6500
MarkForUpload: True
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.11.0-14-generic root=UUID=6600ec3d-d9d2-4333-a5ae-5849e7f58e20 ro quiet splash
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-14-generic N/A
 linux-backports-modules-3.11.0-14-generic N/A
 linux-firmware 1.116
SourcePackage: linux
UpgradeStatus: Upgraded to saucy on 2013-09-06 (90 days ago)
dmi.bios.date: 06/04/2013
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A10
dmi.board.name: 0R1203
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA10:bd06/04/2013:svnDellInc.:pnPrecisionM6500:pvr:rvnDellInc.:rn0R1203:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Precision M6500
dmi.sys.vendor: Dell Inc.

MvW (2nv2u) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.12 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13-rc2-trusty/

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
MvW (2nv2u) wrote :

I already tried 3.12 from the mainline before posting this bug; it didn't make any difference, so I reverted to the packaged version. I forgot to add the tag though.

tags: added: kernel-bug-exists-upstream
MvW (2nv2u) wrote :

I'm not supposed to mark the bug confirmed since I'm the one who posted it.
I didn't test the 3.13 kernel, however; the packages from mainline aren't supported in this distribution.

tags: added: latest-bios-a10
tags: added: quantal

M. van Workum, did this problem not occur in a release prior to Quantal?

tags: added: needs-upstream-testing regression-potential
David Anderson (davea42) wrote :

12 core machine running 13.10, Saucy Salamander.
Never saw these problems before 13.10 .

3.11.0-14-generic #21-Ubuntu SMP X86_64
319.32 nvidia gets kernel error
319.60 nvidia does not (so far) get kernel error.

26 November 9AM: software&updates->additional drivers->
       switched to the nvidia 319.60 'proprietary' driver (from
       319.32 'proprietary tested'). Rebooted.

Turned on Seti@home gpu processing and seti@home did a number of GPU
runs without difficulty. That is 10 days, no problem.

5 December 4PM: as a test, switched back to the 319.32 nvidia
       driver. Rebooted.

7 December 4:16AM: Got kernel warnings, the first is:
       /var/log/syslog:Dec 7 04:16:34 Dseti3 kernel: [129416.201395] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
      4800 lines like that. Including a few 'CPU#0 stuck'.

       So that is just 36 hours till a problem arose.
       10AM: Switched back to 319.60 driver. Rebooted.
       Resuming seti@home

This test may mean nothing, but 319.60 does let me do useful GPU work, so far.
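
The bracketed number in a kernel log line is the time in seconds since boot, so the "36 hours" figure can be checked directly from the first warning. A minimal sketch, assuming the syslog layout shown above; STAMP_RE and uptime_hours are illustrative names only.

```python
import re

# Assumed layout: "... kernel: [<seconds since boot>] message"
STAMP_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def uptime_hours(line):
    """Return the kernel timestamp of a syslog line in hours, or None."""
    match = STAMP_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)) / 3600.0


first_warning = (
    "Dec 7 04:16:34 Dseti3 kernel: [129416.201395] NVRM: os_schedule: "
    "Attempted to yield the CPU while in atomic or interrupt context"
)

print(f"{uptime_hours(first_warning):.1f} hours after boot")  # 35.9 hours after boot
```

That agrees with the reboot at 4PM on 5 December and the first warning at 4:16AM on 7 December.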

David Anderson, so your hardware may be tracked, could you please file a new report by executing the following in a terminal while booted into a Ubuntu repository kernel (not a mainline one) via:
ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

MvW (2nv2u) wrote :

Hi Christopher, I bought this computer over 3 years ago and have used nothing but Ubuntu on it. As far as I can remember the problems started with the Quantal release (12.10). Just random lockups, sometimes happening more frequently. I've tried clocking down my GPU (Coolbits) to the lowest possible clock, but it still happens.
At first I thought it was somehow temperature related, since both the GPU and CPU sometimes reach 80 degrees Celsius, but after some testing it also happens when they're around 50 degrees. It does seem to be related to heavy CPU usage, which seems to trigger the situation more often and would explain why I suspected the temperature at first.
I'm still wondering whether it is a hardware malfunction. Is that a probable cause for these kinds of errors?

M. van Workum, for regression testing purposes, could you please test for this in Precise via http://releases.ubuntu.com/precise/ , and just make a comment if reproducible?

MvW (2nv2u) wrote :

I've been running 12.04.3 for a couple of days now and unfortunately the error came up again.
It does happen less frequently though; this is the first time, whereas before I only managed to get by for a couple of hours at best.

Dec 25 14:22:31 Europa kernel: [ 4824.196812] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196817] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196819] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196821] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196823] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196825] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196828] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196830] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196832] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196834] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196835] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196838] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196840] NVRM: os_pci_init_handle: invalid context!
Dec 25 14:22:31 Europa kernel: [ 4824.196848] NVRM: GPU at 0000:01:00: GPU-9e2ddf6e-47c2-4331-d30f-b7adedcd90fc
Dec 25 14:22:31 Europa kernel: [ 4824.196851] NVRM: Xid (0000:01:00): 8, Channel 00000003
Dec 25 14:22:33 Europa kernel: [ 4826.195819] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

It did however recover on its own this time. The latest nvidia driver (331.20) from the ubuntu proposed ppa is installed.

description: updated

M. van Workum, thank you for testing 12.04.03. Just to clarify, you were testing https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-331 ?

MvW (2nv2u) wrote :

@Christopher, yes, it's that package.
I'm running the other older supplied driver (304.116) now, did encounter a freeze of the desktop (similar to this bug), but there is nothing written in syslog to back it up. Could be the same issue.

MvW (2nv2u) wrote :

Just encountered this with the 304 driver:

Dec 27 16:42:49 Europa kernel: [ 4777.614309] NVRM: Xid (0000:01:00): 8, Channel 00000001
Dec 27 16:42:51 Europa kernel: [ 4779.587058] Watchdog[2714]: segfault at 0 ip 00007ffc5bb91f7e sp 00007ffc4d0fb6c0 error 6 in libcontent.so[7ffc5b469000+f14000]
Dec 27 16:42:51 Europa kernel: [ 4779.613147] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

tags: added: raring

M. van Workum, thank you for testing Precise. Given Ubuntu 12.04.03 is essentially Raring (it has the Precise user-space, but Raring's 3.8.x kernel and x.org stack), this helps in adding a data point. However, would you mind testing the earlier Precise via http://xubuntu.org/getxubuntu/? This would test the 3.2.x series kernel and original x.org stack, which would help greatly in identifying the root cause.

summary: - NVRM: os_schedule: Attempted to yield the CPU while in atomic or
- interrupt context
+ 10de:061f [Dell Precision M6500] NVRM: os_schedule: Attempted to yield
+ the CPU while in atomic or interrupt context
MvW (2nv2u) wrote :

@Christopher, running Xubuntu 12.04.3 with the nvidia 331.20 for quite some time now, haven't encountered any issues yet!

Uname: 3.2.0-58-generic #88-Ubuntu SMP Tue Dec 3 17:37:58 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

description: updated
affects: linux (Ubuntu) → nvidia-graphics-drivers-331 (Ubuntu)
Changed in nvidia-graphics-drivers-331 (Ubuntu):
importance: High → Low
status: Incomplete → New
MvW (2nv2u) wrote :

Running quite well with kernel 3.2 for a week now and have switched to 3.4 (mainline) 2 days ago, still without any issues.

Uname: 3.4.77-030477-generic #201401151835 SMP Wed Jan 15 23:36:14 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

MvW, thank you for your comment. Please do not mark this report a duplicate of another one, or vice versa. As well, please do not suggest in other people's report (ex. 986831) they should mark themselves affected, as this wouldn't be helpful. Instead, they should file a new report, as previously documented to you.

Despite this, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect 1258231

Please ensure you have xdiagnose installed, and that you click the Yes button for attaching additional debugging information. As well, please note given that the information from the prior release is already available, doing this on a release prior to the development one would not be helpful.

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

description: updated
Changed in nvidia-graphics-drivers-331 (Ubuntu):
status: New → Incomplete
MvW (2nv2u) wrote :

@ Christopher, sorry, but it seems to me that the report will get more attention when it's more generic and this problem has been an issue on multiple platforms for quite some time.

I tested 14.04 alpha however and the problem persists. I tried the apport command, it says: Package nvidia-graphics-drivers-331 not installed and no hook available, ignoring.
A graphical popup with the title "Updating problem report" shows "No additional information collected."

The nVidia driver 331.38 is installed but the package is: nvidia-331

Syslog shows just the same as before:

Jan 30 02:42:28 Ganymede kernel: [15210.984768] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984784] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984787] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984790] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984793] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984796] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984799] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984802] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984804] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984807] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984810] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984813] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984816] NVRM: os_pci_init_handle: invalid context!
Jan 30 02:42:28 Ganymede kernel: [15210.984830] NVRM: GPU at 0000:01:00: GPU-9e2ddf6e-47c2-4331-d30f-b7adedcd90fc
Jan 30 02:42:28 Ganymede kernel: [15210.984836] NVRM: Xid (0000:01:00): 8, Channel 00000004
Jan 30 02:42:30 Ganymede kernel: [15212.983871] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context

MvW (2nv2u) wrote :

The worst thing happened, running 3.2.0-58 I encountered the same issue after running it for almost 2 weeks without problems.

Feb 1 00:41:24 Europa kernel: [ 1682.538918] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Feb 1 00:41:26 Europa kernel: [ 1684.538213] sched: RT throttling activated
Feb 1 00:42:39 Europa kernel: [ 1757.527945] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527949] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527952] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527954] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527956] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527958] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527961] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527963] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527965] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527967] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527969] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527972] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527974] NVRM: os_pci_init_handle: invalid context!
Feb 1 00:42:39 Europa kernel: [ 1757.527982] NVRM: GPU at 0000:01:00: GPU-9e2ddf6e-47c2-4331-d30f-b7adedcd90fc
Feb 1 00:42:39 Europa kernel: [ 1757.527986] NVRM: Xid (0000:01:00): 13, 0001 00000000 00000000 00000000 00000000 00000001
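
Because the same message repeats many times per incident, a per-message tally makes excerpts like the one above easier to compare. This is a rough sketch only; the pattern and the names MSG_RE and tally_nvrm are assumptions for illustration, not part of any NVIDIA or Ubuntu tool.

```python
import re
from collections import Counter

# Assumed: everything after "NVRM: " on a line is the message text.
MSG_RE = re.compile(r"NVRM: (.*)$")


def tally_nvrm(lines):
    """Count how often each distinct NVRM message text occurs."""
    counts = Counter()
    for line in lines:
        match = MSG_RE.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


excerpt = [
    "Feb 1 00:41:24 Europa kernel: [ 1682.538918] NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context",
    "Feb 1 00:41:26 Europa kernel: [ 1684.538213] sched: RT throttling activated",
    "Feb 1 00:42:39 Europa kernel: [ 1757.527945] NVRM: os_pci_init_handle: invalid context!",
    "Feb 1 00:42:39 Europa kernel: [ 1757.527949] NVRM: os_pci_init_handle: invalid context!",
]

for message, count in tally_nvrm(excerpt).most_common():
    print(count, message)
```

Lines without an NVRM prefix, such as the RT throttling notice, are skipped.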

tags: added: trusty
Changed in nvidia-graphics-drivers-331 (Ubuntu):
status: Incomplete → New
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers-331 (Ubuntu):
status: New → Confirmed

I am hitting a similar problem:

oblong@displaywall-left:~$ more /etc/issue
Ubuntu 12.04.3 LTS \n \l

oblong@displaywall-left:~$ uname -a
Linux displaywall-left 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

oblong@displaywall-left:~$ grep Driver /var/log/Xorg.0.log
[ 14.980] (II) NVIDIA dlloader X Driver 331.38 Wed Jan 8 18:51:00 PST 2014

The machine has three nVidia K5000 cards

When I exit my application, X freezes and the following messages are dumped into /var/log/syslog:

Mar 11 18:51:46 displaywall-left kernel: [ 9283.034123] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034132] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034135] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034138] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034141] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034144] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034147] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034150] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034153] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034155] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034158] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034161] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034182] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034190] NVRM: GPU at 0000:02:00: GPU-d94bdba3-b53b-db31-e87c-b284bea2b29c
Mar 11 18:51:46 displaywall-left kernel: [ 9283.034196] NVRM: Xid (0000:02:00): 31, Ch 00000003, engmask 00000111, intr 10000000
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035525] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035530] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035533] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035536] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035539] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035542] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035544] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035547] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035550] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035553] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-left kernel: [ 9283.035556] NVRM: os_pci_init_handle: invalid context!
Mar 11 18:51:46 displaywall-...


Kevin Mullican, thank you for your comment. So your hardware and problem may be tracked, could you please file a new report by executing the following in a terminal:
ubuntu-bug xorg

Please ensure you have xdiagnose installed, and that you click the Yes button for attaching additional debugging information.

For more on this, please see the official Ubuntu documentation:
Ubuntu X.Org Team, Ubuntu Bug Control, and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Please note, not filing a new report will delay your problem being addressed as quickly as possible.

Thank you for your understanding.

MvW (2nv2u) wrote :

For what it's worth (since nobody seems to be bothered doing something with it), this issue still happens with the daily build of trusty tahr (14.04).

MvW (2nv2u) wrote :

Another update, 337.19 still suffers from this issue.

Have been running the nouveau driver without any problems on the laptop itself.
Choppy 3D and not being able to play games is a bummer, but not getting the external monitors on the docking station to work makes it a no go. I still have to rely on the proprietary nvidia driver unfortunately.

MvW (2nv2u) wrote :

For what it's worth, I still encounter this error, although far less frequently.
Running 15.10 now with the 340.96 driver.
