Loading the nvidia driver causes kernel oops in maverick, natty

Bug #607399 reported by Anders Kaseorg on 2010-07-19
This bug affects 8 people
Affects                            Status         Importance  Assigned to  Milestone
nvidia-graphics-drivers (Ubuntu)   Fix Released   Medium      Unassigned
Maverick                           Won't Fix      Medium      Unassigned

Bug Description

I get this kernel oops every time I start X with the nvidia driver enabled. I think this started happening with the upgrade from 2.6.35-5-generic to 2.6.35-6-generic.

ProblemType: KernelOops
DistroRelease: Ubuntu 10.10
Package: linux-image-2.6.35-9-generic 2.6.35-9.14
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.35-9.14-generic 2.6.35-rc5
Uname: Linux 2.6.35-9-generic x86_64
NonfreeKernelModules: openafs nvidia wl
AcpiTables: Error: [Errno 13] Permission denied
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Annotation: Your system might become unstable now and might need to be restarted.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: anders 3056 F.... pulseaudio
 /dev/snd/seq: timidity 2191 F.... timidity
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xd0500000 irq 43'
   Mixer name : 'Analog Devices AD1986A'
   Components : 'HDA:11d41986,17aa2066,00100500'
   Controls : 20
   Simple ctrls : 11
Date: Mon Jul 19 14:50:25 2010
EcryptfsInUse: Yes
Failure: oops
HibernationDevice: RESUME=UUID=7d312e76-f5e0-480f-bd85-6cd5aade2e74
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release Candidate amd64 (20091020.3)
MachineType: LENOVO 0768AJU
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.35-9-generic root=/dev/mapper/btree-ubuntu ro single i8042.reset cgroup_disable=memory apparmor=0
RelatedPackageVersions: linux-firmware 1.37
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
Title: fa128ade3
dmi.bios.date: 06/04/07
dmi.bios.vendor: LENOVO
dmi.bios.version: 61ET37WW
dmi.board.name: CAPELL VALLEY(NAPA) CRB
dmi.board.vendor: LENOVO
dmi.board.version: Not Applicable
dmi.chassis.type: 10
dmi.chassis.vendor: No Enclosure
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnLENOVO:bvr61ET37WW:bd06/04/07:svnLENOVO:pn0768AJU:pvr3000N100:rvnLENOVO:rnCAPELLVALLEY(NAPA)CRB:rvrNotApplicable:cvnNoEnclosure:ct10:cvrN/A:
dmi.product.name: 0768AJU
dmi.product.version: 3000 N100
dmi.sys.vendor: LENOVO

== Regression details ==
Discovered in version: linux 2.6.35-6-generic.
Last known good version: linux 2.6.35-5-generic

Anders Kaseorg (andersk) wrote :

This still hasn’t gone away with nvidia-current 256.44-0ubuntu1, nor 256.52-0ubuntu0sarvatt (from xorg-edgers). I can’t use the nvidia driver unless I boot into kernel 2.6.32.

Oops while starting X with nvidia 256.52 on kernel 2.6.35-19-generic:

BUG: unable to handle kernel paging request at ffffffffa111b68d
IP: [<ffffffffa05d7778>] _nv025668rm+0x44/0x176 [nvidia]
PGD 1a2c067 PUD 1a30063 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:05:06.3/local_cpus
CPU 0
Modules linked in: kvm_intel kvm microcode nvidia(P) snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi lib80211_crypt_tkip wl(P) snd_seq_midi_event r852 sm_common snd_seq joydev snd_timer snd_seq_device nand nand_ids nand_ecc pcmcia mtd snd lib80211 btusb coretemp bluetooth yenta_socket pcmcia_rsrc pcmcia_core video output soundcore snd_page_alloc psmouse serio_raw intel_agp lp parport 8139too btrfs zlib_deflate 8139cp crc32c sdhci_pci sdhci led_class firewire_ohci firewire_core crc_itu_t mii libcrc32c

Pid: 1068, comm: Xorg Tainted: P 2.6.35-19-generic #28-Ubuntu CAPELL VALLEY(NAPA) CRB/0768AJU
RIP: 0010:[<ffffffffa05d7778>] [<ffffffffa05d7778>] _nv025668rm+0x44/0x176 [nvidia]
RSP: 0018:ffff88006ef85ab8 EFLAGS: 00010282
RAX: ffff88006bad4000 RBX: ffff88006bbdc000 RCX: 0000000000000001
RDX: ffff88006bad4000 RSI: 0000000000000016 RDI: ffff88006fbbe000
RBP: ffff88006bb0af68 R08: ffff88006e5a0000 R09: ffff88006e16f200
R10: 00000000ffffffff R11: 0000000000000077 R12: ffff88006bbee000
R13: ffff88006fbbe000 R14: ffff88006bad4000 R15: ffff88006fda5400
FS: 00007fb23db89840(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa111b68d CR3: 000000006ed16000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process Xorg (pid: 1068, threadinfo ffff88006ef84000, task ffff88006ef88000)
Stack:
 ffffc9001091e000 ffff88006fbbe000 ffff88006bb0afe8 ffff88006bbee000
<0> ffffc9001091e000 ffffffffa092e692 ffff88007b2cf400 ffff88006fda5400
<0> ffff88006fbbe000 ffffc9001091e000 ffff880037ad4800 ffffffffa092f649
Call Trace:
 [<ffffffffa092e692>] ? _nv002105rm+0x211/0x254 [nvidia]
 [<ffffffffa092f649>] ? _nv002098rm+0x407/0x65b [nvidia]
 [<ffffffffa09350a3>] ? rm_init_adapter+0x83/0xf1 [nvidia]
 [<ffffffffa095234e>] ? nv_kern_open+0x5ae/0x750 [nvidia]
 [<ffffffff8115565a>] ? chrdev_open+0x10a/0x200
 [<ffffffff81155550>] ? chrdev_open+0x0/0x200
 [<ffffffff8114fab5>] ? __dentry_open+0xe5/0x330
 [<ffffffff8125f45f>] ? security_inode_permission+0x1f/0x30
 [<ffffffff8114fe14>] ? nameidata_to_filp+0x54/0x70
 [<ffffffff8115c9c8>] ? finish_open+0xe8/0x1d0
 [<ffffffff8116564f>] ? dput+0xdf/0x1b0
 [<ffffffff8115de26>] ? do_last+0x86/0x460
 [<ffffffff8116015b>] ? do_filp_open+0x21b/0x660
 [<ffffffff8116b96a>] ? alloc_fd+0x10a/0x150
 [<ffffffff8114f859>] ? do_sys_open+0x69/0x170
 [<ffffffff8114f9a0>] ? sys_open+0x20/0x30
 [<ffffffff8100a0f2>] ? system_call_fastpath+0x16/0x1b
Code: 00 ba 00 00 00 00 be 3d 00 00 00 41 ff 55 20 48 89 c3 b9 01 00 00 00 ba 00 00 00 0...
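For anyone comparing their own trace against this one, a minimal sketch (not part of the original report) of pulling the symbol and owning module out of each Call Trace frame. The regular expression assumes the frame layout shown above, and `frames` is a hypothetical helper name.

```python
import re

# One Call Trace frame, e.g.
#   [<ffffffffa09350a3>] ? rm_init_adapter+0x83/0xf1 [nvidia]
# The trailing "[module]" is absent for symbols built into the kernel.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<symbol>[^\s+]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>[^\]]+)\])?"
)


def frames(trace_text):
    """Return a list of (symbol, module) pairs; module is None for core kernel symbols."""
    return [(m.group("symbol"), m.group("module")) for m in FRAME.finditer(trace_text)]


# Three frames copied from the trace above.
sample = """\
 [<ffffffffa09350a3>] ? rm_init_adapter+0x83/0xf1 [nvidia]
 [<ffffffffa095234e>] ? nv_kern_open+0x5ae/0x750 [nvidia]
 [<ffffffff8115565a>] ? chrdev_open+0x10a/0x200
"""
result = frames(sample)
```

Read this way, the trace shows the fault happening inside the nvidia module while `rm_init_adapter` runs, reached from `nv_kern_open` when Xorg opens the device node.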


jsgaarde (jakob-simon-gaarde) wrote :

I have the same problem, and I have tried for 2 days to get it working - I'm also using Geforce GT 230m. Does anyone have a clue on this? It seems like we have a whole series of nvidia laptops that can't currently use Ubuntu. When you install ubuntu on a laptop with this video card you run into not only one newbie killer but 2:
1. Blank screen because of the nouveau-kms module.
2. If you get past that and install the proprietary drivers to get 3D acceleration you run into a kernel oops.
I'm not a hardware-guy so I'm pretty lost when it comes to fixing this.

BTW I had my laptop running with some tweaks on Lucid.

/ Jakob

Anders Kaseorg (andersk) wrote :

Well, I reported the oops to the NVIDIA Linux forum <http://www.nvnews.net/vbulletin/showthread.php?t=155728> with a log generated by nvidia-bug-report.sh as requested, but I haven’t gotten a response. If you’re seeing the same problem, perhaps replying to that thread would be helpful.

jsgaarde (jakob-simon-gaarde) wrote :

Just for the record, here are my results from testing nvidia's drivers 260.19.06 and 260.19.12:

Version 260.19.06
===============

First I installed the 260.19.06 drivers via jockey, restarted my system and got the following kernel oops:

<kern.log>
--------------
Oct 18 11:03:53 jakob-vaio kernel: [ 13.945646] hda_intel: Disable MSI for Nvidia chipset
Oct 18 11:03:53 jakob-vaio kernel: [ 13.945687] HDA Intel 0000:01:00.1: setting latency timer to 64
Oct 18 11:03:53 jakob-vaio kernel: [ 14.067035] nvidia: module license 'NVIDIA' taints kernel.
Oct 18 11:03:53 jakob-vaio kernel: [ 14.067039] Disabling lock debugging due to kernel taint
Oct 18 11:03:53 jakob-vaio kernel: [ 14.440161] Synaptics Touchpad, model: 1, fw: 7.2, id: 0x1c0b1, caps: 0xd04731/0xa40000/0xa0000
Oct 18 11:03:53 jakob-vaio kernel: [ 14.487866] input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio2/input/input7
Oct 18 11:03:53 jakob-vaio kernel: [ 14.847556] EXT4-fs (sda6): re-mounted. Opts: errors=remount-ro
Oct 18 11:03:53 jakob-vaio kernel: [ 15.081745] nvidia 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Oct 18 11:03:53 jakob-vaio kernel: [ 15.081760] nvidia 0000:01:00.0: setting latency timer to 64
Oct 18 11:03:53 jakob-vaio kernel: [ 15.081765] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
Oct 18 11:03:53 jakob-vaio kernel: [ 15.081895] NVRM: loading NVIDIA UNIX x86 Kernel Module 260.19.06 Mon Sep 13 06:35:06 PDT 2010
Oct 18 11:03:53 jakob-vaio kernel: [ 15.338211] type=1400 audit(1287392633.175:5): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient3" pid=982 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.338269] type=1400 audit(1287392633.175:6): apparmor="STATUS" operation="profile_load" name="/usr/lib/cups/backend/cups-pdf" pid=983 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.338895] type=1400 audit(1287392633.175:7): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=982 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.339100] type=1400 audit(1287392633.175:8): apparmor="STATUS" operation="profile_load" name="/usr/sbin/cupsd" pid=983 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.339270] type=1400 audit(1287392633.175:9): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script" pid=982 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.340780] type=1400 audit(1287392633.179:10): apparmor="STATUS" operation="profile_load" name="/usr/sbin/mysqld-akonadi" pid=984 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.341385] type=1400 audit(1287392633.179:11): apparmor="STATUS" operation="profile_load" name="/usr/sbin/tcpdump" pid=985 comm="apparmor_parser"
Oct 18 11:03:53 jakob-vaio kernel: [ 15.776733] ppdev: user-space parallel port driver
Oct 18 11:03:53 jakob-vaio kernel: [ 15.791671] sky2 0000:02:00.0: eth0: enabling interface
Oct 18 11:03:53 jakob-vaio kernel: [ 15.792931] ADDRCONF(NETDEV_UP): eth0: link is not ready
Oct 18 11:03:53 jakob-vaio kernel: [ 15.94858...

jsgaarde (jakob-simon-gaarde) wrote :

Just for the record, here are my results from testing nvidia's drivers 260.19.06 and 260.19.12:

Version 260.19.06
===============

First I installed the 260.19.06 drivers via jockey, restarted my system and got the following kernel oops: (see attachments: kern.log_260.19.06)

I uninstalled the 260.19.06 driver by doing an apt-get remove nvidia-* --purge.

Version 260.19.12
===============

Then I tried the newest drivers from Nvidia by downloading 260.19.12 from their site. I tried to start X and got the following error: (see attachments: messages_260.19.12,kern.log_260.19.12, nvidia-bug-report.log_260.19.12):

Best regards Jakob

Anders Kaseorg (andersk) wrote :

jsgaarde: It looks like your problem is different. You’re getting

[ 48.011528] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 48.014897] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1028)
[ 48.014904] NVRM: rm_init_adapter(0) failed

which is a known issue:
http://us.download.nvidia.com/XFree86/Linux-x86/260.19.12/README/knownissues.html#kva_exhaustion

Try one of the solutions listed in that documentation, and if that doesn’t work, you should file a new bug report. This report is about a crash in the kernel module (BUG: unable to handle kernel paging request at ffffffffa111b68d).
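To make the distinction concrete, here is a rough sketch (not from the original thread) that tells the two failure signatures apart in a kernel log. The label strings and the `classify` name are made up for illustration; the patterns are taken from the messages quoted in this report.

```python
import re

# Signature of the known kernel-virtual-address exhaustion issue.
VMAP_FAILED = re.compile(r"vmap allocation for size (\d+) failed")
# Signature of the crash this bug report is about.
PAGING_OOPS = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")


def classify(log_text):
    """Return (kind, value) for the first signature found, checking vmap first."""
    match = VMAP_FAILED.search(log_text)
    if match:
        # value is the requested allocation size in bytes
        return ("kva_exhaustion", int(match.group(1)))
    match = PAGING_OOPS.search(log_text)
    if match:
        # value is the faulting virtual address
        return ("paging_oops", int(match.group(1), 16))
    return ("unknown", None)


vmap_line = "[ 48.011528] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size."
oops_line = "BUG: unable to handle kernel paging request at ffffffffa111b68d"
```

The vmap check runs first because, when the allocation fails, the driver never finishes initialising and the oops cannot show up in the same boot.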

jsgaarde (jakob-simon-gaarde) wrote :

Thanks for your feedback Anders. I tried raising vmalloc to 256MB, which made the kernel virtual address space exhaustion error go away - unfortunately it brought the kernel oops back :-( So the kernel oops had only been absent because the driver never got loaded completely...

I attached the kern.log file for the record.

/ Jakob

Anders Kaseorg (andersk) wrote :

That’s indeed a kernel oops, though it is a NULL dereference rather than a page fault. Unfortunately your Call Trace got cut off so it’s difficult to say whether it’s related. Did you hard-reboot immediately after the crash? Next time, wait a few seconds and press Alt-SysRq-s to give the kernel a chance to sync its filesystems to disk [1] before rebooting.

[1] http://en.wikipedia.org/wiki/Magic_SysRq_key

jsgaarde (jakob-simon-gaarde) wrote :

Well that certainly gave more info :-) I have attached the new kern.log.

Is it still an issue with the final release of maverick ?

Changed in nvidia-graphics-drivers (Ubuntu Maverick):
status: New → Confirmed
tags: added: regression-release
removed: regression-potential
Changed in nvidia-graphics-drivers (Ubuntu):
status: New → Incomplete
description: updated
jsgaarde (jakob-simon-gaarde) wrote :

Yes it is, I have only tried with the release version. It is also still an issue with the latest kernel update that came a few days ago.

Thank you for following up. Setting status to 'confirmed'.

Changed in nvidia-graphics-drivers (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Confirmed
Changed in nvidia-graphics-drivers (Ubuntu Maverick):
importance: Undecided → Medium
jsgaarde (jakob-simon-gaarde) wrote :

I'm a little curious about what will happen with this bug now? Do you need more info?

jsgaarde, the regression is confirmed in maverick; we now need a developer to look at it.

Anders Kaseorg (andersk) wrote :

The problem persists with natty kernel 2.6.37-4-generic and nvidia-current 260.19.06-0ubuntu1.

summary: - Loading the nvidia driver causes kernel oops in maverick
+ Loading the nvidia driver causes kernel oops in maverick, natty
jsgaarde (jakob-simon-gaarde) wrote :

I was wondering if there is any news on this bug? 3 more months in vesa mode is almost unbearable... I miss starcraft 2 :-(

jsgaarde (jakob-simon-gaarde) wrote :

Nevermind my previous comment. I tried installing nvidia's newest driver 260.19.36. It worked perfectly!

Best regards Jakob Simon-Gaarde

Languid (niels-widger) wrote :

I am still encountering the 'BUG: unable to handle kernel paging request at ffffffffa0e364f1' oops with maverick kernel 2.6.35-25-generic and the latest NVIDIA driver 260.19.36.

Languid (niels-widger) wrote :

Upgraded to NVIDIA driver 270.18 from ppa:ubuntu-x-swat/x-updates and still encountering the kernel oops. This bug is incredibly frustrating, all I have been able to do is reboot until the NVIDIA module loads. Sometimes I have to reboot my machine 10 or 20 times repeatedly. Does anyone have a solution or way around this problem?

tbp (tbp) wrote :

Bitten. Wasted my weekend on it.

I've been running various kernels & nvidia drivers for a long time, even switching hardware along the way, without any issue whatsoever and out of the blue i get that oops. The funniest part is i cannot get rid of it: i've tried every available kernel (from 2.6.32 up to 2.6.37) and driver (from 195.* up to 270.*), then downgraded every remotely related package (xorg*, libs and more), yet it still oopses. It's infuriating.

As i knew my hardware & system were just fine, out of desperation, i've given a fresh ubuntu 10.10 live cd a try... booted, jockeyed and no fuss, no muss, got a working 260.19.06. Revived by such wondrous sight, i've exported its magic list of package/version and applied that (well, part of it) to my dysfunctional system. No luck, still oopsing. Noticed that now almost similar systems produced different binaries. So i went medieval and simply grafted the live-cd's module binary. Success.

Conclusion: apparently some maverick update screws the build system for that module, even if the exact reason/culprit still escapes me right now (and i have downgraded just about everything at this point...).

TL; DR: for a quick fix, use a fresh live cd to produce a sane module.

MyR (myr-jedi) on 2011-05-03
tags: added: i386 natty
Axel (naxel) wrote :

Hi tbp,

Your fix worked for me, too (11.04 64bit). Then I realized that I had gold as the default linker. Before carrying out the next kernel update I put the BFD ld back as default (simple symlink rewiring), ran the update - now the nvidia modules were linked using the BFD ld -, bravely rebooted into the new kernel and was greeted by a friendly gdm!

Can anyone confirm?

Axel.

Axel wrote :

"Then I realized that I had gold as the default linker. Before carrying out the next kernel update I put the BFD ld back as default (simple symlink rewiring), ran the update - now the nvidia modules were linked using the BFD ld -, bravely rebooted into the new kernel and was greeted by a friendly gdm!

Can anyone confirm?"

YES! You cracked it! I can confirm that the problem seems to be related to the "gold" linker. Since the Maverick upgrade I had a kernel oops every single time Xorg tried to start. The nvidia module loads fine; the oops seems to occur only when the X server loads.

Following from what you said, I changed the ld soft link to /usr/bin/ld.bfd , reinstalled the nVidia driver, and hey presto it works!

Now, the question has to be... what is the "gold" linker doing to the nVidia source that "ld.bfd" is not?
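For anyone wanting to check whether their system is in the same state before rebuilding the module, a small sketch (not from the original comments) that reports which linker the default `ld` resolves to. It assumes gold announces itself with a first line starting "GNU gold" and the BFD linker with "GNU ld" in `ld --version` output; the function names are made up for illustration.

```python
import subprocess


def linker_flavor(version_output):
    """Classify `ld --version` output as 'gold', 'bfd' or 'unknown'."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    # Assumed banners: "GNU gold (...)" for gold, "GNU ld (...)" for ld.bfd.
    if first_line.startswith("GNU gold"):
        return "gold"
    if first_line.startswith("GNU ld"):
        return "bfd"
    return "unknown"


def default_linker_flavor(ld="ld"):
    """Run the given linker with --version and classify its banner."""
    try:
        completed = subprocess.run(
            [ld, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    return linker_flavor(completed.stdout)
```

If `default_linker_flavor()` returns "gold", the module built by dkms on that system would have been linked with gold, which matches the situation described above.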

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers - 285.05.09-0ubuntu1

---------------
nvidia-graphics-drivers (285.05.09-0ubuntu1) precise; urgency=low

  * New upstream release:
    - Added support for the following GPU:
      o GeForce GT 520MX
    - Added support for xserver ABI 11 (xorg-server 1.11).
    - Fixed a bug causing a Linux kernel BUG when retrieving
      CPU information on some systems.
    - Fixed a bug causing some applications to hang on exit.
    - Fixed a bug causing flickering in some GPU/display
      combinations.
    - Fixed a bug that could result in poor OpenGL performance
      after hotplugging a monitor.
    - Fixed a bug causing possible text corruption when
      recovering from GPU errors.
  * debian/dkms.conf{.in}:
    - Make sure that dkms doesn't use the gold linker,
      otherwise the resulting module would cause kernel
      oops (LP: #607399).
    - Drop non functional OBSOLETE_BY variable.
 -- Alberto Milone <email address hidden> Tue, 08 Nov 2011 16:32:33 +0100

Changed in nvidia-graphics-drivers (Ubuntu):
status: Confirmed → Fix Released
Rolf Leggewie (r0lf) wrote :

maverick has seen the end of its life and is no longer receiving any updates. Marking the maverick task for this ticket as "Won't Fix".

Changed in nvidia-graphics-drivers (Ubuntu Maverick):
status: Confirmed → Won't Fix