Conflict between nvidia-glx and openafs-modules kernel module

Bug #294527 reported by Karl M. Davis
Affects                                Status    Importance   Assigned to   Milestone
OpenAFS                                Invalid   Undecided    Unassigned
nvidia-graphics-drivers-177 (Ubuntu)   Invalid   Undecided    Unassigned

Bug Description

The other night I did a clean install of Ibex on my system. I'd been having some random crashes with Hardy, and since that was originally a Feisty install, I figured it was time to start from scratch. The first thing I set up was the nvidia-glx-177 goodness, because I like shiny things. I got that working and configured correctly for my dual-monitor setup, using TwinView.

After that, I installed the openafs client and its kernel modules (built via module-assistant from openafs-modules-source). I store most of my home folder on an AFS server here at home. After installing the modules package, I rebooted. X broke and went into failsafe mode, for no apparent reason.

It seems that there's a conflict between the nvidia driver and the openafs kernel modules. I know that sounds screwy as hell, and it is. I actually didn't even catch the connection until I gave up on getting it working and did another clean install today. This time, I went through everything a lot slower and sure enough, as soon as I installed the openafs-modules package, X started breaking. Uninstalled it, rebooted, and now X is starting up again just fine.

I'll go ahead and install it again and post back to this bug with the Xorg.0.log file from when it's crashing.

[lspci]
e0000000-e3ffffff : 0000:00:00.0
01:00.0 VGA compatible controller: nVidia Corporation G73 [GeForce 7600 GS] (rev a2)
     Subsystem: Giga-byte Technology Device 3428
[ 0.508279] PCI: 0000:00:00.0 reg 10 32bit mmio: [e0000000, e3ffffff]
[ 11.885122] agpgart-intel 0000:00:00.0: Intel 830M Chipset
[ 11.889130] agpgart-intel 0000:00:00.0: AGP aperture is 64M @ 0xe0000000

Karl M. Davis (karlmdavis) wrote :

Here's the log file with the following errors:
<<
...
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device PCI:1:0:0.
(EE) NVIDIA(0): Please see the COMMON PROBLEMS section in the README for
(EE) NVIDIA(0): additional information.
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!
...
(EE) Screen(s) found, but none have a usable configuration.
>>

Alberto Milone (albertomilone) wrote :

Please type the following command:
sudo nvidia-bug-report.sh

and attach the nvidia-bug-report.log that will be created in the same directory in which you type the command.

Karl M. Davis (karlmdavis) wrote :

Here's the nvidia report from when the openafs module is not installed and X is working. Please let me know if you need me to run this again with the openafs module causing a conflict.

Alberto Milone (albertomilone) wrote :

Yes, please attach the report with the openafs module installed.

Karl M. Davis (karlmdavis) wrote :

Here's the nvidia report from when the openafs module _is_ installed and X is _not_ working.

Alberto Milone (albertomilone) wrote :

If possible, configure your system such that your graphics card does not share its IRQ with other devices.

Check your BIOS settings and let the OS manage the IRQs. For example, if your BIOS has an option for "Plug and Play Operating System" or similar, and it is set to 'Yes' or 'Enabled', then set it to 'No' or 'Disabled'.

If this doesn't work, try booting into Ubuntu with one of (or a combination of) these kernel parameters:
Parameter     Behavior
pci=noacpi    don't use ACPI for PCI IRQ routing
pci=biosirq   use PCI BIOS calls to retrieve the IRQ routing table
noapic        don't use I/O APICs present in the system
acpi=off      disable ACPI
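
To see whether the graphics card is actually sharing an IRQ before and after trying these parameters, something like the sketch below may help. It assumes the usual Linux /proc/interrupts layout, where devices on the same IRQ are listed comma-separated in the last column; the function name shared_irqs is made up for illustration.

```shell
# List lines of an interrupts table that name more than one device.
# On Linux, devices sharing an IRQ appear comma-separated in the last column.
shared_irqs() {
    # $1: table to read; defaults to the live /proc/interrupts
    grep ',' "${1:-/proc/interrupts}"
}

# Report shared lines, or say so when there are none (or no such file).
shared_irqs 2>/dev/null || echo "no shared IRQ lines found"
```

If a line in the output mentions nvidia alongside another device, the card is sharing that IRQ.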

Karl M. Davis (karlmdavis) wrote :

Unfortunately, my BIOS does not have any such "Plug and Play OS" setting. Each of my PCI slots is currently set to "PNP", though. If you'd like, I could go through and manually assign IRQs to each of them.

I tried the following kernel parameter combinations with no success:
1. pci=noacpi
2. pci=biosirq
3. noapic acpi=off

Karl M. Davis (karlmdavis) wrote :

I have cross-posted this issue on the openafs-info mailing list:
http://lists.openafs.org/pipermail/openafs-info/2008-November/030339.html

Bryce Harrington (bryce) wrote :

[This is an automated message]

Hi karl-davisonlinehome,

Please attach the output of `lspci -vvnn` too.

Changed in nvidia-graphics-drivers-177:
status: New → Incomplete
Bryce Harrington (bryce)
description: updated
Bryce Harrington (bryce) wrote :

[This is an automated message]

In Jaunty (9.04), we are dropping the nvidia-graphics-drivers-177 package, which
is now superseded by the nvidia-graphics-drivers-180 package.

  http://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-180

To transition your bug into the new package, we need your help. Please do the following:

 a. Verify the bug occurs in Jaunty with the -180 driver
     (ISOs: http://cdimage.ubuntu.com/releases/jaunty/)
 b. If you haven't already, please include in the bug:
     * Your /var/log/Xorg.0.log
     * Your /etc/X11/xorg.conf
     * The output of `lspci -vvnn`
     * Steps to reproduce the issue
 c. Under Affects, click the down arrow to the left of 'nvidia-graphics-drivers-177'
 d. Edit the Package to change 'nvidia-graphics-drivers-177' to 'nvidia-graphics-drivers-180'
 e. Click Save Changes

Thank you!

[We'll expire the remaining -177 bugs in a month or so.]

Karl M. Davis (karlmdavis) wrote :

This issue was diagnosed on the openafs thread I posted earlier. Looks like it's a kernel memory issue that crops up on some systems (my old desktop, for one) when the openafs client cache options in /etc/openafs/afs.conf are set as follows:
OPTIONS="-chunksize 20 -memcache -blocks 65536"

Reverting the cache settings back to the default resolves the problem:
OPTIONS=AUTOMATIC
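
For anyone wondering how much memory the non-default options above ask for, here is a rough sketch. It assumes the documented afsd semantics, where -chunksize is an exponent on 2 (bytes per chunk) and -blocks counts 1 KiB blocks; with -memcache the cache is held in memory rather than on disk. The function name afs_cache_summary is made up for illustration.

```shell
# Summarise what a given -chunksize / -blocks pair asks afsd for.
# Assumes -chunksize N means 2^N bytes per chunk and -blocks counts 1 KiB blocks.
afs_cache_summary() {
    chunksize_exp=$1
    blocks=$2
    chunk_bytes=$((1 << chunksize_exp))
    cache_bytes=$((blocks * 1024))
    echo "chunk size: $((chunk_bytes / 1024)) KiB"
    echo "cache size: $((cache_bytes / 1024 / 1024)) MiB"
    echo "chunks that fit: $((cache_bytes / chunk_bytes))"
}

# The values from the OPTIONS line above.
afs_cache_summary 20 65536
```

Under those assumptions, the settings request a 64 MiB in-memory cache split into 1 MiB chunks.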

Unfortunately, I won't be able to verify this bug against the new Jaunty package anytime soon as I've since replaced my old NVIDIA desktop with an ATI one. I would go ahead and close this bug but it would probably be helpful to others googling for help with similar symptoms if it was archived (I'm not sure what the Launchpad archival policy is for closed bugs).

Bryce Harrington (bryce) wrote :

[This is an automated message]

In Jaunty (9.04) we have dropped the nvidia-graphics-drivers-177 package, as
it is superseded by nvidia-graphics-drivers-180.

Because of this, we are closing this bug as expired.

You are encouraged to file your -nvidia bug reports against the
nvidia-graphics-drivers-180 package.

Changed in nvidia-graphics-drivers-177:
status: Incomplete → Invalid
Robert Gerlach (khnz) wrote :
Changed in openafs:
status: New → Invalid