[nvidia-glx] Nvidia binary driver causes progressive breakage

Bug #107947 reported by kripken
Affects: linux-restricted-modules-2.6.20 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: nvidia-glx

The "nv" driver works fine, but after installing - via the Restricted Drivers Manager in Feisty - the binary NVidia driver, progressive breakage occurs, as follows. Note: this is on a fresh install of Feisty, on a formatted partition.

1. After the initial restart, things are fine. glxgears and a 3D FPS game work great.
2. After another restart, the resolution is no longer detected correctly (even though it was before!). It is now 800x600 (it should be 1024x768), and cannot be set higher in System->Preferences->Screen Resolution.
3. After yet another restart, the resolution is now set to 640x480 (and again, cannot be modified in the preferences), and what is much worse, metacity is not loaded. By chance I recognized the symptoms of this, and knew that I should run 'metacity' in a terminal. This fixes it - but it needs to be done every time the system is booted! NOTE: I did *not* do anything with compiz/beryl/desktop-effects, nothing like that. I can see no reason why metacity would suddenly not be loaded by default.

Disabling the NVidia binary driver in the Restricted Drivers Manager and restarting gets a working system - correct resolution, working metacity (but of course the 3D FPS game is unplayable).

My hardware is an FX5200. Note: This computer has worked with 5.10, 6.06, 6.10 with the NVidia binary driver with no problems.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Thank you for your bug report.

kripkenstein:
Can you include the output of the following commands
lspci | grep -i nv
dpkg -l nvidia\* | grep ii

and upload
/var/log/Xorg.0.log
when the screen resolution is too small.
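
The requested output could be collected into one file with something like the sketch below. The function name collect_nvidia_info and the default output file name are made up for illustration; the lspci and dpkg commands are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: gather the requested diagnostics into one text file.
# Usage: collect_nvidia_info [xorg-log] [output-file]
collect_nvidia_info() {
    log="${1:-/var/log/Xorg.0.log}"
    out="${2:-nvidia-report.txt}"
    {
        echo "== NVIDIA devices (lspci) =="
        lspci 2>/dev/null | grep -i nv
        echo "== installed nvidia packages (dpkg) =="
        dpkg -l 'nvidia*' 2>/dev/null | grep ii
        echo "== X server log: $log =="
        if [ -r "$log" ]; then
            cat "$log"
        else
            echo "(log not readable)"
        fi
    } > "$out"
}
```

Running collect_nvidia_info while the resolution is wrong would leave a single file to attach to the report.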

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Setting to needsinfo pending reply from kripkenstein

Changed in linux-restricted-modules-2.6.20:
status: Unconfirmed → Needs Info
Revision history for this message
kripken (kripkenstein) wrote :

Ok, here is some more data.

lspci | grep -i nv
===>
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)

dpkg -l nvidia\* | grep ii
===>
ii nvidia-glx 1.0.9631+2.6.20.5-15.20 NVIDIA binary XFree86 4.x/X.Org driver
ii nvidia-kernel-common 20051028+1ubuntu7 NVIDIA binary kernel module common files

(nvidia-glx is of course present only when the binary driver is enabled).

Here is the /var/log/Xorg.0.log when using "nv", that is, when everything works ok (for comparison, I don't know if you need it or not)

Revision history for this message
kripken (kripkenstein) wrote :

Here is the /var/log/Xorg.0.log file after re-enabling the NVidia binary driver. The result after a reboot was 640x480 with no way to change it in the preferences. Yet, metacity was working at least.

Revision history for this message
kripken (kripkenstein) wrote :

And here is the /var/log/Xorg.0.log file after logging out and killing X (control-alt-delete). Suddenly 1024x768 resolution was working, yet the refresh rate was 50, not the 60 I usually see. This is an LCD monitor, though, so refresh rate doesn't really mean much, I think.

This time metacity wasn't working. I had to do "metacity" in a terminal.

So, in two attempts, I had one time a bad resolution with metacity, and once a good resolution without metacity. Two different symptoms, and their behavior seems quite erratic.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Pushing this back to unconfirmed.

Changed in linux-restricted-modules-2.6.20:
status: Needs Info → Unconfirmed
Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein
(That refresh rate really does matter (especially on LCDs). It is almost always 60. Trust me on this one... )

(WW) NVIDIA(GPU-0): Unable to read EDID for display device CRT-0

Looks like the driver was unable to ask the monitor its capabilities. This could be down to driver bugs (most likely if nv works), dodgy monitor cables, dodgy monitor etc. I'd say you have two options:

1. Try a later version of the nvidia drivers (nvidia-glx-legacy).
2. Work around the issue by hard coding your monitor's capabilities into xorg.conf

Before trying 2 make a copy of /etc/X11/xorg.conf in case this goes horribly wrong. To attempt 2. try using
sudo dpkg-reconfigure xserver-xorg
and select nvidia as the driver to use. Most things can be left at their defaults, but try monitor autodetection, and ensure that the video modes you want to use are selected and that the horizontal/vertical sync ranges are also set. Please post your results back here.
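
The backup step and the check for the EDID warning could be sketched as below. Both function names (backup_config, edid_warnings) are hypothetical, and the timestamped backup name is just one possible convention.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of any package.

# Copy a config file to <file>.bak-<timestamp> and print the backup path.
backup_config() {
    src="$1"
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Print warning lines that mention EDID in an X server log.
edid_warnings() {
    grep '(WW).*EDID' "${1:-/var/log/Xorg.0.log}"
}
```

For example, backup_config /etc/X11/xorg.conf (as root, since /etc/X11 is normally not writable by ordinary users) before running dpkg-reconfigure, and edid_warnings after restarting X to see whether the warning is still there.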

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

(btw don't use 60 as the sync range in dpkg-reconfigure - those fields should be ranges and there should be a suggestion in there already)

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Setting to needsinfo pending reply from kripkenstein.

Changed in linux-restricted-modules-2.6.20:
status: Unconfirmed → Needs Info
Revision history for this message
kripken (kripkenstein) wrote :

Indeed, things went 'horribly wrong' :)

"sudo dpkg-reconfigure xserver-xorg" appeared to go well, monitor resolution AND refresh were written correctly during setup - 1024x768@60. Yet, when actually trying to use them, metacity once more failed to load, and what was much worse, I couldn't use the keyboard on a terminal window (perhaps the x.org changes modified my keyboard incorrectly?). A failsafe GNOME session loaded with metacity fine. I tried once more normally, and the resolution was too low, and again no metacity.

I again used a failsafe GNOME, and restored the old xorg.conf. This didn't work, surprisingly! I restarted yet again, and removed the nvidia binary driver via the Restricted Drivers Manager. Another restart, and resolution&refresh is fine in "nv". Yet metacity is not loaded - perhaps this is an unrelated issue, I am starting to think.

Anyhow, I am stuck with "nv" once more. Now, I am quite sure the problem isn't my monitor or graphics card; both worked fine with Edgy until 2 days ago. It seems a big coincidence for one of them to fail just as I install Feisty.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Phew what a ride!

Xorg changes can't affect your virtual terminals. If you could switch to a virtual terminal (and it wasn't corrupt/completely black) but not type, then that suggests something more severe. If you were unable to type within X, though, then yes, that could be an xorg.conf misconfiguration.

The metacity problem sounds unrelated to this though. I would spin off a new bug about that problem and add a link to the new bug back here. I would also attach the ~/.xsession-errors file that is produced during a faulty run (e.g. log in so that metacity fails to load, then once the hard disk stops flickering, log in on a virtual terminal and copy that file to a different name). Did you ever try to turn on desktop effects?

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

By the way when I said:
Try a later version of the nvidia drivers (nvidia-glx-legacy).
I meant to say
Try a later version of the nvidia drivers (nvidia-glx-new).

Revision history for this message
kripken (kripkenstein) wrote :

The problem with typing in a terminal was only with terminals during X. So perhaps this was indeed a bad xorg.conf setting. Or perhaps related to the Metacity issue.

I tried the nvidia-glx-new. It couldn't even start X. So I am back to "nv".

I will see if the Metacity issue repeats itself in "nv". If so then I will indeed file a bug. The last 2 times with "nv" Metacity loaded fine.

No, I didn't try to turn on desktop effects - I was afraid of breakage ;) . This is a fresh install of Feisty, nothing fancy.

Actually I am quite surprised, after almost no problems with Dapper and Edgy, Feisty is proving to be quite tricky on this computer...

Any other ideas for me? (Sorry to waste all your time thus far...)

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Did you reboot after installing nvidia-glx-new ?

The only other idea I have is disabling EDID probing. See Bug #105957 ...

Revision history for this message
kripken (kripkenstein) wrote :

Yes, I rebooted before trying the new one. Didn't help.

Now I tried to use the old one with the EDID fix, but I can't get it to load either. It crashes X with "the kernel module has version 9755 while this X module has version 9631"...

I thought this might be because of the -new package that I tried, so I tried to reinstall all NVidia modules as well as restricted-drivers in Synaptic, but again, the same error after a reboot... very sad...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Remove the nvidia-glx-new package. Check that ls -al /lib/linux-restricted-modules/2.6.17-11-generic/ doesn't show a leftover .nvidia_new_installed file. If it does, then remove .nvidia_new_installed and install nvidia-glx (see Bug #106217).
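
A rough way to look for the leftover marker file is sketched below. The function name find_nvidia_new_markers is hypothetical, and searching every kernel-version subdirectory (rather than only the one named above) is an assumption about where a stale marker might be.

```shell
#!/bin/sh
# Hypothetical helper: list leftover .nvidia_new_installed marker files.
# It looks one level below the base directory, i.e. in each
# kernel-version subdirectory. Pass another base directory to override.
find_nvidia_new_markers() {
    base="${1:-/lib/linux-restricted-modules}"
    find "$base" -maxdepth 2 -name .nvidia_new_installed 2>/dev/null
}
```

If this prints a path after nvidia-glx-new has been removed, that file is the leftover to delete before reinstalling nvidia-glx.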

Revision history for this message
kripken (kripkenstein) wrote :

Ok, that fixed the X module/kernel module mismatch problem (indeed, it was exactly the same as the bug you quoted).

Now the NVidia binary can start, but only at 640x480, as before. Manually editing and adding resolutions to xorg.conf doesn't work. Nor does disabling EDID probing, that has no effect...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
OK, let's see if we can take a hybrid approach to this problem. Could you attach your current /etc/X11/xorg.conf, then back it up and attach the xorg.conf generated by "sudo dpkg-reconfigure xserver-xorg" here too? Copy the old working configuration back and I'll see if I can synthesize a new config for you to test with from the result...

Revision history for this message
kripken (kripkenstein) wrote :

Not sure I understand you. Should I post the current xorg.conf under "nv" (at 1024x768), and then under "nvidia" with "sudo dpkg-reconfigure xserver-xorg"?

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Yup.

Revision history for this message
kripken (kripkenstein) wrote :

Ok, here is my working xorg.conf under "nv", 1024x768.

Revision history for this message
kripken (kripkenstein) wrote :

and here it is after installing the NVidia binary driver, but before "sudo dpkg-reconfigure xserver-xorg". Here I am stuck on 640x480.

Revision history for this message
kripken (kripkenstein) wrote :

Here it is after "sudo dpkg-reconfigure xserver-xorg". Interestingly, it worked this time at 1024x768! The only quirk was that it says in the preferences that the refresh rate was 50; yet the monitor itself, and the nvidia-settings, said that it was 75. I used nvidia-settings to fix the refresh rate at 60, and the monitor reports that too, so that seems ok for now. The preferences is still wrong though, now it says 76.

So, this is very strange, things seem to work. I suspect that next reboot they may stop, though...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
The wrong refresh rate sounds similar to Bug #104105 ...

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Does following the advice in https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules-2.6.20/+bug/104105/comments/9 make any difference?

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Still awaiting reply from kripkenstein.

Revision history for this message
kripken (kripkenstein) wrote :

Sorry for not replying earlier. Basically my 'solution' to my problems is never rebooting the computer; in fact, I don't think I have rebooted since I last posted here (perhaps once). So I am not sure whether things are working or not after trying your suggestions. On the last reboot they worked, but that is all I know; things were flaky before (work once, fail another).

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

kripkenstein:
Unless you can report whether the problem is still here (by rebooting a few times and seeing if the problem occurs) this bug is going to have to be closed...

Revision history for this message
kripken (kripkenstein) wrote :

Ok, I understand.

Well, I tried it now, and it seems fine, not sure what did the trick though. So I guess this bug can be closed.

I attach my current, final, xorg.conf, perhaps someone will want to compare it to my old nonworking versions if they find they have a similar problem or something like that.

Thanks for your time in all of this, Sitsofe!

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

New xorg.conf has HorizSync and VertRefresh entries. Hopefully this should solve the low resolution problem. Opening and final comments were about resolution being too low so we shall consider that to be this bug report's overriding issue.

Thanks for taking the time to report this bug and helping to make Ubuntu better. This particular bug has already been reported and is a duplicate of bug #3731 and is being marked as such. Please feel free to report any other bugs you may find.

Changed in linux-restricted-modules-2.6.20:
status: Needs Info → Rejected
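
For anyone comparing their own configuration, a quick check for the two entries mentioned in the closing comment might look like the sketch below. The function name has_sync_ranges is hypothetical; it only tests that uncommented HorizSync and VertRefresh lines exist, not that their values suit the monitor.

```shell
#!/bin/sh
# Hypothetical helper: report whether an xorg.conf sets both sync ranges.
# Exit status 0 means uncommented HorizSync and VertRefresh lines were found.
# Matching is case-insensitive, as xorg.conf keywords are.
has_sync_ranges() {
    conf="${1:-/etc/X11/xorg.conf}"
    grep -Eiq '^[[:space:]]*HorizSync[[:space:]]' "$conf" &&
    grep -Eiq '^[[:space:]]*VertRefresh[[:space:]]' "$conf"
}
```

Something like has_sync_ranges && echo "sync ranges present" gives a yes/no answer for the default file.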