NVIDIA - Xorg crashes (empty backtrace) with segfault OR "general protection" in ld-2.10.1.so

Bug #507233 reported by Ongion
This bug affects 6 people
Affects                               Status   Importance  Assigned to  Milestone
nvidia-graphics-drivers-180 (Ubuntu)  Triaged  High        Unassigned

Bug Description

Binary package hint: xorg

Nvidia binary driver version 185 on a GTX260M.

Crashes are seemingly random (anywhere from 30 seconds to 2 hours after logging in), but they always show up with no backtrace in Xorg.0.log.old and either a segfault or a general protection fault in ld-2.10.1.so. The 173 driver does not crash, but is excruciatingly slow. I tested the 190 and 195 drivers, but they do not fix the problem. They have been completely removed from the system.

Any help would be greatly appreciated.

ProblemType: Bug
Architecture: amd64
Date: Wed Jan 13 17:36:12 2010
DistroRelease: Ubuntu 9.10
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
MachineType: Alienware M15x
NonfreeKernelModules: nvidia
Package: xorg 1:7.4+3ubuntu10
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-18-generic root=UUID=5f861888-1c0e-466e-b0bc-3aa962417bd0 ro quiet splash
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-18.55-generic
RelatedPackageVersions:
 xserver-xorg 1:7.4+3ubuntu10
 libgl1-mesa-glx 7.6.0-1ubuntu4
 libdrm2 2.4.14-1ubuntu1
 xserver-xorg-video-intel 2:2.9.0-1ubuntu2.1
 xserver-xorg-video-ati 1:6.12.99+git20090929.7968e1fb-0ubuntu1
SourcePackage: xorg
Uname: Linux 2.6.31-18-generic x86_64
dmi.bios.date: 11/16/2009
dmi.bios.vendor: Alienware
dmi.bios.version: A02
dmi.board.vendor: Alienware
dmi.board.version: A02
dmi.chassis.type: 8
dmi.chassis.vendor: Alienware
dmi.chassis.version: A02
dmi.modalias: dmi:bvnAlienware:bvrA02:bd11/16/2009:svnAlienware:pnM15x:pvrA02:rvnAlienware:rn:rvrA02:cvnAlienware:ct8:cvrA02:
dmi.product.name: M15x
dmi.product.version: A02
dmi.sys.vendor: Alienware
fglrx: Not loaded
system:
 distro: Ubuntu
 architecture: x86_64
 kernel: 2.6.31-18-generic

Ongion (benpy2k) wrote :

It crashed in between me submitting the bug and uploading this.

Ongion (benpy2k) wrote :

Oh, sorry, should include this.

Xorg crashes then immediately brings me to the login screen.

Bryce Harrington (bryce)
affects: xorg (Ubuntu) → nvidia-graphics-drivers-180 (Ubuntu)
Ongion (benpy2k) wrote :

I would love some help on this, as I am unable to use my computer for very long most of the time. If I need to post anything more, just say the word.

Ongion (benpy2k) wrote :

I am looking for patterns in the crash, as I know "random" is a very... vague term. I've noticed that the first crash takes about 20 to 30 minutes. All subsequent crashes occur in under 10 minutes. This pattern lasts until a complete reboot.

Ongion (benpy2k) wrote :

The system appears to be much less stable when on battery power. Just a thought.

Amy Wilson (awils-1) wrote :

Hi there, Ongion.

I doubt we will get much in the way of support, because we are being mighty evil and using a proprietary driver (and I've seen the support around these places; makes a dedicated supporter cringe), but alas, I am experiencing the same problem, on the exact same config --- it is definitely unstable on battery (I've not experienced a crash on AC power), and one can undergo several crashes after the first (whereupon it may or may not stop crashing).

Ongion (benpy2k) wrote :

Nice to know I'm not alone in this problem, at least :-). I'm beginning to think it might have something to do with underclocking the GPU when on battery power. It's kind of a stretch, but that does explain why the crash happens more often (or only, in your case) on battery.

Ongion (benpy2k) wrote :

I think I'll do a fresh install using a Lucid daily build. Maybe it'll be fixed there. Or maybe it'll be fixed by a reinstallation. I hope...

Amy Wilson (awils-1) wrote :

Ack, a reinstall? I'm still fiddling with the X server settings --- on the assumption that it is related to underclocking (it also occurred again whilst on charge). There are the PowerMizer settings, but you don't seem to be able to customize the clock rate of each performance level.

It'd help if I knew why opening Firefox is a big cause of the crashes (but not the only one).

Ongion (benpy2k) wrote :

I was doing a bit more research before a reinstall. ld-2.10.1.so is provided by libc6, for what it's worth. I think I'll try a dist-upgrade first. Maybe something got solved between Karmic and Lucid, and it could be in the Alpha. If that doesn't work I'll try a full reinstall.

Ongion (benpy2k) wrote :

And by dist-upgrade I mean "update-manager -d". :-P

Jochen Kemnade (jochenkemnade) wrote :

This just happened to me too, on a Karmic system with a GeForce GTX 280M and the nvidia 185 drivers.

Changed in nvidia-graphics-drivers-180 (Ubuntu):
status: New → Confirmed
Ongion (benpy2k) wrote :

Upgrading to Lucid was unsuccessful, so I did a full install of Alpha 2. This appeared to fix this bug, only to replace it with another. Xorg still crashes with no backtrace, but there is no segfault or "general protection" on anything. I wouldn't mark this as closed, however, because it may still be related to the same thing.

Ongion (benpy2k) wrote :

One more thing I should mention: it is still much more stable than before, but the instability now also occurs on AC. Personally, I feel that the overall stability is more helpful because I use this laptop on battery a lot, but you may think differently.

Knut (knutjorgen) wrote :

I use a similar system. The system freezes when using Ubuntu 9.10 with nvidia 195 on a G210, but when I used Fedora 12 I had zero freezes.

Bryce Harrington (bryce)
tags: added: karmic
Meng (georgew666) wrote :

A similar thing has been happening with my Alienware M15x with a GTX 260M. It mostly happens when I use Firefox with Flash. dmesg shows the same Xorg segfault in ld-2.10.1.so.
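
For anyone who wants to check their own kernel log for the same signature, here is a minimal sketch. It assumes the kernel reports the crash on a line that names the process as `Xorg[PID]` together with either `segfault` or `general protection`; the function name `xorg_faults` is made up for illustration.

```shell
# Filter kernel log text on stdin down to Xorg crash lines.
# Assumption: the kernel names the process as "Xorg[<pid>]" on the same
# line as "segfault" or "general protection".
xorg_faults() {
    grep -E 'Xorg\[[0-9]+\].*(segfault|general protection)'
}

# Typical use: dmesg | xorg_faults
```

If the matching lines end with `in ld-2.10.1.so[...]`, it is likely the same crash described in this report.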

Amy Wilson (awils-1) wrote :

I seem to have fixed it by uninstalling, and downgrading the driver to 173. Not ideal, but stable :|.

Anders Aagaard (aagaande) wrote :

See http://www.nvnews.net/vbulletin/showthread.php?t=142946

My workaround on an Alienware M15x with a GTX 260M: after every startup and every suspend-to-RAM cycle, switch to VT1 and back to the active VT, then click the "stealth mode" button on the Alienware on and off. I'm guessing this is enough to reset the settings.

On a side note, I've lost all respect for nvidia. This made my computer useless for months and I've had no reply from nvidia, while threads about multi-GPU/multi-head issues get replies within days.

Vadim Tkachuk (vadim-tkachuk) wrote :

Same issue for me as well on my M15x with a 260M. I even tried the latest drivers from nvidia (195.36.15) on openSUSE 11.2, Mandriva 2010, and Ubuntu 9.10, and no go. I guess I will just use Windows until the issue is resolved.

Anders Aagaard (aagaande) wrote :

Vadim, note that the workaround in post #58 of http://www.nvnews.net/vbulletin/showthread.php?t=142946 (and in comment #20) does work. A little annoying, but it's stable for me.

Also bumping http://www.nvnews.net/vbulletin/showthread.php?t=142946 might at some point get nvidia to look at the issue.

Meng (georgew666) wrote :

Hi Aagaard,

How did you get it stable with the stealth button? I am using an M15x. Is it the one that looks like a UFO (second one from the right-hand side) or the one that looks like a power meter (first one from the left-hand side)? I tried both, but nothing seems to change in Linux.

Thanks.

Anders Aagaard (aagaande) wrote :

It's the powermeter. It doesn't seem to do anything, but it does!

Turn it on (it'll turn a TINY bit brighter) and off again (or you'll definitely be unstable). For me it's WAY better than it was, at least.

I've had some crashes, but only with the power cord out (which also adds to instability :/ )

Meng (georgew666) wrote :

To people having this problem with Alienware m15x,

I seem to have found a way to stabilize the system. Ubuntu 10.04 with (almost) the latest updates

Linux Kernel 2.6.32-22-generic
Nvidia driver current 195.36.24 (through the Ubuntu package)
xserver-xorg 1:7.5+5ubuntu1
xserver-xorg-core 2:1.7.6-2ubuntu
xserver-common 2:1.7.6-2ubuntu

has been running on my M15x for a week without any of the problems I encountered before. The Flash version is 10.1 (released a few days ago, but I suppose it is not the cure, since the problem was gone days before its release).

Basically, I upgraded to 10.04 in late April, but the problem persisted. My problem had been the X server crashing after playing Flash in Firefox for a few minutes, although it also crashed at random with other applications, mostly when an OpenGL drawing operation was underway.

The problem was gone like a miracle two weeks ago. To verify it, I played several Flash videos on YouTube simultaneously for several hours. A few days later, my GPU fan suddenly developed a small problem and I asked Dell to come and help me with it. During the visit, I noticed that the technician unplugged the power jumper of the GPU fan (it is on the right-hand side if you turn the notebook upside down with the battery side near you; the power jumper is just visible). Later he plugged it back in and left. That night I found the old problem had come back. I remembered what he had done, removed the bottom cover of the laptop, and pushed the power jumper in so that it fit tightly. Then I powered on the computer and tested it with several Flash videos. The problem was gone once again.

Therefore, I think that what Anders Aagaard says is correct. There can be some hardware problem, most likely power supply, with the laptop.

Anders Aagaard (aagaande) wrote :

It's not a hardware issue: using 256.29 I don't get the issue at all with the power cable in. Of course, when you pull it out you end up on battery-saving speeds, and then I run into issues.

If it was a hardware issue related to cooling, running with power at max speed would show the issue, and it doesn't.

Meng (georgew666) wrote :

The problem still exists. However, I have found something new.

Most of the time, the system is stable when an external monitor is plugged in. I guess this is because the PowerMizer setting is switched to max performance.

Meng (georgew666) wrote :

Now I am able to stabilize the system with the following trick:

$ sudo nvidia-xconfig --registry-dwords="PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x1"

Afterwards, /etc/X11/xorg.conf should have a line in Section "Screen" like

        Option "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x1"

Explanation:

According to http://tutanhamon.com.ua/technovodstvo/NVIDIA-UNIX-driver/, the power setting is configurable. I believe it is the Adaptive mode that makes X unhappy when it switches performance levels. The system probably becomes stable with an external display plugged in for the same reason: the driver switches to max performance and stays at that level. Therefore, I simply pin the performance level: max performance on AC power and min performance on battery.
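
To confirm that the option actually landed in the config after running the nvidia-xconfig command, something like the following sketch may help. The helper name `show_registry_dwords` is invented for illustration; it only prints uncommented `RegistryDwords` option lines and does not check which section they sit in.

```shell
# Print the active (uncommented) RegistryDwords option lines of an X config.
# The path defaults to /etc/X11/xorg.conf; pass another file to check a copy.
show_registry_dwords() {
    conf="${1:-/etc/X11/xorg.conf}"
    # Drop comment lines first, then keep only the RegistryDwords option.
    grep -v '^[[:space:]]*#' "$conf" | grep -E 'Option[[:space:]]+"RegistryDwords"'
}

# Typical use: show_registry_dwords /etc/X11/xorg.conf
```

If nothing is printed, the option is missing or commented out.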

Amy Wilson (awils-1) wrote :

I no longer can use the 173 version of the NVidia driver without severely impacting overall performance, and the latest (Lucid, version no 195) is buggier than ever -- crashing on AC and battery.

Also, Meng, I tried using your trick but it broke my X display completely. It couldn't find a screen. Do you think you could post your xorg.conf so I could see exactly what it contains?

Meng (georgew666) wrote :

For your convenience

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 1.0 (buildmeister@builder58) Thu Apr 22 20:35:23 PDT 2010

# nvidia-settings: X configuration file generated by nvidia-settings
# nvidia-settings: version 1.0 (buildd@yellow) Fri Apr 9 11:51:21 UTC 2010

Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0" 0 0
    InputDevice "Keyboard0" "CoreKeyboard"
    InputDevice "Mouse0" "CorePointer"
    Option "Xinerama" "0"
EndSection

Section "Module"
    Load "glx"
EndSection

Section "InputDevice"
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/psaux"
    Option "Emulate3Buttons" "no"
    Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

 # generated from default
    Identifier "Keyboard0"
    Driver "kbd"
EndSection

Section "Monitor"
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "LGD"
    HorizSync 30.0 - 75.0
    VertRefresh 60.0
    Option "DPMS"
EndSection

Section "Monitor"
    Identifier "Monitor1"
    VendorName "Unknown"
    ModelName "DELL 1908WFP"
    HorizSync 30.0 - 83.0
    VertRefresh 56.0 - 75.0
    Option "DPMS"
 # HorizSync source: edid, VertRefresh source: edid
EndSection

Section "Device"

# level 0x1 = highest
# level 0x2 = med
# level 0x3 = lowest
# Option "RegistryDwords" "PowerMizerLevel=0x2"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "GeForce GTX 260M"
EndSection

Section "Device"
    Identifier "Device1"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "GeForce GTX 260M"
    BusID "PCI:2:0:0"
    Screen 1
EndSection

Section "Screen"

 # Removed Option "TwinView" "0"
 # Removed Option "metamodes" "DFP: nvidia-auto-select +0+0"
 # Removed Option "metamodes" "CRT: nvidia-auto-select +1920+0, DFP: nvidia-auto-select +0+0"
 # Removed Option "TwinView" "1"
 # Removed Option "metamodes" "DFP: nvidia-auto-select +0+0, CRT: nvidia-auto-select +1920+0"
# Removed Option "TwinView" "0"
# Removed Option "metamodes" "nvidia-auto-select +0+0"
# Removed Option "TwinView" "1"
# Removed Option "metamodes" "CRT: nvidia-auto-select +1920+0, DFP: nvidia-auto-select +0+0"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "NoLogo" "True"
    Option "TwinView" "0"
    Option "TwinViewXineramaInfoOrder" "DFP-0"
    Option "metamodes" "DFP: nvidia-auto-select +0+0"
    Option "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x1"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection

Section "Screen"
    Identifier "Screen1"
    Device "Device1"
    Monitor "Monitor1"
    DefaultDepth ...


Patrick Golec (patrick-golec) wrote :

Meng, I had the same problem, and your solution seems to have solved it. Many thanks for posting it. I haven't tried on battery power yet, but I haven't had a single crash since applying the "trick" and running on AC power. Also, I have an external monitor plugged in, but before the fix I would experience crashes quite frequently with an external monitor, too...

Mark K. (mbk-cs) wrote :

Meng, thanks for the solution. It seems (cross fingers) to be working on my m11x.

Del (delonly) wrote :

I think I have now tried just about every solution proposed, without success.

I am running Kubuntu 11.10 AMD64 on an Alienware M15X with GTX260M graphics card. Nvidia proprietary driver version 280.13.

The xorg.conf file posted did not set Nvidia PowerMizer to Prefer Maximum Performance here. When changing from Adaptive to Prefer Maximum Performance it does seem more stable, but xserver crashes do still occur anyway.

The laptop seems stable when the power cable is plugged in; I have only seen crashes when it is on battery power (consistent with multiple other users around the net).

I have tried setting grub parameter as suggested in this forum thread:
http://ubuntuforums.org/showthread.php?p=11310026
It did nothing for me.

Any suggestions are highly appreciated. Can anyone report a success story with this laptop? As it is not my laptop, a hack involving switching back and forth to a VT is unfortunately not an option.

Bryce Harrington (bryce)
Changed in nvidia-graphics-drivers-180 (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged