xorg crashes randomly with seg fault

Bug #620478 reported by Stephan
This bug affects 2 people
Affects: nvidia-graphics-drivers (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

This is similar to bug 578750, but on my system this particular bug only started to occur recently.

I'm getting frequent crashes of the X server with this stacktrace in the logs:

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x3b) [0x80e938b]
1: /usr/bin/X (0x8048000+0x61c8d) [0x80a9c8d]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0x7c8410]
Segmentation fault at address 0x40080884

This backtrace was captured without debugging symbols; a version captured after I installed xorg with debugging symbols can be found in Xorg.0.log.old.

More info:
I'm using the "nv" driver with TwinView (via xorg.conf), and this configuration had been working fairly stably for a while. I can't say precisely when the instability returned, but possibly the upgrade of xserver-xorg-core to 2:1.7.6-2ubuntu7.3 escalated the situation. I did the upgrade three days ago, and while it feels as if the problems started earlier than that, during the last three days the system has crashed several times a day (or even several times an hour).

Suspecting TwinView to have a finger in the pie, I disabled TwinView, but X crashed only a few seconds after restarting.

I can't see a connection to any application. Usually I'm in Eclipse (with gtk) when it happens, but that's only because that is where I spend most of my time. The crash might be related to mouse activity (I never saw it without doing something with the mouse). One crash, for example, happened just as I clicked on the (KDE) desktop.

Another symptom I've been seeing is not a crash but a freeze (no reaction to any input device, including non-working keyboard LEDs); I'm not sure if this is related. Usually, when the system freezes, the full screen first flickers briefly before the freeze, and I believe this symptom applies equally to the freeze and the crash. At other times the flicker occurs and after a second or so the system becomes responsive again (a micro freeze, you might say).

Please let me know what further information would help to identify the problem.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xserver-xorg-core 2:1.7.6-2ubuntu7.3
ProcVersionSignature: Ubuntu 2.6.32-24.39-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic i686
NonfreeKernelModules: nvidia
Architecture: i386
Date: Thu Aug 19 16:45:28 2010
DkmsStatus:
 nvidia-current, 195.36.24, 2.6.32-23-generic, i686: installed
 nvidia-current, 195.36.24, 2.6.32-24-generic, i686: installed
Lsusb:
 Bus 002 Device 003: ID 045e:00dd Microsoft Corp.
 Bus 002 Device 002: ID 046d:c050 Logitech, Inc. RX 250 Optical Mouse
 Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 001 Device 004: ID 058f:6362 Alcor Micro Corp. Hi-Speed 21-in-1 Flash Card Reader/Writer (Internal/External)
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: System manufacturer System Product Name
ProcCmdLine: root=UUID=0b16b3df-690d-4404-afc9-9560c74fdc5a ro quiet splash noapic
ProcEnviron:
 LANGUAGE=en_AU:en
 PATH=(custom, user)
 LANG=en_AU.UTF-8
 SHELL=/bin/bash
SourcePackage: xorg-server
dmi.bios.date: 06/15/2006
dmi.bios.vendor: Phoenix Technologies, LTD
dmi.bios.version: ASUS M2N-SLI DELUXE ACPI BIOS Revision 0304
dmi.board.name: M2N-SLI DELUXE
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: 1.XX
dmi.chassis.asset.tag: 123456789000
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnPhoenixTechnologies,LTD:bvrASUSM2N-SLIDELUXEACPIBIOSRevision0304:bd06/15/2006:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnM2N-SLIDELUXE:rvr1.XX:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
system:
 distro: Ubuntu
 codename: lucid
 architecture: i686
 kernel: 2.6.32-24-generic

Stephan (stephan-h) wrote :
affects: xorg-server (Ubuntu) → nvidia-graphics-drivers (Ubuntu)
Stephan (stephan-h) wrote :

As I just had another crash (exact same backtrace), I found that it also
correlates with these entries in syslog:

[ 2610.985055] NVRM: Xid (0007:00): 6, PE0001
...

Interestingly, this problem is logged several times without crashing
(might correlate with those micro-freezes I mentioned?),
but eventually the NVRM: Xid precedes the death of X and kdm
(in this case with a 7 second delay, hm)

That machine was pretty stable until a few days ago, but at this point
it is not usable for work: crashes happen frequently,
and I cannot see them correlate with any specific action that I could avoid.

All this implies that I'm more than willing to provide additional information
for analysing the problem.

Stephan (stephan-h) wrote :

Update: I gave it another chance after finding bug 526857 comment 5.
I applied that workaround and the machine has now been running for
a few hours without a crash or freeze.

I still had a few
  NVRM: Xid (0007:00): 6, PE0001
which did not correlate to any observable freeze/flicker, then:
Aug 21 18:19:49 luna kernel: [ 5907.594173] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 00000860 Data ffffff78
Aug 21 19:06:59 luna kernel: [ 8736.997129] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004
Aug 21 19:07:03 luna kernel: [ 8741.289690] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004
Aug 21 19:07:07 luna kernel: [ 8745.370445] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004
which caused a flicker and a temporary freeze. The time between
log-entries could well be equal to the time of the freeze.
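
To check that timing claim, the gaps between consecutive Xid entries can be computed from the kernel uptime stamps. This is only a minimal Python sketch, assuming syslog lines in the format quoted above; the names XID_RE, xid_events and gaps are made up for illustration and are not part of any tool.

```python
import re

# Assumed line format: "... [ 5907.594173] NVRM: Xid (0007:00): 3, ..."
# Group 1 is the kernel uptime in seconds, group 2 the Xid code.
XID_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \([0-9a-fA-F:]+\): (\d+),")


def xid_events(lines):
    """Return (uptime_seconds, xid_code) for every NVRM Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append((float(match.group(1)), int(match.group(2))))
    return events


def gaps(events):
    """Seconds between each pair of consecutive events, rounded to ms."""
    return [round(later[0] - earlier[0], 3)
            for earlier, later in zip(events, events[1:])]


sample = [
    "Aug 21 18:19:49 luna kernel: [ 5907.594173] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 00000860 Data ffffff78",
    "Aug 21 19:06:59 luna kernel: [ 8736.997129] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004",
    "Aug 21 19:07:03 luna kernel: [ 8741.289690] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004",
    "Aug 21 19:07:07 luna kernel: [ 8745.370445] NVRM: Xid (0007:00): 3, C 00000001 SC 00000002 M 000002ac Data 00000004",
]

print(gaps(xid_events(sample)))
```

For the three entries logged around 19:07 this gives gaps of roughly four seconds each, which can then be compared against the observed length of the freeze.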

Stephan (stephan-h) wrote :

Ouch, machine died again (X crash this time):

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x3b) [0x80e938b]
1: /usr/bin/X (0x8048000+0x61c8d) [0x80a9c8d]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0x461410]
Segmentation fault at address 0x40300884

Caught signal 11 (Segmentation fault). Server aborting

This time nothing with NVRM: Xid.
Unsure if this means that these are two different bugs.
At least now we know that the workaround from bug 526857 comment 5
is no solution for this problem.

Stephan (stephan-h) wrote :

Some more updates:

- I suspected that a broken CD drive might influence the behavior, so I disconnected the drive, but still saw crashes.

- I tried all 3 performance levels (see bug 526857 comment 5), X still crashes in all these configurations.

- I observed one freeze (not crash) while logging into KDE, with no mouse action at that time (compare the hypothesis above about a connection between mouse activity and the bug). Because it was a hard freeze (not even sys-req emergency umount or emergency reboot worked), no traces of the problem could be found on the machine.

-----------------------

Given the lack of activity in this report, let me ask whether there is any use in providing more info in this bug?

I intentionally did not try any other versions of any software involved, because the affected machine could be a good testbed for future attempts to narrow down what exactly is broken here.

However, I can't wait forever, since, as mentioned repeatedly, the machine is unusable for any serious work in this state.
I will have to install a working OS on this machine soon.

Per Wahlström (per-wahlstrom) wrote :

Running Ubuntu 9.10, and after replacing motherboard, CPU and RAM, this started happening to me too.

Old setup ---> no freezes/crashes:

MB: Asus P5W DH Deluxe
CPU: Core 2 Duo E6600
RAM: 4x1 GB Corsair whatever
Asus nvidia 8800GT card
Ubuntu 9.10, kernel 2.6.31-14
nvidia driver version: 185.18.36

New setup ---> freezes occur:

MB: Asus P6T SE
CPU: Core i7 930
RAM: 6x2 GB Corsair XMS3 1333MHz
Same Asus nvidia 8800GT card
Same ubuntu installation on same HD

Running glxgears is a sure way to reproduce the system freeze: causes almost instant freeze, every time.

Watching video will cause a freeze after some time if I have previously switched desktops with the Compiz Fusion 3D cube.
Switching the CPU to 1 core enables me to run glxgears for some time, but an occasional freeze can still happen, for example when searching on Google while the search-box auto-complete is active.

Syslog after a 1-second 'mini-freeze':
kernel: [11144.963412] NVRM: Xid (0002:00): 6, PE007f
kernel: [11144.965148] NVRM: Xid (0002:00): 7, Ch 0000007f M 00001ffc D ffffffff intr ffffffff
kernel: [11144.966874] NVRM: Xid (0002:00): 26, Ch 0000007f M 00001ffc D ffffffff intr ffffffff
kernel: [11144.968600] NVRM: Xid (0002:00): 4, Ch 00000005 acquireValue 11111111 dmaPut 2004be6c dmaGet 2004be6c

Syslog after a total freeze 25 seconds later:
kernel: [11170.422373] NVRM: Xid (0002:00): 6, PE007f
kernel: [11170.425820] NVRM: Xid (0002:00): 7, Ch 0000007f M 00001ffc D ffffffff intr ffffffff
kernel: [11170.429223] NVRM: Xid (0002:00): 26, Ch 0000007f M 00001ffc D ffffffff intr ffffffff
kernel: [11181.488364] NVRM: Xid (0002:00): 8, Channel 00000005
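
To make the pattern in these two excerpts easier to see, the Xid codes can be grouped into bursts by their kernel timestamps. This is a rough Python sketch, assuming the line format shown above; group_bursts and the 1-second max_gap threshold are illustrative choices, not anything provided by the driver.

```python
import re

# Assumed line format: "kernel: [11144.963412] NVRM: Xid (0002:00): 6, PE007f"
# Group 1 is the kernel uptime in seconds, group 2 the Xid code.
XID_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \([0-9a-fA-F:]+\): (\d+),")


def group_bursts(lines, max_gap=1.0):
    """Group Xid codes into bursts.

    A new burst starts whenever more than max_gap seconds have passed
    since the previous Xid line (max_gap is an arbitrary threshold).
    """
    bursts = []
    last_time = None
    for line in lines:
        match = XID_RE.search(line)
        if not match:
            continue
        stamp, code = float(match.group(1)), int(match.group(2))
        if last_time is None or stamp - last_time > max_gap:
            bursts.append({"start": stamp, "codes": []})
        bursts[-1]["codes"].append(code)
        last_time = stamp
    return bursts


log = [
    "kernel: [11144.963412] NVRM: Xid (0002:00): 6, PE007f",
    "kernel: [11144.965148] NVRM: Xid (0002:00): 7, Ch 0000007f M 00001ffc D ffffffff intr ffffffff",
    "kernel: [11144.966874] NVRM: Xid (0002:00): 26, Ch 0000007f M 00001ffc D ffffffff intr ffffffff",
    "kernel: [11144.968600] NVRM: Xid (0002:00): 4, Ch 00000005 acquireValue 11111111 dmaPut 2004be6c dmaGet 2004be6c",
    "kernel: [11170.422373] NVRM: Xid (0002:00): 6, PE007f",
    "kernel: [11170.425820] NVRM: Xid (0002:00): 7, Ch 0000007f M 00001ffc D ffffffff intr ffffffff",
    "kernel: [11170.429223] NVRM: Xid (0002:00): 26, Ch 0000007f M 00001ffc D ffffffff intr ffffffff",
    "kernel: [11181.488364] NVRM: Xid (0002:00): 8, Channel 00000005",
]

for burst in group_bursts(log):
    print(burst["start"], burst["codes"])
```

On the eight lines above this separates the mini-freeze burst (codes 6, 7, 26, 4) from the later burst (6, 7, 26) and the final lone Xid 8 about 11 seconds after it.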

Tried installing the latest driver from nvidia (256.53); glxgears causes an instant freeze with it too.

To recap: the freezes started occurring after I replaced the MB, CPU and RAM.

Mike Homer (homerhomer) wrote :

I'm having this too. :(
[ 247.613222] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001518 3f800000 0000000d
[ 248.530114] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001518 3f800000 0000000d
[ 249.089444] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001514 00000000 0000000d
[ 251.625693] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001b0c 01007002 00000003
[ 255.682719] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001518 3f800000 0000000d
[ 256.439137] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001420 00000000 00000004
[ 262.402521] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001340 00000006 00000005
[ 278.367232] NVRM: Xid (0002:00): 13, 0005 00000000 00008297 00001b0c 10007010

dino99 (9d9) wrote :

That version is no longer supported; please open a new bug report if the version currently in the archive has the same issue.

Changed in nvidia-graphics-drivers (Ubuntu):
status: New → Invalid
Stephan (stephan-h) wrote :

"invalid" is a funny classification for a bug that made Ubuntu unusable on certain machines at the time of reporting.

After 5 years with no action you may be sure that I'm not using an nvidia card/driver any more, bye.
