random Xorg crash

Bug #620278 reported by boblinux
This bug affects 1 person
Affects: nvidia-graphics-drivers (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello,

I spent a lot of time browsing the Web and Launchpad before posting, but I did not find any fix for my issue.

My crash appears randomly. It has been happening for months, across many Lucid upgrades. I updated Lucid yesterday.

It's pretty hard to find constant factors that would help the diagnosis. However:

- the crash ALWAYS happens on keyboard input, and on keyboard input only, after pressing the Enter key; but it is really not frequent: I can work for hours (or maybe days) before the crash happens
- I work a lot in xterms; many of my crashes happen in an xterm, but some happen in Firefox as well
- the following string seems to be doomed: many of my crashes happen when I press Enter in the xterm at the end of this line:

cd sancho_ed2k/sancho-0.9.4-59-linux-gtk

This morning's crash happened in exactly this situation.
I must mention that in this xterm I am connected over ssh to a second desktop.

After logging back into my session, the same command in the remote ssh session did not trigger the crash. It looked as if some buffer had filled up, could not receive more data, and crashed X?

- changing directory with konqueror does not (seem to) trigger the crash
- I am not 100% sure, but the crash seems to appear more often with kdm than with gdm; however, this morning the crash happened with gdm running

When I have time (this evening probably) I will try to follow https://wiki.ubuntu.com/X/Backtracing in order to add some information to this report.

I hope a solution will be found; it is a stressful bug.

Regards
R. Grasso

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xorg 1:7.5+5ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-24.39-generic 2.6.32.15+drm33.5
Uname: Linux 2.6.32-24-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
Date: Thu Aug 19 07:16:16 2010
MachineType: System manufacturer P5K
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=9db8108a-446b-454a-a914-cc1049a6ed07 ro console=tty1
ProcEnviron:
 LC_COLLATE=C
 PATH=(custom, user)
 LANG=en_US.utf8
 SHELL=/usr/bin/tcsh
SourcePackage: xorg
Symptom: display
Title: Xorg crash
dmi.bios.date: 10/14/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1201
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: P5K
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1201:bd10/14/2008:svnSystemmanufacturer:pnP5K:pvrSystemVersion:rvnASUSTeKComputerINC.:rnP5K:rvrRev1.xx:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: P5K
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
system:
 distro: Ubuntu
 codename: lucid
 architecture: x86_64
 kernel: 2.6.32-24-generic

boblinux (robert-grasso) wrote :
Leo Arias (elopio) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. This bug did not have a package associated with it, which is important for ensuring that it gets looked at by the proper developers. You can learn more about finding the right package at https://wiki.ubuntu.com/Bugs/FindRightPackage. I have classified this bug as a bug in xorg.

When reporting bugs in the future please use apport, either via the appropriate application's "Help -> Report a Problem" menu or using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

affects: ubuntu → xorg (Ubuntu)
Leo Arias (elopio) wrote :

Marking as incomplete until you provide the debugging information required to find the cause of the issue.
Please look at https://wiki.ubuntu.com/X/Troubleshooting and follow the guide that corresponds to your problem. Report back any useful findings and add as separate attachments all the requested files.

Thanks.

Changed in xorg (Ubuntu):
status: New → Incomplete
boblinux (robert-grasso) wrote :

Hello Leo,

About the missing package: I removed it on purpose. According to https://wiki.ubuntu.com/Bugs/FindRightPackage, should I understand that you will use it as a pointer to begin the troubleshooting search? As a senior Unix sysadmin, I did not feel it was correct to state up front which package seemed to have failed; to me, on the contrary, finding the true cause should be the GOAL of the diagnosis. I have been wondering lately whether kdm could be passing incorrect information to X. So I did not agree with naming xorg as the failed package straight away. Had that been my initial assumption, I would have stated it clearly in my post.

boblinux (robert-grasso) wrote :

Hello Leo,

About using https://wiki.ubuntu.com/X/Troubleshooting as a guideline for adding meaningful information: the only entry point in this document that matches my case is:

"Problem results in X crash or exit"

However, the only suggestion is to run "gdb to get a backtrace". Which process should I run under gdb: X? Unfortunately I have not done development work for years, so using gdb expertly enough to report valuable information, as this paragraph asks, is clearly beyond my current abilities. Additionally, I am not sure it would actually help: what if X is not the true culprit?

I don't know what you think about it. My initial intention, as I stated in my first post, was to follow the guidelines provided by

https://wiki.ubuntu.com/X/Backtracing

I saw in there many interesting suggestions, and my feeling was that these are general and broad enough, and they would be more useful in order to deal with a random and unknown problem.

What is your opinion ?

Currently, my best starting point is the backtrace from Xorg.0.log, also found in the kdm log or in the gdm log :

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x28) [0x4a3258]
1: /usr/bin/X (0x400000+0x655bd) [0x4655bd]
2: /lib/libpthread.so.0 (0x7f025f381000+0xf8f0) [0x7f025f3908f0]
3: /lib/libc.so.6 (__select+0x13) [0x7f025e138fc3]
4: /usr/bin/X (WaitForSomething+0x1ba) [0x45f98a]
5: /usr/bin/X (0x400000+0x30952) [0x430952]
6: /usr/bin/X (0x400000+0x261aa) [0x4261aa]
7: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7f025e078c4d]
8: /usr/bin/X (0x400000+0x25d59) [0x425d59]

Caught signal 3 (Quit). Server aborting

Other people posted the same backtrace; unfortunately it seems that this same backtrace is dumped for different issues, so what helped them was of no use to me.
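
A minimal sketch for pulling such a block out of a log file, assuming the log uses the same "Backtrace:" ... "Server aborting" framing as the excerpt above (other Xorg versions may format it differently). The function name extract_backtrace and the default log path are assumptions made for illustration.

```shell
#!/bin/sh
# Print everything from a "Backtrace:" line through the following
# "Server aborting" line in the given log file.
extract_backtrace() {
    awk '/^Backtrace:/     { show = 1 }
         show              { print }
         /Server aborting/ { show = 0 }' "$1"
}

# Usual location of the current server log; adjust if yours differs.
log=/var/log/Xorg.0.log
if [ -r "$log" ]; then
    extract_backtrace "$log"
fi
```

Once the server has been restarted, the log of the crashed session is usually the one with the .old suffix, so that is the file to pass instead.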

Best regards
R. Grasso

Leo Arias (elopio) wrote :

Hello Robert,

There are a lot of bugs that are assigned only to Ubuntu, so we are always trying to classify them. I don't really know if this is a problem in xorg or kdm, and I could have made a mistake. But assigning the most probable package is a good idea because it will draw the attention of the developers of that package, who can guide us in finding out whether it is an xorg problem or not. If the traces show that the problem is not in xorg, we can always change the package :)

Please follow the instructions at [1] and attach the gdb-Xorg.txt. If you have doubts about the debugging process you can ask here or on #ubuntu-bugs and somebody will help you.
With that information, somebody with more knowledge about xorg than me can continue working on the bug with you.

And I'll add a link to [1] on [2] . Thanks for pointing it out.

[1]https://wiki.ubuntu.com/X/Backtracing
[2]https://wiki.ubuntu.com/X/Troubleshooting/Other#Problem%20results%20in%20X%20crash%20or%20exit

boblinux (robert-grasso) wrote :

Ok Leo, I understand.

I will review my gdb skills briefly :-) Good ol'days ;-)

It's getting late here in France (all this is happening at home ;-) ) and I am not sure I am going to set all this up this evening. More probably this weekend. I already started apport ;-) But the sooner the better.

Bryce Harrington (bryce)
affects: xorg (Ubuntu) → xorg-server (Ubuntu)
Bryce Harrington (bryce)
tags: added: kubuntu
boblinux (robert-grasso) wrote :

Hello,

So, here is my report. It is not exactly the one we expected, however:

Last Friday evening, after briefly reading the doc, I understood that attaching gdb to X was dead simple, so I tried, and X froze completely. It was late, so I dropped it and have resumed the debug session now.

It kept freezing. Meanwhile, looking closely at the situation, I noticed that the doc mentions attaching to /usr/bin/Xorg; however, on my Ubuntu 10.04 (I have two desktops at home, one x86_64 and one i686), ps -ef|grep X shows /usr/bin/X running, not /usr/bin/Xorg. I looked more closely:

/usr/bin/X is provided by xserver-xorg, whose description claims:

" This package depends on the full suite of the server and drivers for the
 X.Org X server. It does not provide the actual server itself."

Really ?

whereas /usr/bin/Xorg is provided by xserver-xorg-core. We have xserver-xorg-core-dbg but no xserver-xorg-dbg, which can explain why, before freezing X, gdb reported that it found no symbols when attaching to X ...

comparing X and Xorg :

-rwsr-sr-x 1 root root 10,520 2010-04-09 04:07 /usr/bin/X
-rwxr-xr-x 1 root root 1,901,280 2010-07-21 15:08 /usr/bin/Xorg

two completely different executables: one with the setuid and setgid bits set (the "s" in the mode string), and so tiny!

This situation looks messy, don't you think?
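
For what it's worth, the lowercase s in -rwsr-sr-x marks the setuid and setgid bits rather than the sticky bit (which shows as t). A minimal sketch for checking this from a shell follows; the helper name mode_flags is made up for illustration, and the two paths are the ones from the listing above.

```shell
#!/bin/sh
# Report which special permission bits are set on a file, using the
# shell's test operators -u (setuid), -g (setgid) and -k (sticky).
mode_flags() {
    flags=""
    [ -u "$1" ] && flags="$flags setuid"
    [ -g "$1" ] && flags="$flags setgid"
    [ -k "$1" ] && flags="$flags sticky"
    echo "$1:${flags:- none}"
}

for f in /usr/bin/X /usr/bin/Xorg; do
    # Skip paths that do not exist on this system.
    if [ -e "$f" ]; then
        mode_flags "$f"
    fi
done
```

A small setuid root binary sitting in front of a much larger unprivileged one is the typical shape of a wrapper, which would fit the sizes shown above.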

I just connected through VPN at work : on CentOS 5.5, it's definitely /usr/bin/Xorg - which seems consistent with today's fashion (I am an old timer, remember ? I knew X11R6 on HP-UX and SunOS, not to mention Solaris)

just to give it a try, in /etc/kde4/kdm/kdmrc, I replaced

ServerCmd=/usr/bin/X

with

ServerCmd=/usr/bin/Xorg

and restarted kdm

boblinux (robert-grasso) wrote :
boblinux (robert-grasso) wrote :

(continuation)

Oh I am sorry ! I assumed I could attach a file independently from the edit session ! apparently not ;-)

so, now, kdm started Xorg and not X (as it should be ? you tell me !)

I did not have any further luck with gdb: when I attached it to Xorg, my screen froze as well. I guess I am giving some work to somebody here ...

Anyway, you may compare the two gdb logs I attached. Let me be accurate: X froze when I pressed Enter at the end of the "attach" line in both cases; then I connected from my i686 desktop and killed gdb, and fortunately this released X/Xorg. It is interesting to see that with Xorg many .so files (only the normal ones) are loaded, whereas with X only /lib64/ld-linux-x86-64.so.2 is loaded. From my humble point of view, Xorg seems to be the healthy one?

I am going to try my "doomed command" one last time, so I am posting this partial comment and will finish with a third one. Stay tuned!

boblinux (robert-grasso) wrote :

Well, so far my "doomed command" did not kill X (I MUST say Xorg!!). That is not sufficient proof, so I am going to keep doing my home things as usual, running Xorg and not X, and we'll see.

I am sorry I could not report anything from gdb.

Anyway, did these tests already give any insights to anybody ? And what is that X/Xorg mess ? Looks like some disagreement inside some team ...

I will be on a long holiday from September 1st, but I can run some tests this week if required.

boblinux (robert-grasso) wrote :

I would add : on CentOS 5.5, we actually have :

lrwxrwxrwx 1 root root 4 May 18 13:18 /usr/bin/X -> Xorg

X is a LINK, and not some separate weird binary

boblinux (robert-grasso) wrote :

I did not shut down my desktops yesterday - Xorg keeps working, with kdm in the background - my doomed command does not seem to be doomed any more - these were today's news ...

boblinux (robert-grasso) wrote :

Ok, at least I understood

man Xwrapper.config

(quite Debian specific, this X wrapper, I just discovered it ! RHEL/Fedora, which I know better, don't implement it - fixing things helps you increase your knowledge !)

well, according to Debian policies, let's say that I have a dirty workaround. Now it's up to you guys.

boblinux (robert-grasso) wrote :

Hello,

I did not read the instructions correctly the first time, and did not start gdb from my second desktop at first; this is why X "froze". I understood two days ago, so I restored

ServerCmd=/usr/bin/X

in /etc/kde4/kdm/kdmrc and restarted kdm, which now runs the normal X wrapper

root 31084 31080 2 Aug25 tty7 00:28:44 /usr/bin/X -br -nolisten tcp :0 vt7 -auth /var/run/xauth/A:0-LqbUrc

and since then I have had gdb attached to this wrapper from my second desktop.

So far, no X crash ... I may have to wait more ...

boblinux (robert-grasso) wrote :

Question : can somebody modify the page

https://wiki.ubuntu.com/X/Backtracing

and replace, in paragraph "Backtrace with gdb" :

sudo gdb /usr/bin/Xorg 2>&1 | tee gdb-Xorg.txt

with

sudo gdb /usr/bin/X 2>&1 | tee gdb-X.txt

which should be correct in our Debian/Ubuntu world?
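
On the question of which binary gdb should load, here is a minimal sketch. It assumes (this is not confirmed anywhere in this thread) that the Debian/Ubuntu /usr/bin/X wrapper hands over to the real server with exec, so that ps keeps showing /usr/bin/X while the process actually runs another file. On Linux, /proc/PID/exe names the executable a process is really running. The function name server_exe is made up for illustration.

```shell
#!/bin/sh
# Print the executable that a running process is really using.
# /proc/PID/exe is a Linux-specific symlink; reading it for a root-owned
# process such as the X server normally requires root.
server_exe() {
    readlink "/proc/$1/exe"
}

# Example: inspect the current shell (works without root).
server_exe "$$"
```

For the X server one would look up its PID with ps -ef (31084 in the ps line quoted earlier) and run sudo readlink /proc/31084/exe. If that printed /usr/bin/Xorg, the debug symbols would have to come from xserver-xorg-core-dbg, whatever name ps shows.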

boblinux (robert-grasso) wrote :

Hello,

I just had an X crash, so I immediately ran 'backtrace full' in gdb. Here are the results:

(gdb) backtrace full
#0 0x00007fc5339465e1 in ?? ()
No symbol table info available.
#1 0x0000000002001160 in ?? ()
No symbol table info available.
#2 0x0000000001fc96b0 in ?? ()
No symbol table info available.
#3 0x0000000000000000 in ?? ()
No symbol table info available.

To me this seemed unavoidable, as xserver-xorg is not shipped with a dbg counterpart.

I am willing to help, but what can I do here? I followed the instructions you gave me. Did I make another mistake?

Can somebody tell me ?

Chris Halse Rogers (raof) wrote :

Your backtrace suggests that X is being killed with a SIGQUIT signal. We had this problem earlier in the Lucid development cycle, and it was caused by an interaction between X and plymouth.

I notice you've got a non-standard “console=tty1” option on your kernel command line. Does this behaviour persist if you use just the standard “quiet splash” kernel command line options?
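
As a quick cross-check of the "signal 3" reading, here is a minimal sketch that maps a signal number to its name with the shell's kill -l. The helper name signal_name is made up for illustration.

```shell
#!/bin/sh
# Translate a signal number into its name using the shell's built-in kill.
signal_name() {
    kill -l "$1"
}

# The Xorg log quoted earlier reports "Caught signal 3 (Quit)".
signal_name 3
```

On Linux this prints QUIT, i.e. SIGQUIT, which matches the "Caught signal 3 (Quit). Server aborting" line in the log.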

boblinux (robert-grasso) wrote :

Thanks for your post ! I just removed “console=tty1” and rebooted - unfortunately I did not write down why I set this value.

I hope you don't mind that I did not restore “quiet splash”: I hate hiding the very valuable boot messages under the hood (I usually force text mode on RHEL as well).

I attached gdb to the X wrapper again - we'll see

boblinux (robert-grasso) wrote :

I just had another problem; I was not interacting with the OS, I just came back and saw X stopped : here is the backtrace :

Program received signal SIGPIPE, Broken pipe.
0x00007f71a827e5e1 in ?? ()
(gdb) backtrace full
#0 0x00007f71a827e5e1 in ?? ()
No symbol table info available.
#1 0x0000000000baf170 in ?? ()
No symbol table info available.
#2 0x0000000000b508d0 in ?? ()
No symbol table info available.
#3 0x0000000000000000 in ?? ()
No symbol table info available.

Any suggestion ?

boblinux (robert-grasso) wrote :

Well, actually the previous backtrace happened when I had trouble with my cx88_blackbird driver: I did not get any TV data at all, so I removed it with `modprobe -r cx88_blackbird', then loaded it again. It happened twice. The following one happened when I wanted to quit xawtv with Ctrl-Q:

Program received signal SIGPIPE, Broken pipe.
0x00007f71a827e5e1 in ?? ()
(gdb) backtrace full
#0 0x00007f71a827e5e1 in ?? ()
No symbol table info available.
#1 0x0000000000000000 in ?? ()
No symbol table info available.

Timo Aaltonen (tjaalton)
affects: xorg-server (Ubuntu) → nvidia-graphics-drivers (Ubuntu)
Leo Arias (elopio)
Changed in nvidia-graphics-drivers (Ubuntu):
status: Incomplete → New
dino99 (9d9) wrote :

That version is no longer supported; please open a new bug report if the version currently in the archive has the same issue.

Changed in nvidia-graphics-drivers (Ubuntu):
status: New → Invalid