Jaunty locks up

Bug #374831 reported by Tom Oehser on 2009-05-11
This bug affects 1 person
Affects: xserver-xorg-video-ati (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Intrepid worked fine for weeks of uptime; Jaunty locks hard, with no trace of why, in less than 24 hours. The log sample below shows nothing between sensord logging normal results and the reboot (done by holding down the power button). Last night it apparently crashed around 1:45 AM:

Syslog is:

...
May 11 01:42:06 26M sensord: CPU0 Temp: 53.0 C (min = -55.0 C, max = 127.0 C)
May 11 01:42:06 26M sensord: CPU1 Temp: 53.0 C (min = -55.0 C, max = 127.0 C)
May 11 01:42:06 26M sensord: S-IO Temp: 39.0 C (min = -55.0 C, max = 127.0 C)
May 11 01:42:06 26M sensord: cpu0_vid: +0.000 V
May 11 06:58:18 26M syslogd 1.5.0#5ubuntu3: restart.
May 11 06:58:18 26M dhclient: Listening on LPF/eth0/00:0c:76:7f:e3:b4
May 11 06:58:18 26M dhclient: Sending on LPF/eth0/00:0c:76:7f:e3:b4
May 11 06:58:18 26M dhclient: Sending on Socket/fallback
May 11 06:58:18 26M kernel: Inspecting /boot/System.map-2.6.28-12-generic
May 11 06:58:19 26M dhclient: DHCPREQUEST of 192.168.1.2 on eth0 to 255.255.255.255 port 67
May 11 06:58:19 26M kernel: Cannot find map file.
May 11 06:58:19 26M kernel: Loaded 54494 symbols from 51 modules.
May 11 06:58:19 26M kernel: [ 0.000000] BIOS EBDA/lowmem at: 0009b400/0009b400
May 11 06:58:19 26M kernel: [ 0.000000] Initializing cgroup subsys cpuset
May 11 06:58:19 26M kernel: [ 0.000000] Initializing cgroup subsys cpu
May 11 06:58:19 26M kernel: [ 0.000000] Linux version 2.6.28-12-generic (buildd@rothera) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) ) #43-Ubuntu SMP Fri May 1 19:27:06 UTC 2009 (Ubuntu 2.6.28-12.43-generic)
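
Since nothing is logged at the moment of the hang, the most useful fact in the syslog is the silence itself. The following is a minimal sketch, not part of the original report, that finds the largest gap between consecutive syslog timestamps. It assumes the classic "Mon DD HH:MM:SS" prefix shown above, an English locale, and a year supplied by the caller (syslog does not record one); the function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2009):
    # Classic syslog lines start with "Mon DD HH:MM:SS" (15 characters).
    # The year is not logged, so the caller supplies it (assumption).
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def largest_gap(lines, year=2009):
    """Return (gap, last_line_before, first_line_after) for the longest silence."""
    best = (timedelta(0), None, None)
    prev_ts, prev_line = None, None
    for line in lines:
        try:
            ts = parse_ts(line, year)
        except ValueError:
            continue  # skip lines without a timestamp, such as "..."
        if prev_ts is not None and ts - prev_ts > best[0]:
            best = (ts - prev_ts, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


if __name__ == "__main__":
    sample = [
        "...",
        "May 11 01:42:06 26M sensord: cpu0_vid: +0.000 V",
        "May 11 06:58:18 26M syslogd 1.5.0#5ubuntu3: restart.",
        "May 11 06:58:19 26M kernel: Cannot find map file.",
    ]
    gap, before, after = largest_gap(sample)
    print(gap)     # 5:16:12
    print(before)  # last line written before the hang
```

On the excerpt above the gap runs from the last sensord line at 01:42:06 to the syslogd restart at 06:58:18, so the hang happened somewhere in that window.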

hardware is:

tom@26M:~$ lspci
00:00.0 Host bridge: Intel Corporation E7505 Memory Controller Hub (rev 03)
00:00.1 Class ff00: Intel Corporation E7505/E7205 Series RAS Controller (rev 03)
00:01.0 PCI bridge: Intel Corporation E7505/E7205 PCI-to-AGP Bridge (rev 03)
00:02.0 PCI bridge: Intel Corporation E7505 Hub Interface B PCI-to-PCI Bridge (rev 03)
00:02.1 Class ff00: Intel Corporation E7505 Hub Interface B PCI-to-PCI Bridge RAS Controller (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 82)
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AS [Radeon 9550]
01:00.1 Display controller: ATI Technologies Inc RV350 AS [Radeon 9550] (Secondary)
02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
03:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)
04:02.0 RAID bus controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
05:01.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)

This is an IBM IntelliStation Z Pro with a Radeon 9550 video card and a SiI 3512 PCI SATA card. It was totally stable on Intrepid...

Let me know what else I can do to diagnose... I'll just not upgrade anything else to Jaunty for now...

Tom Oehser (tom-toms) on 2009-05-11
description: updated
Tom Oehser (tom-toms) on 2009-05-12
tags: added: crash freezes jackalope jaunty lockup
Scott Howard (showard314) wrote :
Changed in ubuntu:
status: New → Incomplete

It isn't crashing as in an oops hex dump; it is freezing, as in a black screen
or frozen screen with no response to anything. But I'll read the links
and see what I can do... -Tom

On Thu, May 14, 2009 at 01:42:40AM -0000, Scott Howard wrote:
> Thanks for the report, but we'll need more information. To get information see:
> https://wiki.ubuntu.com/DebuggingXorg#Apport%20-%20Or%20debugging%20the%20easy%20way
> or
> https://help.ubuntu.com/community/DebuggingSystemCrash
>
>
> ** Changed in: ubuntu
> Status: New => Incomplete
>

--
"Let us do our duty in our shop or our kitchen, the market, the street, the
office, the school, the home, just as faithfully as if we stood in the front
rank of some great battle, and we knew that victory for mankind depended upon
our bravery, strength, and skill. When we do that the humblest of us will be
serving in that great army which achieves the welfare of the world."
 --Theodore Parker

Tom Oehser (tom-toms) wrote :

When it crashes:

 - Nothing is written to any logs anymore and no failures are written at crash time
 - The network is dead, ssh connections, pings, not so much :)
 - The screen is usually black. A couple of times, whatever was there (screensaver) remains visible
 - There are no hex oops type stuffs
 - The machine does not respond to keyboard input
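
Because the frozen machine cannot log its own failure, the time of the hang can be recorded from a second computer instead. The following is a minimal sketch, not something used in this report: it polls a TCP port (22, on the assumption that sshd is running on the affected box) and returns the time of the first failed probe. The function names and the polling interval are hypothetical.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, name lookup failed
        return False


def watch(host, port=22, interval=30.0, probe=is_reachable, max_checks=None):
    """Poll until the host stops answering.

    Returns the local time of the first failed probe, or None if
    max_checks probes all succeeded.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if not probe(host, port):
            return datetime.now()
        checks += 1
        time.sleep(interval)
    return None


if __name__ == "__main__":
    # 192.168.1.2 is the address from the DHCP lines in the syslog excerpt.
    print("stopped answering at", watch("192.168.1.2"))
```

A single failed probe can also be a network blip, so the reported time is only a starting point for narrowing down what the machine was doing when it locked up.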

Scott Howard (showard314) wrote :

Thanks for the info, Tom - it sounds like an X crash.

To determine the extent of the crash:
+ Does ctrl+alt+f1 take you to a console?
+ Does ctrl+alt+backspace restart X?
+ Does mouse pointer still move? [No, it doesn't]
+ Does the keyboard LED come on when hitting the CAPSLOCK key?
+ Can you ssh into the system from another computer? [No ssh, network, pings]

Can you attach the following files?
/var/log/Xorg.0.log
~/.xsession-errors
 and enable Apport as described:
https://wiki.ubuntu.com/X/Backtracing#Apport%20-%20Or%20debugging%20the%20easy%20way

I'm classifying this for xorg-server

affects: ubuntu → xorg-server (Ubuntu)
Tom Oehser (tom-toms) wrote :

Hmmm, not sure you read the earlier description carefully...

ctrl+alt+f1: as I said, no keyboard input works, including alt-sysrq-b etc.
ctrl+alt+backspace: see above.
mouse pointer: there has never been a mouse pointer left on screen. About 20 crashes so far: 18 with a black screen and 2 with some remnant of a video buffer.
keyboard LED: I'll double-check next time, but I'm pretty sure nothing like that changes at all.
ssh: again, see above; there is no network, no ssh, and no pings.

I doubt it is an X bug, actually, but I guess maybe. I'll attach the files.

Tom Oehser (tom-toms) wrote :

I'm going to do a simple test to see whether it is an X bug. I'm going to run "/etc/init.d/gdm stop" and go to bed with it on a Linux text console. If it stays up for a couple of days in console mode only, it must be an X bug. If it locks up without X running, I guess it wouldn't be...

Martin Olsson (mnemo) on 2009-05-15
affects: xorg-server (Ubuntu) → xserver-xorg-video-ati (Ubuntu)
Martin Olsson (mnemo) wrote :

@Tom, leaving the computer on overnight with X turned off is a great idea. It'll be interesting to hear what the results are.

Now, if that doesn't hang, you can also try adding various options to xorg.conf that make X.org less aggressive about graphics acceleration. You can start by trying these (add the option lines to the "Device" section in /etc/X11/xorg.conf and reboot to make them take effect) and leaving it on overnight:

  Option "NoAccel" "true"
  Option "DRI" "false"
  Option "RenderAccel" "false"

Tom Oehser (tom-toms) wrote :

Well, it made it overnight in console mode! Of course, it has made it through 8 hours of jaunty-X, (but never 24 hours of jaunty-X ...) I'll continue to leave it in text mode for another 30-40 hours to be more sure, then I'll start playing with the acceleration options and leave it in X again. Um, I notice in apt that the only version of Xorg available for Jaunty is the 7.5 thing or whatever... I sort of wish that downgrading X was an option. I _really_ don't want to install Intrepid again at this point.

Tom Oehser (tom-toms) wrote :

Guess I should probably turn off GLX, too, then if it doesn't crash in X, turn the options on one by one.
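
The "one by one" plan can be written down before starting, so each overnight run tests exactly one change. The following is a minimal sketch, not taken from the thread: it renders a "Device" section with all three workarounds suggested above, then one variant per workaround with that single option dropped. The Identifier value and the function names are assumptions; use whatever the existing xorg.conf declares.

```python
# Workarounds named in this thread; each option disables one acceleration feature.
WORKAROUNDS = [
    ("NoAccel", "true"),
    ("DRI", "false"),
    ("RenderAccel", "false"),
]


def device_section(options, identifier="Configured Video Device"):
    """Render a Device section for /etc/X11/xorg.conf.

    The identifier is an assumption, not a value from this report.
    """
    lines = ['Section "Device"', f'    Identifier "{identifier}"']
    lines += [f'    Option "{name}" "{value}"' for name, value in options]
    lines.append("EndSection")
    return "\n".join(lines) + "\n"


def bisect_plan(workarounds=WORKAROUNDS):
    """Yield (label, config): all workarounds first, then each one dropped in turn."""
    yield "all workarounds", device_section(workarounds)
    for name, _ in workarounds:
        remaining = [w for w in workarounds if w[0] != name]
        yield f"without {name}", device_section(remaining)


if __name__ == "__main__":
    for label, config in bisect_plan():
        print(f"# {label}\n{config}")
```

If the run with all workarounds stays up, the first variant that hangs points at the feature that was just re-enabled. Given how long each run takes, changing only one option per run keeps the result unambiguous.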

Bryce Harrington (bryce) on 2009-05-15
tags: added: freeze
Tom Oehser (tom-toms) wrote :

Welp... I think I'm up to about 20 hours of text console with dstat running, without a lockup. So I guess you are right, something in the video driver or whatever... Guess I can stick a different video card in it and the problem will go away...

Scott Howard (showard314) wrote :

Thanks for reporting this bug and any supporting documentation. Since this bug has enough information provided for a developer to begin work, I'm going to mark it as confirmed and let them handle it from here. Thanks for taking the time to make Ubuntu better!

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Confirmed
Martin Olsson (mnemo) wrote :

@Tom, note that it might still be a kernel issue even though it's significantly less likely (it might be that while X is running the kernel is exposed to a different workload and maybe that causes the kernel to exercise a few different execution paths).

FWIW, I have a box with a "RV350 AP [Radeon 9600] [1002:4150]" graphics card which didn't show any overnight lockups in jaunty and doesn't show such problems in karmic either. Your graphics card "RV350 AS [Radeon 9550]" sounds pretty similar to mine.

Trying with "NoAccel" "true" and then later also with "DRI" "false" would be very useful for making progress on this bug. If both of those hang we could also try to run the machine overnight using the "vesa" graphics driver instead of the -ati driver.

Another useful thing to investigate is whether it hangs if you boot the live CD and leave that running overnight (because this would completely rule out some non-standard config file change on the box causing these problems).

Currently, I don't think this bug report has sufficient information to be able to file a good upstream bug report to the developers.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Confirmed → Incomplete
Tom Oehser (tom-toms) wrote :

I need to have the machine stay up in X for a while, so I've swapped out the ATI for an NV for now, later I can make it crash again for debugging when someone is ready to help look at it, or when a driver update appears... the card that freezes btw is a Sapphire ATI Radeon 9550 R9550 256M DDR AGP V/VO/D HEATSINK PN 187-0EC20-01GSA SKU# 11032-41 which worked flawlessly with Intrepid.

I'll be reconfiguring another machine soon that I'll probably upgrade to Jaunty and put this card in just to confirm the issue in more isolation.

Tom Oehser (tom-toms) wrote :

I will try all the suggestions, I tried for a while with no DRI Accel etc. without it crashing, but not long enough to be sure. Probably tomorrow I could put it back in. For today, I need to get work done, so have stuck an NV in there for today...

Actually, I guess this is a useful test, too, if it crashes with the NV in there, it would sort of rule out the Radeon driver...

Tormod Volden (tormodvolden) wrote :

Please check https://wiki.ubuntu.com/X/Quirks#ATI%20AGP%20Mode%20Quirk and report back if it helps.

Bryce Harrington (bryce) wrote :

We're closing this bug since it has been some time with no response from the original reporter. However, if the issue still exists please feel free to reopen with the requested information. Also, if you could, please test against the latest development version of Ubuntu, since this confirms the bug is one we may be able to pass upstream for help.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → Invalid
Ari Mujunen (ari-mujunen) wrote :

We have exactly the same kind of ATI board ("01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AS [Radeon 9550]") which we can quite reliably and quickly crash in Jaunty by just taking "Preferences / Screensaver" and with "Preview" previewing some 3D-intensive screensavers; "SkyRocket" seems one of the quickest to kill the machine and turn off DVI signal to the LCD monitor.

We will experiment with 'Option "NoAccel" / "DRI" "false" / "RenderAccel"' and with 'Disable "glx"', but I suspect that anything which prevents screensavers from using OpenGL with direct rendering is going to help...

Tormod Volden (tormodvolden) wrote :

Ari, please experiment with AGPMode, and file a new bug if that helps. Those AGP mode quirks are specific to each hardware combination so a separate bug is better.

Tom Oehser (tom-toms) wrote :

Note, I took the lazy way out and changed hardware around. I think that motherboard and video card could be available for me to mess with if need be and time permits... but... unless my help is really needed... I've taken the "if it doesn't work when I do that, then don't do that" approach... I don't really want to spend the time to find the motherboard and card and put them with a power supply and hard drive and install Ubuntu on the combination... especially now that someone else has it recreated...

Ari Mujunen (ari-mujunen) wrote :

I fiddled with AGPMode and other settings (and did not file a new bug since nothing helped).

Model: MicroStar MS-7030 motherboard
Host bridge [0600]: nVidia Corporation nForce3 250Gb Host Bridge [10de:00e1] (rev a1)
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc RV350 AS [Radeon 9550] [1002:4153]
Card Subsystem: Club-3D BV Device [196d:1018]
HW Changes: none known
BIOS: Phoenix 6.00 PG Date: 08/26/2004, has AGP 3.0 Mode Auto/4x/4x8x, tried all of them.

Disabling DRI with:
---
Section "Module"
        Disable "dri"
EndSection
---
and nothing else in xorg.conf obviously works: crashes go away, but the frame rate of the gnome-screensaver module "SkyRocket" of course drops to 2-4 seconds per frame.

With BIOS AGP 3.0 Mode at Auto the driver detects it as 8x, and the "SkyRocket" module crashes within 10 seconds of its full-screen preview, when it is firing the first fireworks rocket... :-)

With:
---
Section "Device"
       Option "AGPMode" "4"
EndSection
---
as the only contents of xorg.conf, "SkyRocket" module still crashes in the same way and as quickly.
Slower AGPModes (2, 1) are rejected by the driver with this card.

With:
---
Section "Device"
       Option "BusType" "PCI"
       Option "AGPMode" "1"
EndSection
---
the crash stays essentially the same, maybe it lives longer for a second or two.

The above were repeated with Device/Monitor/Screen sections added so that 'Virtual 1920 1080' can be defined; the driver defaults to:
---
(II) RADEON(0): Max desktop size set to 2560x1200
(II) RADEON(0): For a larger or smaller max desktop size, add a Virtual line to your xorg.conf
(II) RADEON(0): If you are having trouble with 3D, reduce the desktop size by adjusting the Virtual line to your xorg.conf
---
Adjusting the Virtual size makes no difference, still crashes.

I think I'm just going to give up, since the user of this machine has previously used the IceWM environment and is not actually expecting any 3D... after switching over to GNOME it was just easy to experiment with nice 3D screensavers... I've left the xorg.conf at 'Disable "dri"' so for example the screensaver modules still "work", albeit slowly.
