Lucid: system freezes randomly possibly due to bug on radeon drm module

Bug #567805 reported by Daniel D.
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

With kernel 2.6.32-21.32 I consistently get system freezes while restoring data from a remote tape backup using amanda. The tape drive is on a server, and I access it from the affected workstation using amrecover.

The restore causes heavy network load (the maximum for my 100 Mbit network) and heavy disk load. The disk is a Linux software RAID5 array on three SATA-II hard drives connected to a Silicon Image fakeRAID controller (the fakeRAID feature is not in use).

After about 30 minutes of this, the system reports various kernel-related crashes. Neither CPU is maxed out until the crash: during the normal part of the restore both CPUs run at 30-40%, but when the system crashes one CPU is stuck at 100%.

Using the mainline kernel 2.6.34-0206034rc5 there is no problem.

Which kernel is the final Lucid release going to use?

Also I had no problems on Karmic Koala.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-21-generic 2.6.32-21.32
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: daniel 2212 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'NVidia'/'HDA NVidia at 0xfc000000 irq 20'
   Mixer name : 'Realtek ALC883'
   Components : 'HDA:10ec0883,1458c601,00100002'
   Controls : 40
   Simple ctrls : 22
Date: Wed Apr 21 06:14:31 2010
EcryptfsInUse: Yes
HibernationDevice: RESUME=UUID=f1947840-cfe8-45c4-98b1-ff75dc756947
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta amd64 (20100406)
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. M61PME-S2P
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-21-generic root=/dev/mapper/hostname-root ro quiet splash
ProcEnviron:
 LANG=en_CA.utf8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.34
RfKill:

SourcePackage: linux
dmi.bios.date: 12/30/2008
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: F2
dmi.board.name: M61PME-S2P
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrF2:bd12/30/2008:svnGigabyteTechnologyCo.,Ltd.:pnM61PME-S2P:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnM61PME-S2P:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: M61PME-S2P
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Daniel D. (cshoredaniel-deactivatedaccount) wrote :

Two corrections. The freeze also occurs with the mainline kernel, and is not related to system load on the network or disk. It appears in fact to be an error in the radeon drm module.

The following messages were filling the syslog and scrolling rapidly when I switched to VT1.

Apr 23 05:00:56 daniloth kernel: [13267.008713] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Apr 23 05:00:56 daniloth kernel: [13267.010063] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(14).
Apr 23 05:00:56 daniloth kernel: [13267.010070] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Apr 23 05:00:56 daniloth kernel: [13267.023439] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(15).
Apr 23 05:00:56 daniloth kernel: [13267.023448] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
Apr 23 05:00:56 daniloth kernel: [13267.052651] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(0)
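
For anyone trying to gauge how often these errors occur in their own logs, here is a minimal sketch that counts the two messages quoted above. It assumes each message sits on a single, unwrapped line, and that the message text matches the excerpt exactly (including the "Faild" spelling). The function name `summarize_radeon_errors` is hypothetical and not part of any existing tool.

```python
import re
from collections import Counter

# Patterns copied from the messages quoted above.
# "Faild" is the spelling that appears in the log excerpt.
IB_SCHEDULE_RE = re.compile(
    r"\[drm:radeon_ib_schedule\] \*ERROR\* radeon: couldn't schedule IB\((\d+)\)"
)
CS_IOCTL_RE = re.compile(
    r"\[drm:radeon_cs_ioctl\] \*ERROR\* Faild to schedule IB"
)


def summarize_radeon_errors(lines):
    """Count radeon IB scheduling errors in an iterable of syslog lines.

    Returns a (Counter keyed by IB index, number of cs_ioctl failures) pair.
    """
    ib_counts = Counter()
    cs_ioctl_failures = 0
    for line in lines:
        match = IB_SCHEDULE_RE.search(line)
        if match:
            ib_counts[int(match.group(1))] += 1
        elif CS_IOCTL_RE.search(line):
            cs_ioctl_failures += 1
    return ib_counts, cs_ioctl_failures
```

The function accepts any iterable of lines, so an open file object works directly, for example `summarize_radeon_errors(open(path, errors="replace"))`, where `path` is wherever your system keeps its syslog.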

tags: removed: needs-upstream-testing
summary: - Lucid: system freezes under heavy network+disk - mainline kernel 2.6.34-rc5 works
+ Lucid: system freezes randomly possibly due to bug on radeon drm module
Jeremy Foshee (jeremyfoshee) wrote :

Hi Daniel,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Daniel D. (cshoredaniel-deactivatedaccount) wrote :

Syslog before crash. I'm currently using the latest Ubuntu kernel (as of May 6). I tried the upstream kernel when I first reported. I will try the latest upstream kernel shortly, but I doubt it has been solved.

Also note that the hang does not always happen the same way. Sometimes it comes with the radeon error message and it is possible to switch VTs to initiate a restart, and sometimes the system is totally frozen, with no network or local console access possible.

Daniel D. (cshoredaniel-deactivatedaccount) wrote :

Here is a syslog of a crash with the mainline kernel. This crash was one in which I was unable to switch VTs.

tags: removed: needs-upstream-testing
Daniel D. (cshoredaniel-deactivatedaccount) wrote :

I have confirmed this is not faulty hardware. I have used a similar video card with the same result. (That is, I originally reported with an ATI Radeon X1900, thought it might be hardware, and so tried an ATI X1950.)

Also, turning off all desktop effects stops the bug from occurring (probably because the drm module isn't being used much as a result).

Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired