[natty] system freezes on boot without disabling KMS

Bug #735126 reported by LGB [Gábor Lénárt]
This bug affects 11 people
Affects                          Status        Importance  Assigned to  Milestone
xserver-xorg-driver-ati          Fix Released  High
linux (Ubuntu)                   Fix Released  High        Tim Gardner
  Natty                          Fix Released  High        Tim Gardner
xserver-xorg-video-ati (Ubuntu)  Invalid       Medium      Unassigned
  Natty                          Invalid       Medium      Unassigned

Bug Description

Binary package hint: xserver-xorg-video-ati

After upgrading from maverick (x86 32-bit system, but with the PAE kernel), the system does not boot: after some disk activity the upper half of the screen fills with garbage and there is no more disk activity. Booting an older kernel (from maverick) works. It also works with natty's kernel if I pass the radeon.modeset=0 parameter in grub, but then there is no hw accel (glxinfo reports software rendering; booting natty with maverick's kernel seems to have it, glxinfo then reports Gallium as the render string).

This is a Toshiba Satellite A300 notebook with 4G of RAM and video hw:

01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3400 Series

xserver-xorg-video-radeon 1:6.14.0-0ubuntu2

Linux orion 2.6.38-6-generic-pae #34-Ubuntu SMP Tue Mar 8 15:47:54 UTC 2011 i686 i686 i386 GNU/Linux

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-radeon 1:6.14.0-0ubuntu2
ProcVersionSignature: Ubuntu 2.6.38-6.34-generic-pae 2.6.38-rc7
Uname: Linux 2.6.38-6-generic-pae i686
Architecture: i386
CompizPlugins: [core,bailer,detection,composite,opengl,compiztoolbox,decor,regex,mousepoll,resize,wall,grid,move,animation,place,snap,session,imgpng,workarounds,vpswitch,gnomecompat,expo,ezoom,staticswitcher,fade,scale]
CompositorRunning: None
Date: Mon Mar 14 22:03:35 2011
DistUpgraded: Log time: 2011-03-13 16:50:20.128908
DistroCodename: natty
DistroVariant: ubuntu
DkmsStatus:
 vboxhost, 4.0.0, 2.6.35-24-generic-pae, i686: installed
 vboxhost, 4.0.0, 2.6.35-28-generic-pae, i686: installed
 vboxhost, 4.0.0, 2.6.35-25-generic-pae, i686: installed
 vboxhost, 4.0.0, 2.6.35-27-generic-pae, i686: installed
 vboxhost, 4.0.0, 2.6.35-26-generic-pae, i686: installed
GraphicsCard:
 ATI Technologies Inc Mobility Radeon HD 3400 Series [1002:95c4] (prog-if 00 [VGA controller])
   Subsystem: Toshiba America Info Systems Device [1179:ff1e]
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
MachineType: TOSHIBA Satellite A300
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-6-generic-pae root=UUID=14c446e2-e9e0-47a2-992e-f04d3f9b7bf3 ro quiet splash vt.handoff=7 radeon.modeset=0
Renderer: Software
SourcePackage: xserver-xorg-video-ati
UpgradeStatus: Upgraded to natty on 2011-03-13 (1 days ago)
dmi.bios.date: 03/20/2009
dmi.bios.vendor: INSYDE
dmi.bios.version: 1.90
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Portable PC
dmi.board.vendor: TOSHIBA
dmi.board.version: Base Board Version
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDE:bvr1.90:bd03/20/2009:svnTOSHIBA:pnSatelliteA300:pvrPSAGCE-0KC00FHU:rvnTOSHIBA:rnPortablePC:rvrBaseBoardVersion:cvnChassisManufacturer:ct10:cvrChassisVersion:
dmi.product.name: Satellite A300
dmi.product.version: PSAGCE-0KC00FHU
dmi.sys.vendor: TOSHIBA
version.compiz: compiz 1:0.9.4-0ubuntu4
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.1-0ubuntu1
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu11
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.0-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu5

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Unfortunately, I can't send a bug report from the exact environment where the bug occurs, since the system does not boot at all then. So this bug report was made with radeon.modeset=0 in use. Also, I couldn't find any usable information in the kernel or Xorg log files ...

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I also tried "safe mode": I can see the boot messages then with natty's kernel, but some seconds later there is only a black screen (imho when X itself starts ....).

Revision history for this message
Bryce Harrington (bryce) wrote :

If you plug in an ethernet cable and then boot, are you able to ssh in? If so, you can gather the log files that way. If not, then you may have a deeper kernel bug at work.

Without knowing the errors being encountered it's hard to guess what might be wrong; I have two -ati systems I regularly test natty on and neither freezes in this fashion, so I don't reproduce the problem myself.

Changed in xserver-xorg-video-ati (Ubuntu):
status: New → Incomplete
bugbot (bugbot)
tags: added: freeze
Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

No, and not even a MAC address can be found in the ARP table, so I guess networking is not up yet before the freeze, or so? There has since been a kernel update to 2.6.38-7 (instead of -6), but the problem is the same. Btw, it's worth mentioning that the magic sysrq keys seem to work (at least alt-sysrq-b reboots the system), but I can't see any message (because of the video mode on the console?), so it's not so useful. I've tried to sync, unmount and reboot with the corresponding magic sysrq combos, but after I boot with maverick's kernel there is still no sign of _any_ log message from natty's kernel, or any Xorg log at all, that I would be able to send you. So I have no idea how this problem can be reported with more info. Maybe network console can be useful, but I've only read about it and haven't ever tried to use it ... Is it worth trying?

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Ok, I've booted with the old (maverick's) kernel and hacked up a new script, run first in /etc/rcS.d, which modprobes the network console module, brings up the network and sets the console log level. Then I rebooted with natty's current kernel, with the usual effect: half of the screen is filled with garbage, no disk activity, etc. But I got some messages on my other machine via network console. At first the output did not seem to contain any problematic message (panic, etc). However, after some time there were some "blocked for more than ..." messages, apparently about commands used during boot (iptables-restore, since I had set up iptables rules to be loaded; NetworkManager; etc ...). I've attached the log file to this comment. Hope it helps.
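
[Editor's sketch] The script itself was not posted, so the following is only a minimal illustration of what such a netconsole setup could look like. The IP addresses, ports, MAC address and the eth0 interface name are all placeholders, and netconsole_param is a hypothetical helper. By default the script only prints the commands it would run; it touches the system only when APPLY=1 is set (which needs root).

```shell
#!/bin/sh
# Sketch only: every address, port and interface name below is a placeholder.
SRC_PORT=6665
TGT_PORT=6666

# Build the netconsole module parameter in the documented form:
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# $1 = source IP, $2 = network device, $3 = target IP, $4 = target MAC
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$SRC_PORT" "$1" "$2" "$TGT_PORT" "$3" "$4"
}

PARAM=$(netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55)

if [ "${APPLY:-0}" = "1" ]; then
    # The interface must be up and addressed before netconsole is loaded.
    ip addr add 192.168.1.10/24 dev eth0
    ip link set eth0 up
    dmesg -n 8                                  # send all kernel messages to the console
    modprobe netconsole netconsole="$PARAM"
else
    # Dry run: show what would be executed.
    echo "modprobe netconsole netconsole=$PARAM"
fi
```

On the receiving machine a UDP listener on the target port collects the messages, for example `nc -u -l 6666` (the exact flags differ between netcat variants).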

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Just adding "radeon.modeset=0" to natty's kernel parameters lets the system boot (into a usable state, with X etc, though of course not with the same 3D performance), and there are no messages like the previous ones seen with KMS not disabled. So for me it seems that only using or not using KMS triggers this kernel bug? If it's that ... I'm also attaching the kernel log I got now with "radeon.modeset=0" with natty's kernel (Linux orion 2.6.38-7-generic-pae #36-Ubuntu SMP Fri Mar 18 22:23:27 UTC 2011 i686 i686 i386 GNU/Linux). I captured this one too via network console on another machine, so it should be fair to compare it with the previous one.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I am sure I am not the right person to judge, but is it possible that it's not even an Xorg server related bug but some kind of kernel problem when KMS is enabled (the default)? I have no Xorg log at all, and I suspect that the system hasn't even reached the point of starting Xorg (but I am not sure). Just check the netconkern.log file that I've sent: processes used in the boot process are stuck, probably before Xorg has the chance to start ...

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Please tell me, if I can do more to debug the problem, I really would like to test natty on that notebook (and at our firm we have usually this Toshiba notebook for more people, though they are not so keen on doing anything extra for a working system, sadly). Thanks!

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I've also tried the generic, non-PAE kernel, also there was some update for natty, but the problem remains as it was.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I've changed the status from incomplete to new, since I gave some data which was asked.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Incomplete → New
Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Installing fglrx allows me to use natty's own kernel, but unity is unusable then, only black boxes can be seen instead of the "panel", the launcher etc. Classic session works.

Revision history for this message
Bryce Harrington (bryce) wrote :

Hi LGB, thanks for persisting with the additional testing, and figuring out how to work around the network issue.

Looking at netconkern.log, I agree with your assessment that the failure occurs well before X starts up. The log is a bit confusing as to what exactly is failing, however it appears that drm initializes (successfully apparently), at 19 sec, and then failures begin to occur when networking is set up:

[ 79.112037] ieee80211 phy0: Failed to initialize wep: -110
[ 243.716037] INFO: task jbd2/sda1-8:263 blocked for more than 120 seconds.

Then various parts of the kernel start failing with errors relating to I guess memory errors?

It is quite interesting that toggling KMS on or off changes the behavior. This makes me wonder if perhaps the issue is that the drm module is leaving the kernel in an inconsistent state, such that subsequent modules fail.

I've added a task for the kernel, since this appears to be a kernel drm bug of some sort.

Revision history for this message
Bryce Harrington (bryce) wrote :

Btw, I notice you had a virtualbox installation failure. You might want to try removing that from your system to eliminate it as a potential cause of this problem. (Since it carries its own X stuff, it can sometimes cause interferences, although it's probably innocent in this case.)

Revision history for this message
Bryce Harrington (bryce) wrote :

Gábor Lénárt - I've forwarded this bug upstream to http://bugs.freedesktop.org/show_bug.cgi?id=36007 - please subscribe yourself to this bug, in case they need further information or wish you to test something. Thanks ahead of time!

Changed in xserver-xorg-video-ati (Ubuntu):
status: New → Triaged
Revision history for this message
Bryce Harrington (bryce) wrote :

"Please tell me, if I can do more to debug the problem, I really would like to test natty on that notebook"

LGB, upstream asks, "The crash doesn't look related to radeon at all. Can you blacklist radeon, boot to runlevel 3 and then manually load it from the console?"

Can you give that a shot?

Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → Incomplete
Changed in xserver-xorg-driver-ati:
importance: Unknown → High
status: Unknown → Confirmed
Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Hi, thanks for the information, I've just added myself to the cc list on the freedesktop bug tracker. I will investigate, sorry for the delay, just I had no access to my notebook till now. I will try some things today.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Maybe I've missed something: I've tried to blacklist the radeon module, so I created /etc/modprobe.d/blacklist-radeon.conf with "blacklist radeon" in it, then I ran depmod -a and then dpkg-reconfigure linux-image-`uname -r` (I used the freshest natty kernel, booted with radeon.modeset=0 beforehand so that I could use the system). After reboot it still bugs. Does it signal a deeper problem, or should I blacklist the radeon module in a different way? Unfortunately I can't check the kernel output now, unlike before (with netconsole); I just get a black screen and no disk activity after a while during boot. Please tell me how I could do this better, if my solution was not correct. Thanks!
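
[Editor's sketch] For reference, a minimal sketch of the blacklist-then-load-manually test that upstream asked for. It writes to a scratch directory by default (CONF_DIR is an assumption of this sketch, as is the write_blacklist helper), so nothing on the real system changes unless CONF_DIR=/etc/modprobe.d is set and the script runs as root. The follow-up steps in the comments are a plausible sequence, not a confirmed fix for the failure described above.

```shell
#!/bin/sh
# Sketch: CONF_DIR defaults to a scratch directory; the real location is /etc/modprobe.d.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Write a blacklist entry for module $1.
# Note: "blacklist" only suppresses alias-based autoloading; an explicit
# "modprobe <module>" (by a script or by hand) still loads the module.
write_blacklist() {
    printf 'blacklist %s\n' "$1" > "$CONF_DIR/blacklist-$1.conf"
}

write_blacklist radeon
cat "$CONF_DIR/blacklist-radeon.conf"

# On the real system (as root), after writing to /etc/modprobe.d:
#   update-initramfs -u         # refresh the modprobe.d copy inside the initramfs
#   (reboot to a text console, e.g. with the "text" kernel parameter)
#   lsmod | grep radeon         # should print nothing if the blacklist took effect
#   modprobe radeon modeset=1   # load the module by hand and watch for the freeze
```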

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

MTRR settings:
reg00: base=0x0fffe0000 ( 4095MB), size= 128KB, count=1: write-protect
reg01: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg02: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg03: base=0x0bff00000 ( 3071MB), size= 1MB, count=1: uncachable
reg04: base=0x100000000 ( 4096MB), size= 1024MB, count=1: write-back
kernel: Linux orion 2.6.38-8-generic-pae #41-Ubuntu SMP Tue Apr 5 21:14:26 UTC 2011 i686 i686 i386 GNU/Linux
kernel cmdline: BOOT_IMAGE=/boot/vmlinuz-2.6.38-8-generic-pae root=UUID=14c446e2-e9e0-47a2-992e-f04d3f9b7bf3 ro radeon.modeset=0 text quiet splash vt.handoff=7
I also attached the output of dmesg.
I only comment this, since upstream suggests the inspection of mtrr settings.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

By the way, I've tried the radeon.blacklist=yes kernel parameter, as suggested by some google'd resources. Then radeon indeed did not load, and the system booted (checked with lsmod | grep radeon, no result). modprobe'ing radeon did not work then ("illegal parameter" about that radeon.blacklist). When I used insmod with the full path of radeon.ko, the screen went black immediately, the same result as in my initial bug report.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I forgot to mention: I've totally removed the remaining virtualbox stuff, also dkms, etc. Still, there is no change in the issue. I've also tried to limit the memory available to the kernel with the mem=512M kernel parameter. I've also tried "nopat", which I found in some forum, but still the same problem.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I've also tried an odd thing: using the extra kernel parameters "mem=512M nopat radeon.blacklist=yes text", I do get a working text console (it works even without mem/nopat). Then I removed the MTRR entries one by one (echo "disable=X" >| /proc/mtrr, X=0 ...). The system became slow like hell, as never seen before, but I guess that's natural after this "action". Then I tried to insert radeon.ko with insmod by giving its full path, but the result is still the very same :( :( Sorry for writing too much, tell me if I don't need to try and comment on every idea of mine. Anyway, I'm waiting for some suggestion now ...

Revision history for this message
PromoGest (m3nt0r3) wrote :

same problem here with debian sid with some experimental ( mesa, dri, ati ), on 2.6.38.2 liquorix. If i boot normally i have no video, no tty but i have ssh remote login but if i try to restart gdm or manually start X it freezes everything.
thanks Gábor.

Revision history for this message
PromoGest (m3nt0r3) wrote :

if i try to switch to another tty it freeze eth. i am with mesa driver now, fglrx works too

Revision history for this message
PromoGest (m3nt0r3) wrote :

workaround: edit /etc/modprobe.d/radeon-kms.conf and setting radeon.modeset=0 , no acceleration but it starts
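
[Editor's sketch] Inside a modprobe.d file the setting is written as an "options" line rather than in the "radeon.modeset=0" form used on the kernel command line. A minimal sketch follows; the file name is taken from the comment above, while CONF_DIR is an assumption of this sketch and defaults to a scratch directory so the real /etc/modprobe.d is untouched.

```shell
#!/bin/sh
# Sketch: use CONF_DIR=/etc/modprobe.d (as root) to apply this for real.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/radeon-kms.conf" <<'EOF'
# Disable kernel mode setting for the radeon module.
# Kernel command line equivalent: radeon.modeset=0
options radeon modeset=0
EOF

cat "$CONF_DIR/radeon-kms.conf"
```

If the radeon module is loaded from the initramfs, the initramfs probably needs to be regenerated (update-initramfs -u on Debian and Ubuntu) before the option takes effect at boot.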

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Yes, that disables kms. I am still wondering about this "kms" thing: if mode setting is not done in the kernel but by the X server or so, what's the deal? Why does that make acceleration disappear? I would be happy to say bye-bye to KMS (and the boot logo) and let X change the video mode (instead of the kernel), but I want acceleration, which should not be related to setting up the video mode. Or am I missing something :) Surely, the best would be to have the kernel do it all at once etc, but if that does not work for me, I can live without it; however, it seems it does not work that way [no acceleration] :(

Revision history for this message
PromoGest (m3nt0r3) wrote :

and i have no audio, no acceleration and my sound card disappeared

Revision history for this message
PromoGest (m3nt0r3) wrote :

i added radeon in /etc/modules and now it starts with modeset=1. it seems to have glx enabled but no 3D running desktop. The strange thing is that i have no sound card detected now.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I've tried what Alex suggested on freedesktop's bugtracker (please take a look there), so it really seems it has nothing to do with X itself: the freeze happens just from loading the module, without X involved.

Revision history for this message
PromoGest (m3nt0r3) wrote :

i tried too and for me it works just for 50%, i have no black screen but no 3D ( gnome-shell starts in "classic" mode ) and still no sound card ( i have to modprobe manually ) . System is usable but it gets slower after some hours ...

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

I have started to try different versions of vanilla kernels please take a look at freedesktop.org bugtracker

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Bisect process is - finally - done, so I have the first bad commit from Linus' GIT between 2.6.37 and 2.6.38 which seems to introduce the problem.
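
[Editor's sketch] The exact bisect session is recorded on the freedesktop.org bug, not here. As an illustration of the technique only, the following toy script builds a throwaway repository with five commits, plants a "regression" in the fourth, and lets git bisect run find it. The repository, file names and the grep check are all invented for the demonstration.

```shell
#!/bin/sh
# Toy demonstration of "git bisect run" in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"

# Five commits; the "regression" (state=bad) first appears in commit 4.
for i in 1 2 3 4 5; do
    if [ "$i" -ge 4 ]; then echo bad > state; else echo good > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
done

# Bad revision first, then the known-good one.
git bisect start HEAD HEAD~4 >/dev/null
# The test command exits 0 for a good revision and non-zero for a bad one.
git bisect run sh -c 'grep -q good state' >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad: $first_bad"

# For a boot freeze the check cannot easily be scripted, so in a kernel tree
# the session is driven by hand, roughly:
#   git bisect start v2.6.38 v2.6.37
#   (build, install and boot the candidate kernel)
#   git bisect good     # or: git bisect bad
```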

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

It seems the bug has been identified, and there is also a short patch (at the fdo bugtracker) which cures the problem. Today I will try to apply that patch to ubuntu's kernel source tree to see whether it really fixes the freeze.

Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

According to my tests, the suggested commit really fixes the problem, so I would suggest to include that patch within natty's kernel too:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=patch;h=97ea530f6fac1f9632b0c4792a2a56411454adbe

Changed in xserver-xorg-driver-ati:
status: Confirmed → Invalid
Revision history for this message
Bryce Harrington (bryce) wrote :

Looks like you did some solid work bisecting the issue down and identifying the patch that caused the regression. Since it points to a kernel patch, the kernel team should take it from here.

Changed in xserver-xorg-video-ati (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
Revision history for this message
Bryce Harrington (bryce) wrote :

[Original upstream bug was https://bugs.freedesktop.org/show_bug.cgi?id=36007 and has the bisection search information.]

Changed in xserver-xorg-driver-ati:
importance: High → Unknown
status: Invalid → Unknown
Changed in linux (Ubuntu):
assignee: nobody → Jeremy Foshee (jeremyfoshee)
importance: Undecided → High
milestone: none → ubuntu-11.04
status: New → Triaged
Revision history for this message
Bryce Harrington (bryce) wrote :

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=97ea530f6fac1f9632b0c4792a2a56411454adbe

Upstream kernel patch to fix this issue.

JFo, please add to the kernel team's queue.

Changed in xserver-xorg-video-ati (Ubuntu):
status: Triaged → In Progress
Changed in xserver-xorg-driver-ati:
importance: Unknown → High
status: Unknown → Confirmed
tags: added: kernel-key
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

I have this on our list now, thanks Bryce. Andy has said he will review it tomorrow. It should make it to us through stable updates, but I have asked for a review ahead of that to be sure it will make it in time.

~JFo

Changed in linux (Ubuntu):
assignee: Jeremy Foshee (jeremyfoshee) → nobody
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Flip-flopped the milestone as it looks like this is queued to enter the kernel prior to release, so it should remain for the natty milestone.

Apologies for any confusion.

~JFo

Changed in linux (Ubuntu Natty):
milestone: ubuntu-11.04 → natty-updates
milestone: natty-updates → ubuntu-11.04
Revision history for this message
Martin Pitt (pitti) wrote :

Closing the userspace driver task, as this needs to be fixed on the kernel side.

Regarding the milestone, I thought the kernel was now frozen for natty? (Not that I would object to getting this fixed prior to release, as this will presumably also affect the live system, and thus block installation).

Changed in xserver-xorg-video-ati (Ubuntu Natty):
importance: High → Medium
status: In Progress → Invalid
Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

And even if it does not block the installation for some reason, the installed system would not boot because of the freeze on boot with KMS enabled. I only tried this with a maverick->natty upgrade, but I guess it does not matter at all whether it's an upgraded system, a live system, etc: with kernel 2.6.38 it will freeze on loading the radeon module, since afaik KMS is enabled by default in all cases, upstream only fixed this problem later, and the fix is not in natty's kernel yet. So I guess it affects every possible usage/installation/upgrade method of natty on hardware where this is an issue; according to bug reports (including mine) that is at least Toshiba Satellite notebooks ....

Revision history for this message
Tim Gardner (timg-tpi) wrote :

The patches for this bug are included in 2.6.38.3, but unfortunately will not make it into the initial release kernel. You can load 2.6.38.3 from https://launchpad.net/~kernel-ppa/+archive/pre-proposed.

Changed in linux (Ubuntu Natty):
assignee: nobody → Tim Gardner (timg-tpi)
status: Triaged → Fix Committed
milestone: ubuntu-11.04 → natty-updates
Revision history for this message
LGB [Gábor Lénárt] (lgb) wrote :

Not exactly a great surprise but I would like to note that indeed, using that PPA fixes my problem and works fine. Thanks.

Changed in xserver-xorg-driver-ati:
status: Confirmed → Fix Released
Revision history for this message
ndstate (ndstate) wrote :

Would anyone be able to verify that this fix will also work for the issues reported at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/763909 ?

Thanks

Revision history for this message
Julian Wiedmann (jwiedmann) wrote :

For Natty, this should be fixed now with 2.6.38-10.46.
Please reopen the Natty task if you still experience this issue.

Changed in linux (Ubuntu Natty):
status: Fix Committed → Fix Released
Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Patch has been applied and uploaded for Oneiric as well. Marking the actively developed linux task from Fix Committed to Fix Released. Thanks.

ubuntu-oneiric$ git describe --contains 97ea530f6fac1f9632b0c4792a2a56411454adbe
v2.6.39-rc2~8^2~6

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released