fglrx kernel module crashes system hard during hardy to intrepid upgrade

Bug #278963 reported by to be removed
10
Affects                                    Status        Importance  Assigned to  Milestone
fglrx-installer (Ubuntu)                   Fix Released  Critical    Unassigned
  Intrepid                                 Fix Released  Critical    Unassigned
linux-restricted-modules-2.6.24 (Ubuntu)   Invalid       Undecided   Unassigned
  Intrepid                                 Invalid       Undecided   Unassigned

Bug Description

Binary package hint: linux-restricted-modules-2.6.24-21-generic

The fglrx kernel module seems to crash my desktop machine during an upgrade from hardy to intrepid.

The machine: amd64 (in 64-bit mode), with an ATI graphics card. I use the free radeon X driver, not fglrx. However, the restricted modules are installed and restricted is in sources.list since that's the default.

Steps to reproduce:
- start with up-to-date hardy system
- update-manager -d -c (or do-release-upgrade or apt-get dist-upgrade)
- this starts the upgrade to intrepid
- at some point, the fglrx kernel module gets loaded into the kernel
- a little after that the system crashes, hard, not even ping works anymore, and the graphics screen (showing gdm) gets garbled
- fglrx was not loaded originally into the kernel
- X does not seem to be restarted during the upgrade
- system has not booted during the upgrade
- the root filesystem is so corrupted that the system no longer boots, but fsck (run from another machine) repairs it somewhat
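
One way to confirm the "fglrx was not loaded originally" and "gets loaded at some point" observations is to inspect /proc/modules before and during the upgrade. The following is only a minimal sketch: the function names `loaded_modules` and `is_module_loaded` are made up for illustration, and it assumes the usual /proc/modules layout, where the module name is the first whitespace-separated field on each line.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Assumption: each non-empty line starts with the module name,
    followed by size, use count, dependencies, state and address.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """Return True if the named module appears in the given text."""
    return name in loaded_modules(proc_modules_text)
```

On a running system this could be called as `is_module_loaded("fglrx", open("/proc/modules").read())`, for example from a loop that logs the result every few seconds while the upgrade runs.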

If I remove the restricted kernel module packages, and remove restricted from sources.list, the upgrade works fine.
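
The sources.list part of this workaround can be sketched as a small text filter. This is an illustration only, not what the reporter actually ran: the helper name `drop_component` is hypothetical, and it assumes the one-line `deb URI suite component...` format. Lines that would be left with no components are commented out rather than deleted.

```python
def drop_component(sources_text, component="restricted"):
    """Return sources.list text with one component removed from deb lines."""
    out = []
    for line in sources_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        # Leave blank lines, comments and unknown line types untouched.
        if (not stripped or stripped.startswith("#")
                or fields[0] not in ("deb", "deb-src")):
            out.append(line)
            continue
        # Skip an optional "[option=value ...]" group after the type.
        idx = 1
        if len(fields) > 1 and fields[1].startswith("["):
            while idx < len(fields) and not fields[idx].endswith("]"):
                idx += 1
            idx += 1
        # fields[idx] is the URI, fields[idx + 1] the suite, the rest components.
        components = fields[idx + 2:]
        if component not in components:
            out.append(line)
            continue
        kept = [c for c in components if c != component]
        if kept:
            out.append(" ".join(fields[:idx + 2] + kept))
        else:
            # Nothing left to fetch from this line, so disable it.
            out.append("# " + line)
    trailing = "\n" if sources_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

The result would still need to be written back to /etc/apt/sources.list as root, followed by an `apt-get update`, and the restricted module packages removed separately.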

As far as I can determine, the version of fglrx that gets loaded is the hardy one, not the one in intrepid, based on the date that fglrx outputs to syslog (attached). I will also attach dpkg.log, xorg.conf, Xorg.0.log and Xorg.0.log.old from the crashed machine.

My guess is that something in the upgrade triggers fglrx module loading. This may or may not be related to switching to DKMS in the intrepid version.

I can, if necessary, re-run the upgrade. The system is installed onto a USB memory stick, and I have dd copies of the stick from before the upgrade, and from the (first) crashed system. However, the upgrade takes around 12 hours of wall-clock time (the USB stick is slow), so it is inconvenient for me to do that, but if it is necessary, I will. I can, of course, easily look at or provide any files you may need to debug, from either the pre-installed or the crashed system.

Revision history for this message
to be removed (liw) wrote :
Bryce Harrington (bryce)
Changed in fglrx-installer:
importance: Undecided → Critical
status: New → Confirmed
Revision history for this message
Bryce Harrington (bryce) wrote :

There is a new working fglrx uploaded today for fglrx-installer. It might be of interest to re-try the upgrade at this point, to see if the old fglrx that caused the problem gets purged and replaced now, and thus prevents the issue. So now would be a good time to re-test this bug.

Revision history for this message
Alberto Milone (albertomilone) wrote :

Lars:
I know that it will take a lot of time but I would be glad if you could follow these steps from your backup:
1) upgrade only the linux-restricted-modules manually
2) run the dist-upgrade from Update Manager and see if you can reproduce the problem

I believe that doing so will remove any fglrx.ko and nvidia.ko from your system (since we don't ship with these precompiled modules any longer) thus solving the problem. If I'm right then we could hack Update Manager into upgrading the linux-restricted-modules before any other package.
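
To check whether step 1 really removed the precompiled modules, one could list any fglrx.ko or nvidia.ko files still present under /lib/modules. The sketch below is an assumption-laden illustration: `find_prebuilt_modules` is a hypothetical helper, and a hit is not necessarily a leftover, since a DKMS-built module in intrepid would typically also live somewhere under /lib/modules.

```python
import os


def find_prebuilt_modules(modules_root="/lib/modules",
                          names=("fglrx.ko", "nvidia.ko")):
    """Return sorted paths of files under modules_root whose name is in names."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(modules_root):
        for filename in filenames:
            if filename in names:
                hits.append(os.path.join(dirpath, filename))
    return sorted(hits)
```

Running `find_prebuilt_modules()` after upgrading linux-restricted-modules and before the dist-upgrade should return an empty list if the theory above holds.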

Let me know (and thanks a lot for your time in advance)

Revision history for this message
Brian Watson (vertexoflife) wrote :

I can confirm this bug still persists when using just the above steps (linux-restricted-modules upgrade, then Update Manager). However, this might be related to my problems with xserver-xorg-input-all.

Revision history for this message
to be removed (liw) wrote :

I've now done an upgrade following Alberto's instructions and did it successfully. I will next do it again without upgrading l-r-m manually.

I actually had to install l-r-m on the gytha system before upgrading. I only had a version-specific package of it installed.

Revision history for this message
Alberto Milone (albertomilone) wrote :

Brian: did you have the fglrx driver installed or do you have evidence that the fglrx driver was loaded during the upgrade?

Lars: thanks for the update. Let me know how the next dist-upgrade goes

Revision history for this message
Brian Watson (vertexoflife) wrote :

I did have the fglrx driver installed.

Revision history for this message
Alberto Milone (albertomilone) wrote :

Brian: ok, then it's not the same problem as the one that Lars reported since he wasn't using fglrx and the module got loaded somehow.

Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

I tried to reproduce this by upgrading from Hardy (updated) to Intrepid on a system with Radeon Xpress 200. The upgrade completed without issue.

fglrx was not installed before the upgrade. I installed it after the upgrade and it also worked fine.

Next I'll try installing fglrx _before_ the upgrade.

Revision history for this message
to be removed (liw) wrote :

An upgrade from an up-to-date hardy to intrepid on the system that used to crash now succeeds. fglrx does not get loaded during the upgrade, or after.

Revision history for this message
Steve Langasek (vorlon) wrote :

marking as invalid for lrm-2.6.24 in intrepid since this package is not present in that release; if this is an lrm-2.6.24 bug, a hardy nomination is needed.

Changed in linux-restricted-modules-2.6.24:
status: New → Invalid
Revision history for this message
Henrik Nilsen Omma (henrik) wrote :

My second upgrade test, this time with fglrx installed before the upgrade, also went fine. Compiz was also running.

Sorry if this is noise :)

Revision history for this message
Alberto Milone (albertomilone) wrote :

Lars: this is good news. At least we have a workaround now and we can use it in Update Manager.

Henrik: thanks for your feedback. I have yet to figure out what can modprobe fglrx at random.

Steve: right.

Brian: can you try the dist-upgrade again (so as to reproduce the problem) and attach your /var/log/Xorg.0.log and your /var/log/Xorg.0.log.old?

Revision history for this message
Brian Watson (vertexoflife) wrote :

Alberto: My problems have been traced to a problem with another package and a corruption in one of my files. False alarm, sorry. Dist-upgrades probably won't work for me, because there is a strange xorg-input-all bug going on, but I will try a CD installation later in the week.

Revision history for this message
Bryce Harrington (bryce) wrote :

Okay, based on everyone's feedback it appears this issue is a thing of the past, so I'm closing the bug as fixed. Thanks everyone for testing!

Changed in fglrx-installer:
status: Confirmed → Fix Released