Release upgrade fails due to unsupported DKMS modules

Bug #2020406 reported by Juerg Haefliger
This bug affects 1 person
Affects                           Status     Importance  Assigned to  Milestone
ubuntu-release-upgrader (Ubuntu)  Triaged    Medium      Unassigned
Lunar                             Won't Fix  Medium      Unassigned

Bug Description

The kernel team takes care that any DKMS package supported by the new release actually compiles and works with the new kernel. We also make sure that DKMS packages that are obsoleted in the new release are updated in the current release such that they are ignored on upgrade.

However, users can have old and/or unsupported DKMS packages installed from sources other than the Ubuntu archive, which can be problematic. If kernel header packages are installed, the dkms post-install hook tries to compile all installed and enabled DKMS modules. On upgrade, header packages for the new kernel are installed and the dkms hook is invoked. If any of the installed and enabled DKMS modules fails to build for the new kernel, the header package installation fails and ultimately the whole release upgrade fails.

To get around this problem, we propose to disable all DKMS modules before attempting the upgrade and re-enable them one by one afterwards and notify the user if any of them failed to build. Care must be taken that when disabling the DKMS modules, the initrds are *not* rebuilt (otherwise the disabled DKMS modules might be removed from the initrds) so that the user can fall back to a previous kernel/initrd should the new kernel not work for them.

ATM, there are 46 bugs logged for the 'linux' package because of DKMS build failures on upgrade to Lunar: https://bugs.launchpad.net/ubuntu/+source/linux/+bugs?field.tag=lunar-upgrade-dkms-failure&field.omit_dupes.used=&field.status%3Alist=NEW&field.status%3Alist=INVALID&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE

Dimitri John Ledkov (xnox) wrote :

The proposal is similar to how we deal with PPAs on upgrades. At the start of the upgrade we disable PPAs, tell the user that we are doing so, and do not re-enable them after the install is completed.

We should do the same with dkms: disable dkms autoinstall, and at the end of the upgrade re-enable autoinstall without attempting to configure/build/enable any dkms packages.

In practical terms it means, at the start of the upgrade we should `touch /etc/dkms/no-autoinstall` and at the end of the upgrade `rm -f /etc/dkms/no-autoinstall`.
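
A minimal sketch of the touch/remove pattern described above, written in Python. The function name `dkms_autoinstall_disabled` and the `flag` parameter are hypothetical and are not part of ubuntu-release-upgrader; only the path `/etc/dkms/no-autoinstall` comes from the comment.

```python
import contextlib
from pathlib import Path

# Path taken from the comment above. It is a parameter below so that the
# sketch can be tried against a scratch directory instead of /etc/dkms.
NO_AUTOINSTALL = Path("/etc/dkms/no-autoinstall")


@contextlib.contextmanager
def dkms_autoinstall_disabled(flag: Path = NO_AUTOINSTALL):
    """Create the flag file on entry and remove it on exit (hypothetical helper)."""
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.touch()  # same effect as `touch <flag>`
    try:
        yield flag
    finally:
        # Same effect as `rm -f <flag>`: no error if the file is already gone.
        flag.unlink(missing_ok=True)
```

Wrapping the upgrade in `with dkms_autoinstall_disabled(): ...` would remove the flag even if the upgrade raises. Note that, like `rm -f`, this also removes a flag file that the user had created before the upgrade; preserving it would need an extra check.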

Nick Rosbrook (enr0n) wrote :

It sounds like you each have different proposals here - is that right? AIUI, Dimitri is suggesting that we just disable dkms autoinstall altogether during the upgrade, without trying to rebuild DKMS modules at the end. OTOH, Juerg suggests trying to rebuild them one at a time at the very end of the upgrade.

Changed in ubuntu-release-upgrader (Ubuntu):
status: New → Confirmed
tags: added: foundations-todo
Juerg Haefliger (juergh) wrote :

I think we should try to re-enable them afterwards. Some of them might be legit and supported.

Nick Rosbrook (enr0n)
Changed in ubuntu-release-upgrader (Ubuntu):
status: Confirmed → Triaged
assignee: nobody → Nick Rosbrook (enr0n)
importance: Undecided → Medium
Changed in ubuntu-release-upgrader (Ubuntu Lunar):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Nick Rosbrook (enr0n)
Juerg Haefliger (juergh) wrote :

So we basically have two scenarios that we need to cover:
1) upgrade of a dkms package
2) upgrade of a kernel

For #1: We are supposed to fix all dkms packages in the archive so that each one either a) compiles properly for the new kernel or b) is flagged as unsupported so that dkms build failures are ignored. If a dkms package upgrade results in a build failure, that's a legit failure unless it fails for an unsupported kernel.

If a dkms package is upgraded, first the current dkms module build is completely removed and later the upgraded dkms is rebuilt for all installed kernels (unless it was inactive for some installed kernels or /etc/dkms/no-autoinstall is present).

For #2: A kernel upgrade results in a rebuild of all installed (and active) dkms modules which can fail if they are unsupported or are old Ubuntu packages that haven't been purged. IMO this should not result in a kernel upgrade failure but it might be too complicated to detect this condition.

If a new kernel is installed, dkms autoinstall is called which rebuilds all installed dkms modules for the new kernel (unless /etc/dkms/no-autoinstall is present).

What kills us currently is #2 in combination with old and/or unsupported DKMS packages. Using the /etc/dkms/no-autoinstall mechanism to work around that is problematic as well because it skips the rebuild of upgraded dkms modules and we end up with an empty output of 'dkms status', i.e., no dkms builds at all.

ATM it's unclear to me if there's an ordering issue as well in case both the kernel and some dkms modules are upgraded (which upgrade happens first?). I think what we want is to always rebuild upgraded dkms modules (so we can't use no-autoinstall) but somehow not fail a kernel upgrade in case of a resulting dkms build failure. Maybe downgrade that condition to a warning rather than an upgrade failure? It really is a dkms issue, not a kernel issue...
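
One way to picture "a warning rather than an upgrade failure" is to build each module separately and collect the failures instead of aborting on the first one. The Python sketch below is an illustration only: `rebuild_modules` and its `run` parameter are hypothetical names, and it assumes the `dkms install -m <module> -v <version> -k <kernel>` form of the dkms command line.

```python
import subprocess


def rebuild_modules(modules, kernel, run=subprocess.run):
    """Try to install each (name, version) for `kernel`; return the failures.

    `run` defaults to subprocess.run and can be swapped for a stub, which is
    an assumption made here so the logic can be exercised without dkms.
    """
    failed = []
    for name, version in modules:
        cmd = ["dkms", "install", "-m", name, "-v", version, "-k", kernel]
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Record the failure and carry on with the remaining modules.
            failed.append((name, version))
    return failed
```

The caller could then print the returned list as warnings at the end of the upgrade while still reporting the kernel installation itself as successful.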

Dimitri John Ledkov (xnox) wrote :

In ubuntu-release-upgrader there are quirks available to purge packages on dist-upgrade. When dkms modules are removed from the archive, or marked as broken, we should add them to the release upgrader as quirks so that they are purged from the system before the upgrade starts.

That's definitely something that we can do today and improve the situation.

Juerg Haefliger (juergh) wrote :

We have bug 2028366 that 'fixes' this by ignoring DKMS build failures during release upgrades. I have to check, but it might be as easy as running 'dkms status' after the release upgrade has finished and notifying the user about DKMS modules that are not built for the new kernel.
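
A rough Python sketch of that check, which takes the text printed by 'dkms status' and lists the modules with no "installed" entry for a given kernel. The function names are hypothetical. The line formats are assumptions based on commonly seen output (`name/version, kernel, arch: state` in newer dkms, `name, version, kernel, arch: state` in older versions) and may not cover every dkms release.

```python
def parse_dkms_status(text):
    """Return (name, version, kernel, state) tuples from `dkms status` text."""
    entries = []
    for line in text.splitlines():
        head, sep, state = line.partition(": ")
        if not sep:
            continue  # blank line or a line without a state
        fields = [f.strip() for f in head.split(",")]
        if "/" in fields[0]:          # assumed newer form: name/version, ...
            name, version = fields[0].split("/", 1)
            rest = fields[1:]
        elif len(fields) >= 2:        # assumed older form: name, version, ...
            name, version = fields[0], fields[1]
            rest = fields[2:]
        else:
            continue
        kernel = rest[0] if rest else None  # "added" entries have no kernel
        words = state.split()
        entries.append((name, version, kernel, words[0] if words else ""))
    return entries


def missing_for_kernel(text, kernel):
    """List (name, version) pairs that are not installed for `kernel`."""
    entries = parse_dkms_status(text)
    known = {(n, v) for n, v, _, _ in entries}
    built = {(n, v) for n, v, k, s in entries if k == kernel and s == "installed"}
    return sorted(known - built)
```

In use, the upgrader would capture the command's output (for example with `subprocess.run(["dkms", "status"], capture_output=True, text=True).stdout`) and pass it in together with the new kernel's version string.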

Nick Rosbrook (enr0n)
tags: removed: foundations-todo
Brian Murray (brian-murray) wrote :

Ubuntu 23.04 (Lunar Lobster) has reached end of life, so this bug will not be fixed for that specific release.

Changed in ubuntu-release-upgrader (Ubuntu Lunar):
status: Triaged → Won't Fix
Nick Rosbrook (enr0n)
Changed in ubuntu-release-upgrader (Ubuntu):
assignee: Nick Rosbrook (enr0n) → nobody
Changed in ubuntu-release-upgrader (Ubuntu Lunar):
assignee: Nick Rosbrook (enr0n) → nobody