nvidia-graphics-drivers deferred DKMS feature does not work in xenial

Bug #1732206 reported by David Coronel on 2017-11-14
This bug affects 1 person
Affects                                Importance  Assigned to
HWE Next                               Undecided   Unassigned
nvidia-graphics-drivers-384 (Ubuntu)   High        Alberto Milone
  Xenial                               High        Alberto Milone
  Zesty                                High        Alberto Milone
  Artful                               High        Alberto Milone

Bug Description

SRU Request:

[Impact]
Deferred module build was a feature that worked with Upstart and that was never updated to work with systemd.

[Test Case]

1) Enter the following command to create the temporary file:
sudo touch /tmp/do_not_build_dkms_module

2) Enable the -proposed repository and install the new "nvidia-384" package.

3) Check that DKMS did not build the module (the output of "dkms status" will show this).

4) Restart your computer and see if the module is built correctly (you should be able to access the desktop session). Check that the module is there with the "dkms status" command.
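
The "dkms status" checks in the steps above can be scripted. This is a minimal sketch, assuming each status line starts with "<module>, <version>" as in the output quoted in this report; the function name dkms_has_module is hypothetical, and it reads the status text from stdin so it can be tried without DKMS installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the "dkms status" text on stdin
# contains a line for the given module name.
# Assumes each line starts with "<module>, <version>".
dkms_has_module() {
    grep -q "^$1, "
}

# Usage on a real system (not run here):
#   dkms status | dkms_has_module nvidia-384 && echo "module is registered"
```

Note that a matching line only shows that DKMS knows about the module; the state at the end of the line (for example "added" or "installed") still has to be read separately.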

[Regression Potential]
Low: the feature currently doesn't work at all with systemd.

_____________
The nvidia-graphics-drivers (https://github.com/tseliot/nvidia-graphics-drivers/tree/375-xenial) deferred DKMS feature does not work in xenial.

With upstart, touching the /tmp/do_not_build_dkms_module file before installing the nvidia driver packages tells them to defer building the kernel modules. The installation then creates an upstart script to build the modules later. Upon reboot, the upstart script does the DKMS build.

This no longer works in xenial. The code was never updated for the transition from upstart to systemd.
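
The intended behaviour amounts to a check on the flag file at install time. The following is only an illustration of that decision, not the package's actual postinst code; the function name choose_dkms_action and its optional argument are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: decide between an immediate and a deferred DKMS build.
# The default flag path comes from this report; the function itself is
# hypothetical and just prints the decision instead of acting on it.
choose_dkms_action() {
    flag_file="${1:-/tmp/do_not_build_dkms_module}"
    if [ -e "$flag_file" ]; then
        echo "defer"   # a boot-time job should build the module later
    else
        echo "build"   # build the module during package installation
    fi
}
```

Calling choose_dkms_action with no argument checks the real flag path; passing a path makes it easy to try on a scratch file.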

Changed in nvidia-graphics-drivers-384 (Ubuntu):
importance: Undecided → High
status: New → In Progress
Changed in nvidia-graphics-drivers-384 (Ubuntu Xenial):
status: New → In Progress
Changed in nvidia-graphics-drivers-384 (Ubuntu Zesty):
status: New → In Progress
Changed in nvidia-graphics-drivers-384 (Ubuntu Artful):
status: New → In Progress
Changed in nvidia-graphics-drivers-384 (Ubuntu Xenial):
importance: Undecided → High
Changed in nvidia-graphics-drivers-384 (Ubuntu Zesty):
importance: Undecided → High
Changed in nvidia-graphics-drivers-384 (Ubuntu Artful):
importance: Undecided → High
Changed in nvidia-graphics-drivers-384 (Ubuntu Xenial):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-384 (Ubuntu Zesty):
assignee: nobody → Alberto Milone (albertomilone)
Changed in nvidia-graphics-drivers-384 (Ubuntu Artful):
assignee: nobody → Alberto Milone (albertomilone)
tags: added: argos originate-from-1698183
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nvidia-graphics-drivers-384 - 384.90-0ubuntu5

---------------
nvidia-graphics-drivers-384 (384.90-0ubuntu5) bionic; urgency=medium

  * debian/templates/nvidia-graphics-drivers.postinst.in:
    - Port deferred DKMS build to systemd (LP: #1732206).

 -- Alberto Milone <email address hidden> Wed, 15 Nov 2017 10:36:06 +0100

Changed in nvidia-graphics-drivers-384 (Ubuntu):
status: In Progress → Fix Released
description: updated
description: updated

Hello David, or anyone else affected,

Accepted nvidia-graphics-drivers-384 into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-384/384.90-0ubuntu3.17.10.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nvidia-graphics-drivers-384 (Ubuntu Artful):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-artful
Changed in nvidia-graphics-drivers-384 (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed-zesty
Łukasz Zemczak (sil2100) wrote :

Hello David, or anyone else affected,

Accepted nvidia-graphics-drivers-384 into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-384/384.90-0ubuntu0.17.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nvidia-graphics-drivers-384 (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
Łukasz Zemczak (sil2100) wrote :

Hello David, or anyone else affected,

Accepted nvidia-graphics-drivers-384 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-384/384.90-0ubuntu0.16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

David Coronel (davecore) wrote :

I tested this in a VM using the default xenial cloud image (i.e. "uvt-kvm create xenialoem release=xenial").

With the current nvidia-384 384.90-0ubuntu0.16.04.2 package, the nvidia dkms module doesn't get built after the reboot:

$ sudo apt update
$ sudo touch /tmp/do_not_build_dkms_module
$ sudo apt install nvidia-384
$ dkms status
bbswitch, 0.8, 4.4.0-103-generic, x86_64: installed
$ sudo reboot
$ dkms status
bbswitch, 0.8, 4.4.0-103-generic, x86_64: installed

With the new nvidia-384 384.90-0ubuntu0.16.04.3 package, the nvidia dkms module DOES get built after the reboot:

$ sudo apt update
$ sudo touch /tmp/do_not_build_dkms_module
$ sudo vi /etc/apt/sources.list # adding xenial-proposed
$ sudo apt update
$ sudo apt install nvidia-384
$ dkms status
bbswitch, 0.8, 4.4.0-103-generic, x86_64: installed
$ sudo reboot
$ dkms status
bbswitch, 0.8, 4.4.0-103-generic, x86_64: installed
nvidia-384, 384.90: added

ubuntu@xenialoem:~$ grep -i dkms /var/log/syslog
Dec 15 20:39:00 xenialoem sh[1057]: Creating symlink /var/lib/dkms/nvidia-384/384.90/source ->
Dec 15 20:39:00 xenialoem sh[1057]: DKMS: add completed.
Dec 15 20:39:00 xenialoem sh[1057]: * dkms: running auto installation service for kernel 4.4.0-103-generic
Dec 15 20:40:29 xenialoem sh[1057]: DKMS: build completed.
Dec 15 20:40:29 xenialoem sh[1057]: - Installing to /lib/modules/4.4.0-103-generic/updates/dkms/
Dec 15 20:40:29 xenialoem sh[1057]: - Installing to /lib/modules/4.4.0-103-generic/updates/dkms/
Dec 15 20:40:29 xenialoem sh[1057]: - Installing to /lib/modules/4.4.0-103-generic/updates/dkms/
Dec 15 20:40:29 xenialoem sh[1057]: - Installing to /lib/modules/4.4.0-103-generic/updates/dkms/
Dec 15 20:40:32 xenialoem sh[1057]: DKMS: install completed.

ubuntu@xenialoem:~$ last | grep boot
reboot system boot 4.4.0-103-generi Fri Dec 15 20:38 still running
reboot system boot 4.4.0-103-generi Fri Dec 15 20:12 still running

But I have heard about a use case where a user might not be running the display-manager.service or oem-config.service on which this deferred DKMS build systemd service depends. I am waiting to get more details.

David Coronel (davecore) wrote :

My last comment #5 still stands. My tests show that this fix works for the cloud image of Ubuntu, but there are use cases where users need the nvidia drivers for GPU processing but are not running a display manager on the machine. In such a case, the display-manager.service systemd service will not exist and the deferred DKMS build will not happen, because its Before and WantedBy attributes are both set to "display-manager.service oem-config.service" and nothing else pulls the unit in at boot.

Here are the steps to see this (using the xenial cloud image):

$ sudo apt update
$ sudo touch /tmp/do_not_build_dkms_module
$ sudo vi /etc/apt/sources.list # adding xenial-proposed
$ sudo apt update
$ sudo apt install nvidia-384

ubuntu@xenialoem4:~$ cat /lib/systemd/system/nvidia-384.service
# Warning: This file is autogenerated by nvidia-384. All changes to this file will be lost.

[Unit]
Description=Detect the available GPUs and deal with any system changes
Before=display-manager.service oem-config.service

[Service]
Type=oneshot
ExecStart=/bin/sh -ec ' /usr/sbin/dkms add -m nvidia-384 -v 384.90; /usr/lib/dkms/dkms_autoinstaller start || ( rm -f /lib/systemd/system/nvidia-384.service && exit 1 ); /sbin/modprobe nvidia-384 || true ; /bin/systemctl disable nvidia-384.service || true ; rm -f /lib/systemd/system/nvidia-384.service'

[Install]
WantedBy=display-manager.service oem-config.service

==============

The oem-config service doesn't exist on the machine in that use case. The oem-config portion ran in a previous boot and the user now wants to install the nvidia driver. There is no display manager on that machine (i.e. no gdm or lightdm) so the nvidia-384.service will never run.
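
A quick way to spot this situation is to compare the unit's WantedBy= entries with the units that actually exist. A minimal sketch, assuming a single unit directory is enough for the check (a real system has several unit search paths); the function name missing_wantedby_units is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every unit named in the WantedBy= line of a
# unit file that is not present in the given unit directory.
# Only one directory is checked, which is a simplification.
missing_wantedby_units() {
    unit_file="$1"
    unit_dir="$2"
    # Extract the value after "WantedBy=" and split it on whitespace.
    for wanted in $(sed -n 's/^WantedBy=//p' "$unit_file"); do
        [ -e "$unit_dir/$wanted" ] || echo "$wanted"
    done
}

# Usage on a real system (not run here; the directory is an assumption):
#   missing_wantedby_units /lib/systemd/system/nvidia-384.service /etc/systemd/system
```

If every WantedBy= unit is reported as missing, nothing will start the deferred build at boot.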

I think this nvidia-384.service needs to be bound to a different service that will always run at the next boot.
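
One possible direction, not necessarily what the package maintainers will choose, is to hook the unit into multi-user.target, which is reached on both graphical and headless boots. The sketch below writes such a unit to a path of your choice; the function name write_deferred_unit, the Description text and the simplified ExecStart are assumptions for illustration, and only the dkms_autoinstaller path is taken from the unit quoted above.

```shell
#!/bin/sh
# Sketch only: write an illustrative unit file that is wanted by
# multi-user.target instead of display-manager.service.
# Before= is kept so the build still finishes ahead of a display
# manager when one exists; ordering on an absent unit has no effect.
write_deferred_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Deferred DKMS build (illustrative sketch)
Before=display-manager.service oem-config.service

[Service]
Type=oneshot
ExecStart=/usr/lib/dkms/dkms_autoinstaller start

[Install]
WantedBy=multi-user.target
EOF
}
```

Writing to a scratch path first, for example write_deferred_unit /tmp/example.service, makes it easy to review the result before placing anything under /lib/systemd/system.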

David Coronel (davecore) 38 minutes ago
tags: added: verification-failed verification-failed-xenial
removed: verification-needed verification-needed-xenial