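Based on a discussion with ~albertomilone, powering down the NVIDIA GPU while keeping the modules loaded is the way to go long-term, as opposed to blacklisting the modules.
The power management feature is described here (requires Turing GPUs and above):
http://us.download.nvidia.com/XFree86/Linux-x86_64/440.44/README/dynamicpowermanagement.html
My GPU is pre-Turing (Pascal, 1060m); however, powering off is not where the problem is.
Running `prime-select intel` creates /lib/udev/rules.d/80-pm-nvidia.rules, which contains the following line to unbind an NVIDIA GPU device from its driver:
https://github.com/tseliot/nvidia-prime/blob/cf757cc9585dfc032930379fc81effb3a3d59606/prime-select#L164-L165
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", ATTR{remove}="1"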
If I comment it out, I can boot just fine with my iGPU after running `prime-select intel`. The resulting 80-pm-nvidia.rules file looks like this: https://paste.ubuntu.com/p/HX6t9y8BPg/
Commenting out only the power management lines while leaving the unbinding in place results in the same issue (80-pm-nvidia.rules: https://paste.ubuntu.com/p/mTdXbZZk8H/).
The unbinding operation hangs, which results in something like this before any attempt is made to start X11 or gdm3:
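To tell quickly which of the two variants is installed, the rules file can be checked for an uncommented remove rule. The following is only a minimal sketch: the function name `has_active_remove_rule` is hypothetical and not part of nvidia-prime, and it assumes the rule still contains the literal text ATTR{remove}="1".

```shell
#!/bin/sh
# Hypothetical helper (not part of nvidia-prime): succeeds if the given rules
# file still contains an uncommented rule that sets ATTR{remove}="1".
# The default path is the one created by `prime-select intel`.
has_active_remove_rule() {
    rules_file="${1:-/lib/udev/rules.d/80-pm-nvidia.rules}"
    # Drop lines whose first non-blank character is '#', then look for the
    # literal assignment (-F: fixed string, so braces are not special).
    grep -v '^[[:space:]]*#' "$rules_file" 2>/dev/null \
        | grep -qF 'ATTR{remove}="1"'
}

# Example use:
#   has_active_remove_rule && echo "remove rule is active"
```

A missing file is treated the same as a file without the rule, so the check also fails quietly when `prime-select intel` has not been run.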
[ 15.683190] nvidia-uvm: Loaded the UVM driver, major device number 511.
[ 15.824882] NVRM: Attempting to remove minor device 0 with non-zero usage count!
[ 15.824903] ------------[ cut here ]------------
[ 15.825082] WARNING: CPU: 0 PID: 759 at /var/lib/dkms/nvidia/440.59/build/nvidia/nv-pci.c:577 nv_pci_remove+0x338/0x360 [nvidia]
# ...
[ 15.825330] ---[ end trace 353e142c2126a8a0 ]---
# ...
[ 242.649248] INFO: task nvidia-persiste:1876 blocked for more than 120 seconds.
[ 242.649931] Tainted: P W O 5.4.0-12-generic #15-Ubuntu
[ 242.650618] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.651319] nvidia-persiste D 0 1876 1 0x00000004
Eventually it fails with a timeout:
systemd[1]: nvidia-persistenced.service: start operation timed out. Terminating.
systemd[1]: nvidia-persistenced.service: Failed with result 'timeout'.
systemd[1]: Failed to start NVIDIA Persistence Daemon.
Masking nvidia-persistenced via `sudo systemctl mask nvidia-persistenced` and rebooting shows that systemd-udevd and rmmod hang as well:
Feb 9 17:18:43 blade systemd-udevd[717]: 0000:01:00.0: Worker [756] processing SEQNUM=4430 is taking a long time
Feb 9 17:18:43 blade systemd-udevd[717]: 0000:01:00.1: Worker [746] processing SEQNUM=4440 is taking a long time
Feb 9 17:20:43 blade systemd-udevd[717]: 0000:01:00.1: Worker [746] processing SEQNUM=4440 killed
Feb 9 17:20:43 blade systemd-udevd[717]: 0000:01:00.0: Worker [756] processing SEQNUM=4430 killed
Feb 9 17:21:31 blade kernel: [ 242.818665] INFO: task systemd-udevd:746 blocked for more than 120 seconds.
Feb 9 17:21:31 blade kernel: [ 242.819381] Tainted: P W O 5.4.0-12-generic #15-Ubuntu
Feb 9 17:21:31 blade kernel: [ 242.820075] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 17:21:31 blade kernel: [ 242.820797] systemd-udevd D 0 746 717 0x00000324
# ...
Feb 9 17:21:31 blade kernel: [ 242.823033] rmmod D 0 1939 1937 0x00004000
Feb 9 17:21:31 blade kernel: [ 242.823034] Call Trace:
# ...
Feb 9 17:21:31 blade kernel: [ 242.823783] nvkms_close_gpu+0x50/0x80 [nvidia_modeset]
Feb 9 17:21:31 blade kernel: [ 242.823793] _nv002598kms+0x14d/0x170 [nvidia_modeset]
# ...
Feb 9 17:21:31 blade kernel: [ 242.823893] ? nv_linux_drm_exit+0x9/0x768 [nvidia_drm]
Feb 9 17:21:31 blade kernel: [ 242.823897] ? __x64_sys_delete_module+0x147/0x290
# ...
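For anyone comparing their own logs against the ones above, the hung-task reports can be reduced to a short list of task names and PIDs. This is a rough sketch: `list_blocked_tasks` is a hypothetical helper, and it assumes the kernel message has the same "INFO: task NAME:PID blocked for more than N seconds." shape as in the logs quoted here.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print "name pid"
# once for each task reported by the hung-task detector.
list_blocked_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than [0-9][0-9]* seconds\..*/\1 \2/p' \
        | sort -u
}

# Example use (either source of kernel messages should work):
#   dmesg | list_blocked_tasks
#   journalctl -k -b | list_blocked_tasks
```

Task names are truncated by the kernel, so nvidia-persistenced shows up as nvidia-persiste in the output, just as in the logs above.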