[nvidia] GDM no longer presents GUI login Ubuntu 18.04

Bug #1777378 reported by William S Gregory on 2018-06-18
This bug affects 1 person
Affects: nvidia-graphics-drivers-390 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Background:

Did a clean install of Ubuntu 18.04 with the standard GNOME desktop, using the NVIDIA proprietary drivers from the Ubuntu repo. It worked quite well for over two weeks. When booting up on Friday 15 June 2018, there was no longer a GUI login screen, just a few lines on tty1. I can still log in on other terminals and through SSH.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: gdm3 3.28.2-0ubuntu1.2
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Uname: Linux 4.15.0-23-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
Date: Sun Jun 17 22:53:22 2018
InstallationDate: Installed on 2018-05-20 (28 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: gdm3
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.gdm3.custom.conf: 2018-05-31T00:51:08.204556
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2018-05-20 (28 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
NonfreeKernelModules: nvidia_modeset nvidia
Package: gdm3 3.28.2-0ubuntu1.2
PackageArchitecture: amd64
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Tags: bionic
Uname: Linux 4.15.0-23-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
mtime.conffile..etc.gdm3.custom.conf: 2018-06-18T00:27:55.234654
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2018-05-20 (28 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
NonfreeKernelModules: nvidia_modeset nvidia
Package: mutter
PackageArchitecture: amd64
ProcEnviron:
 LANG=en_US.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Tags: bionic
Uname: Linux 4.15.0-23-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
mtime.conffile..etc.gdm3.custom.conf: 2018-06-18T00:27:55.234654

William S Gregory (0c-bill) wrote :
Daniel van Vugt (vanvugt) wrote :

Great. Next please attach output from these commands (run on the affected machine):
  dmesg
and
  journalctl -b

You can save their output by running:
  dmesg > dmesg.txt
  journalctl -b > journal.txt
and then copy the files to/from another machine via 'scp'.

Please also let us know if the machine has any files in /var/crash/ and what they are.

Changed in gdm3 (Ubuntu):
status: New → Incomplete
Daniel van Vugt (vanvugt) wrote :

Please also run this command on the affected machine:

  apport-collect 1777378

William S Gregory (0c-bill) wrote :

Daniel, here is the output of dmesg:

Daniel van Vugt (vanvugt) wrote :

Also, your change to /etc/gdm3/custom.conf is likely to make it incompatible with the Nvidia driver:

  WaylandEnable=true

Which would explain the lack of GUI.

Please try undoing that change and reboot.
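
A quick way to see which WaylandEnable value is actually active is sketched below. It assumes the usual /etc/gdm3/custom.conf location mentioned above; the helper name wayland_setting is made up for illustration. Commented-out lines are ignored, so empty output means the key is not set.

```shell
#!/bin/sh
# Print the active (uncommented) WaylandEnable line from a GDM config file.
# Prints nothing if the key is absent or commented out.
# wayland_setting is a hypothetical helper name, not part of GDM.
wayland_setting() {
    conf="$1"
    grep -E '^[[:space:]]*WaylandEnable[[:space:]]*=' "$conf" 2>/dev/null \
        | tail -n 1 \
        | sed -E 's/^[[:space:]]*//; s/[[:space:]]*=[[:space:]]*/=/'
}

# Check the file discussed in this bug.
wayland_setting /etc/gdm3/custom.conf
```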

William S Gregory (0c-bill) wrote :

Daniel here is the output of journalctl.

William S Gregory (0c-bill) wrote :

Daniel, I am not sure what to do with apport-collect 1777378 from #3, since it is waiting in vain for a web browser to open. This won't happen since I have no GUI desktop at present.

I did try the fix from #5 two different ways:

1> Commented out "WaylandEnable=true" with a leading #.
   -This yielded the same problem. A boot to a seemingly hung tty1. Can still use other terminals and SSH normally.

2> Changed line in question to read "WaylandEnable=false".
   -This boots to no usable terminals. Have to SSH in to do anything.

Daniel van Vugt (vanvugt) wrote :

If you've been given a web URL to open you can do that part on a different machine. Alternatively maybe try this command:

  apport-cli --update-bug=1777378

apport information

tags: added: apport-collected
description: updated

apport information

Another thought is that there was a gdm3 update on 12 June that might have taken a couple of days to reach you. If that has caused the problem then it would be a good idea to try downloading and reinstalling the prior version's files from here:

  https://launchpad.net/ubuntu/+source/gdm3/3.28.0-0ubuntu1/+build/14516225

Download the .deb files from there, then install them manually by running:

  dpkg -i *.deb

Daniel van Vugt (vanvugt) wrote :

Please also attach the machine's /var/log/apt/history.log
which will show what changed and when.
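
If the full log is long, the entries for a single day can be pulled out first. This is only a sketch: it assumes the usual blank-line-separated "Start-Date: ..." blocks that apt writes, and history_on_date is a made-up helper name.

```shell
#!/bin/sh
# Print the apt history entries whose Start-Date falls on a given day.
# history_on_date is a hypothetical helper; the default path is the log
# named in the comment above.
history_on_date() {
    date="$1"
    log="${2:-/var/log/apt/history.log}"
    # RS="" puts awk in paragraph mode: one record per blank-line-separated block.
    awk -v d="$date" 'BEGIN { RS = ""; ORS = "\n\n" }
                      index($0, "Start-Date: " d) > 0' "$log"
}

# Example: show what apt did on the day the problem began.
if [ -r /var/log/apt/history.log ]; then
    history_on_date 2018-06-15
fi
```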

Changed in mutter (Ubuntu):
status: New → Incomplete
William S Gregory (0c-bill) wrote :

Here is the history.log for #13

William S Gregory (0c-bill) wrote :

Manually installed files specified in #12. There was one missing package, "libglib2.0-dev". Installed it, then ran "dpkg -i *.deb" again. Rebooted. No apparent change on first reboot. No apparent change after second reboot.

Note: The debug, or ddeb files, were not installed.

description: updated

apport information

apport information

Noticed previously that the status of the GDM3 service on my working GNOME (Ubuntu 16.04) desktop is a bit different from that on the bugged machine. Attached is a series of systemctl status outputs before and after stopping and starting the GDM3 service. Note that I did not run any other systemctl command against GDM3 since the last reboot for this output.

There seem to be differences in opening and closing sessions involving PAM. My working system starts a PAM session and leaves it running. On the non-working system it appears that GDM ends the session shortly after opening it, and some child processes die right away as well.

Daniel van Vugt (vanvugt) wrote :

Thanks.

Comment #14 indeed shows you ran into a lot of trouble on 2018-06-15 and tried a lot of things to fix the nvidia drivers. The only thing is that the nvidia drivers have never changed at all since their release in April:

  https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-390

So Nvidia updates can't be the issue.

Something that did happen just before the problem began was a kernel upgrade:

  https://launchpad.net/ubuntu/+source/linux/4.15.0-23.25

So that will have necessitated a rebuild of the nvidia kernel driver, which might have failed and would explain the issue. So...

  1. Can you find any log or text files under /var/log/dkms ?
  2. Can you please run 'lspci -k' on the machine and send us the output?

William S Gregory (0c-bill) wrote :

The output of "lspci -k" is attached.

Unfortunately, there doesn't appear to be anything related to dkms in /var/log .

Daniel van Vugt (vanvugt) wrote :

Sorry, I can't remember the directory exactly. It should be under something like that. Maybe in /var/dkms/ ... ?

William S Gregory (0c-bill) wrote :

There seem to be several places to look. I will have to continue later as I have run out of time for now. But maybe the attached file will help narrow down the location of any useful files related to dkms.

Daniel van Vugt (vanvugt) wrote :

Yes, please send /var/lib/dkms/nvidia/390.67/4.15.0-23-generic/x86_64/log/make.log

Also a copy of /proc/cmdline
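
For anyone unsure where the DKMS build logs live, a search like the following should list them. It is a sketch that assumes the logs are named make.log under /var/lib/dkms, as in the path above; find_dkms_logs is a made-up helper name.

```shell
#!/bin/sh
# List every DKMS build log under a DKMS tree (default: /var/lib/dkms).
# find_dkms_logs is a hypothetical helper name.
find_dkms_logs() {
    root="${1:-/var/lib/dkms}"
    find "$root" -type f -name make.log 2>/dev/null | sort
}

find_dkms_logs

# The kernel command line of the running system.
if [ -r /proc/cmdline ]; then
    cat /proc/cmdline
fi
```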

Daniel van Vugt (vanvugt) wrote :

However maybe the make.log won't help. Because you appear to have a working kernel driver.

I can also see this failure, which is important:

Jun 17 21:09:53 little-black-box gnome-shell[1367]: Unable to initialize Clutter: Unable to initialize the Clutter backend: no available drivers found.
Jun 17 21:09:53 little-black-box gnome-shell[1367]: Unable to initialize Clutter.

which would/should be related to:

Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): Creating default Display subsection in Screen section
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: "Default Screen Section" for depth/fbbpp 24/32
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Depth 24, (==) framebuffer bpp 32
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): RGB weight 888
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Default visual is TrueColor
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) Applying OutputClass "nvidia" options to /dev/dri/card0
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (**) NVIDIA(0): Enabling 2D acceleration
Jun 17 21:09:53 little-black-box systemd[1]: Starting NVIDIA Persistence Daemon...
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): NVIDIA GPU GeForce GTX 770 (GK104) at PCI:1:0:0 (GPU-0)
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (--) NVIDIA(0): Memory: 4194304 kBytes
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (--) NVIDIA(0): VideoBIOS: 80.04.c3.00.72
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): Detected PCI Express Link width: 16X
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): Validated MetaModes:
Jun 17 21:09:53 little-black-box nvidia-persistenced[1439]: Verbose syslog connection opened
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): "NULL"
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (WW) NVIDIA(0): Unable to get display device for DPI computation.
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default

Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (II) NVIDIA(0): Setting mode "NULL"
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Disabling shared memory pixmaps
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Backing store enabled
Jun 17 21:09:53 little-black-box /usr/lib/gdm3/gdm-x-session[1434]: (==) NVIDIA(0): Silken mouse enabled
Jun 17 21:09:53 little-bla...


tags: added: nvidia
summary: - GDM no longer presents GUI login Ubuntu 18.04
+ [nvidia] GDM no longer presents GUI login Ubuntu 18.04
Daniel van Vugt (vanvugt) wrote :

I wonder if the kernel upgrade on 2018-06-13 broke your kernel command line, which would prevent the nvidia driver from working...

Please add "nomodeset" to the lines:

  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
  GRUB_CMDLINE_LINUX=""

of /etc/default/grub and then run:

  sudo update-grub

and reboot.
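
As a sketch of what the edited lines might look like, the helper below appends a parameter inside the quotes of both GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX and prints the result to stdout instead of editing the file in place, so it can be reviewed first. The function name add_kernel_param is made up for illustration, and appending to both lines is one reading of the instruction above.

```shell
#!/bin/sh
# Print a copy of a grub defaults file with PARAM appended to both
# GRUB_CMDLINE_LINUX_DEFAULT and GRUB_CMDLINE_LINUX.
# add_kernel_param is a hypothetical helper; it does not modify the file.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"
    sed -E \
        -e "s/^(GRUB_CMDLINE_LINUX(_DEFAULT)?=\")(.*)\"/\1\3 ${param}\"/" \
        -e "s/^(GRUB_CMDLINE_LINUX(_DEFAULT)?=\") /\1/" \
        "$file"
}

# Preview the change as a diff; nothing is written.
if [ -r /etc/default/grub ]; then
    add_kernel_param nomodeset | diff /etc/default/grub - || true
fi
```

With the two lines quoted above as input, the output would contain GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset" and GRUB_CMDLINE_LINUX="nomodeset". Once the real file has been edited, 'sudo update-grub' applies it as described.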

William S Gregory (0c-bill) wrote :

Answer to #24- make.log is attached.

/proc/cmdline is -

BOOT_IMAGE=/vmlinuz-4.15.0-23-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=1

William S Gregory (0c-bill) wrote :

Added nomodeset to the lines specified in #26. Ran "update-grub". Rebooted. Unfortunately, still no GDM login screen.
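
To confirm after the reboot that the parameter really reached the kernel, /proc/cmdline can be checked for an exact match. A minimal sketch follows; has_kernel_param is a made-up helper name.

```shell
#!/bin/sh
# Succeed if PARAM appears as a whole word on the kernel command line.
# has_kernel_param is a hypothetical helper; the second argument lets it
# read a saved copy instead of /proc/cmdline.
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    tr -s ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

if has_kernel_param nomodeset; then
    echo "nomodeset is active"
else
    echo "nomodeset is NOT on the kernel command line"
fi
```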

Daniel van Vugt (vanvugt) wrote :

Please let us know if the machine has any files in /var/crash/ and what they are.

Please also copy /var/lib/whoopsie/whoopsie-id from the affected machine and paste its contents in place of ID in https://errors.ubuntu.com/user/ID on another machine where you can browse the web. If that page shows some links then please share them with us here.
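
To avoid copy/paste mistakes with the long ID, the URL can be built on the affected machine and then opened elsewhere. This is a sketch only: whoopsie_url is a made-up helper name, and reading the ID file may need sudo depending on its permissions.

```shell
#!/bin/sh
# Build the errors.ubuntu.com URL from the whoopsie ID file.
# whoopsie_url is a hypothetical helper name.
whoopsie_url() {
    id_file="${1:-/var/lib/whoopsie/whoopsie-id}"
    # Strip the trailing newline and any stray whitespace from the ID.
    id=$(tr -d '[:space:]' < "$id_file") || return 1
    printf 'https://errors.ubuntu.com/user/%s\n' "$id"
}

if [ -r /var/lib/whoopsie/whoopsie-id ]; then
    whoopsie_url
fi
```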

William S Gregory (0c-bill) wrote :

In reference to #29. Attached is the contents of /var/crash.

William S Gregory (0c-bill) wrote :

In reference to #29 regarding whoopsie:

The ID is:
19b6b20813c4f5b05681f9895a0eea0b33c549b42c2877003a447f1dc2e2d71a1c819b59ac63fb1977d412a9b4c0adbc72585129337191e89a05d246c7bf0bb0

The links are:
https://errors.ubuntu.com/oops/d69e209a-70f2-11e8-b67b-fa163ef911dc
https://errors.ubuntu.com/oops/af8ea9d2-70ef-11e8-9d0b-fa163e192766

Daniel van Vugt (vanvugt) wrote :

It appears your crash reports are not being sent because the nvidia driver you're using is not the official Ubuntu one. Please remove your nvidia driver and install the official one with:

  sudo apt install nvidia-driver-390

When done, reboot. Assuming that doesn't fix the problem, please check again for new crash files. If you find any, please DON'T send them to us but instead run this command to do it:

  ubuntu-bug YOURFILE.crash

and tell us the new bug ID created.

William S Gregory (0c-bill) wrote :

As per #32:
1 Purged all nvidia packages.
2 apt autoremoved, autocleaned.
3 Removed ppa:graphics-drivers (added for a previous attempt to fix this bug)
4 Apt update
5 Apt Installed nvidia-driver-390
6 Rebooted machine

Still no graphical login. Also, strangely, no new or updated files in /var/crash.

Daniel van Vugt (vanvugt) wrote :

Thanks for doing all that.

If there are no crash files then I wonder if everything is actually running despite not being visible?

1. From ssh, please run:
     ps auxw | grep gdm > gdms.txt
   and send us the output.

2. Please also send the output of:
     dpkg -l > dpkgl.txt

3. How many monitors (and what resolution) are connected?

4. Now you've replaced the nvidia driver please again run:
     journalctl -b > newjournal.txt
   and send the result.

William S Gregory (0c-bill) wrote :

For 1 from #34.

William S Gregory (0c-bill) wrote :

For 2 from #34.

William S Gregory (0c-bill) wrote :

The machine in question has a single monitor with a resolution of 1920x1080.

And newjournal.txt for 4 from #34.

Daniel van Vugt (vanvugt) wrote :

It appears the login screen processes are all running fine (gdms.txt).

The only remaining problem I can see is that the nvidia driver can't find your monitor:

Jun 19 21:46:18 little-black-box /usr/lib/gdm3/gdm-x-session[1441]: (II) NVIDIA(0): Validated MetaModes:
Jun 19 21:46:18 little-black-box /usr/lib/gdm3/gdm-x-session[1441]: (II) NVIDIA(0): "NULL"
Jun 19 21:46:18 little-black-box /usr/lib/gdm3/gdm-x-session[1441]: (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
Jun 19 21:46:18 little-black-box /usr/lib/gdm3/gdm-x-session[1441]: (WW) NVIDIA(0): Unable to get display device for DPI computation.
Jun 19 21:46:18 little-black-box /usr/lib/gdm3/gdm-x-session[1441]: (==) NVIDIA(0): DPI set to (75, 75); computed from built-in default
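
The same symptom can be searched for in a saved journal. The sketch below counts log lines where the nvidia X driver reports a "NULL" MetaMode, as in the excerpt above; null_metamode_lines is a made-up helper name and the pattern is based only on the lines quoted in this bug.

```shell
#!/bin/sh
# Count journal lines where the nvidia X driver lists "NULL" as a MetaMode,
# i.e. the line is exactly  NVIDIA(n): "NULL"  after the log prefix.
# null_metamode_lines is a hypothetical helper name.
null_metamode_lines() {
    grep -cE 'NVIDIA\([0-9]+\):[[:space:]]*"NULL"' "$1"
}

# Example, using the file name requested earlier in this bug.
if [ -r newjournal.txt ]; then
    null_metamode_lines newjournal.txt
fi
```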

Please attach a copy of this file if you can find one of them:
  /etc/X11/xorg.conf
  /etc/xorg.conf
  /usr/etc/X11/xorg.conf
  /usr/lib/X11/xorg.conf
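
The four locations above can be checked in one go with something like the following sketch. find_xorg_conf is a made-up helper name; the optional argument is a directory prefix, useful only for trying it against a copy of the filesystem.

```shell
#!/bin/sh
# Print whichever of the candidate xorg.conf paths exist.
# find_xorg_conf is a hypothetical helper name.
find_xorg_conf() {
    root="${1:-}"
    for p in /etc/X11/xorg.conf /etc/xorg.conf \
             /usr/etc/X11/xorg.conf /usr/lib/X11/xorg.conf; do
        if [ -f "$root$p" ]; then
            printf '%s\n' "$root$p"
        fi
    done
    return 0
}

find_xorg_conf
```

No output means none of the four files exists.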

Please also try unplugging/replugging the monitor, or replacing the monitor cable. And if you have a different monitor then please also try that.

William S Gregory (0c-bill) wrote :

Just did a locate for all xorg.conf:

sudo locate -i xorg.conf
/usr/share/X11/xorg.conf.d
/usr/share/X11/xorg.conf.d/10-amdgpu.conf
/usr/share/X11/xorg.conf.d/10-nvidia.conf
/usr/share/X11/xorg.conf.d/10-quirks.conf
/usr/share/X11/xorg.conf.d/10-radeon.conf
/usr/share/X11/xorg.conf.d/40-libinput.conf
/usr/share/X11/xorg.conf.d/70-wacom.conf
/usr/share/doc/xserver-xorg-video-intel/xorg.conf
/usr/share/man/man5/xorg.conf.5.gz
/usr/share/man/man5/xorg.conf.d.5.gz

The only file named xorg.conf is attached.

Daniel van Vugt (vanvugt) wrote :

Thanks for searching.

It appears this is now a problem in the nvidia driver only, if not a hardware/cable problem. So reassigning...

affects: gdm3 (Ubuntu) → nvidia-graphics-drivers-390 (Ubuntu)
no longer affects: mutter (Ubuntu)
Daniel van Vugt (vanvugt) wrote :

Other than the aforementioned hardware experimentation, you might also want to try downgrading the kernel. Older kernels are likely still installed on your machine so all you need is to get to the grub menu (which can be difficult) and choose an older kernel at boot time.

See: https://askubuntu.com/questions/16042/how-to-get-to-the-grub-menu-at-boot-time
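
To see which kernels are available to choose from before rebooting, the images in /boot can be listed. A sketch, assuming the usual vmlinuz-<version> naming; list_kernels is a made-up helper name and 'sort -V' assumes GNU coreutils.

```shell
#!/bin/sh
# List installed kernel versions by looking at vmlinuz-* files in /boot.
# list_kernels is a hypothetical helper name.
list_kernels() {
    boot="${1:-/boot}"
    for f in "$boot"/vmlinuz-*; do
        [ -e "$f" ] || continue
        basename "$f" | sed 's/^vmlinuz-//'
    done | sort -V
}

list_kernels
```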

Daniel van Vugt (vanvugt) wrote :

Please also attach: /usr/share/X11/xorg.conf.d/10-nvidia.conf
in case it is relevant.

Daniel van Vugt (vanvugt) wrote :

Your logs also show you have "nvidia-modeset" being loaded still. So, now that you have installed the official driver, maybe also try undoing the changes from comment #26.

William S Gregory (0c-bill) wrote :

For #42, the contents of 10-nvidia.conf:

Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    ModulePath "/usr/lib/x86_64-linux-gnu/nvidia/xorg"
EndSection

William S Gregory (0c-bill) wrote :

For #41 and #43: tried rebooting into the previous kernel, 4.15.0-22, both with "nomodeset" and without it.

With "nomodeset" the GDM login presents and I can log in! But it doesn't appear to be using the actual nvidia driver... The only resolution available is 1024x768 (not the native resolution of the monitor), and nvidia settings does not show any of the usual tabs/functions.

Without "nomodeset" the GDM does not display.

William S Gregory (0c-bill) wrote :

I believe I have found the culprit. I'm not happy to say that it looks like it was a configuration change I made to try to fix the ubiquitous screen-tearing and frame sync problems that some applications have. I *believed* I had reverted all the changes I made when I saw they hadn't fixed the issue, but apparently I missed one. Oddly, the machine had booted up into a proper GDM session several times since that attempt. In any case, I went back over my research, double-checked anything that we hadn't already covered, and found an article about adding a zz-nvidia-modeset.conf file in the modprobe.d folder. This had one working line in it:

"options nvidia_drm modeset=1"

Once I commented that line out and updated the initramfs, the machine was able to make it all the way into the full GDM session again.
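
For anyone hitting the same thing, a leftover option like this can be found with a search along these lines. It is a sketch: find_drm_modeset_opts is a made-up helper name, /etc/modprobe.d is assumed to be the folder in question, and the /sys path is only present while the nvidia_drm module is loaded.

```shell
#!/bin/sh
# Find active (uncommented) "options nvidia_drm ... modeset=" lines
# in a modprobe.d directory. find_drm_modeset_opts is a hypothetical helper.
find_drm_modeset_opts() {
    dir="${1:-/etc/modprobe.d}"
    grep -rnE '^[[:space:]]*options[[:space:]]+nvidia[-_]drm[[:space:]].*modeset=' \
        "$dir" 2>/dev/null
}

find_drm_modeset_opts

# Show the value the loaded module is actually using, if available.
if [ -r /sys/module/nvidia_drm/parameters/modeset ]; then
    cat /sys/module/nvidia_drm/parameters/modeset
fi

# After commenting the line out, the initramfs is typically rebuilt with
#   sudo update-initramfs -u
# (assumed here to be the command used; the comment above does not name it).
```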

I am very sorry to have wasted your time on this problem, which was obviously due to an oversight on my part.

Changed in nvidia-graphics-drivers-390 (Ubuntu):
status: Incomplete → Invalid
Daniel van Vugt (vanvugt) wrote :

No problem. It's not uncommon for bugs to end like this.

tags: added: nvidia-drm.modeset