Xorg does not detect displays in rootless mode on nvidia proprietary drivers (GNOME)

Bug #1672033 reported by Igor on 2017-03-11
This bug affects 2 people
Affects                  Status  Importance  Assigned to  Milestone
NVIDIA Drivers Ubuntu            Undecided   Unassigned
mutter (Ubuntu)                  Undecided   Unassigned
xorg (Ubuntu)                    Undecided   Unassigned

Bug Description

There are several bug reports (LP: #1559576, LP: #1632322 and also LP: #1666664) where GDM does not start on the proprietary nvidia drivers. As it turned out, the reason was that Xorg starts in rootless mode and apparently does not initialize everything properly, which causes gnome-shell/libmutter to crash.

Installing xserver-xorg-legacy partially fixed those issues.

However, enabling modesetting for the nvidia driver still causes the problem.

Here are some excerpts from the log:
Xorg startup:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (--) Log file renamed from "/var/lib/gdm3/.local/share/xorg/Xorg.pid-2027.log" to "/var/lib/gdm3/.local/share/xorg/Xorg.0.log"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: X.Org X Server 1.18.4
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Release Date: 2016-07-19
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: X Protocol Version 11, Revision 0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Build Operating System: Linux 4.4.0-53-generic x86_64 Ubuntu

glx loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "glx"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/x86_64-linux-gnu/xorg/extra-modules/libglx.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module glx: vendor="NVIDIA Corporation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 4.0.2, module version = 1.0.0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Server Extension
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) NVIDIA GLX Module 375.39 Tue Jan 31 19:37:12 PST 2017

nvidia loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "nvidia"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/x86_64-linux-gnu/xorg/extra-modules/nvidia_drv.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module nvidia: vendor="NVIDIA Corporation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 4.0.2, module version = 1.0.0
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Video Driver

modesetting loaded:
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) LoadModule: "modesetting"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Loading /usr/lib/xorg/modules/drivers/modesetting_drv.so
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Module modesetting: vendor="X.Org Foundation"
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: compiled for 1.18.4, module version = 1.18.4
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: Module class: X.Org Video Driver
Mär 11 00:43:18 arvlin /usr/lib/gdm3/gdm-x-session[2025]: ABI class: X.Org Video Driver, version 20.0

gnome-shell fails to run:
Mär 11 00:43:20 arvlin kernel: gnome-shell[2067]: segfault at 28 ip 00007fedba8da7c4 sp 00007ffd2fb5f5a0 error 4 in libmutter-0.so.0.0.0[7fedba893000+12f000]

xorg stops:
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:66
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:67
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:64
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) UnloadModule: "libinput"
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) systemd-logind: releasing fd for 13:65
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) NVIDIA(GPU-0): Deleting GPU-0
Mär 11 00:43:21 arvlin /usr/lib/gdm3/gdm-x-session[2025]: (II) Server terminated successfully (0). Closing log file.
Mär 11 00:43:21 arvlin gdm-launch-environment][2009]: pam_unix(gdm-launch-environment:session): session closed for user gdm

Jeremy Bicha (jbicha) on 2017-03-11
summary: XOrg does not work in rootless mode on nvidia proprietary drivers
+ (GNOME)
Igor (invy) on 2017-03-11
summary: - XOrg does not work in rootless mode on nvidia proprietary drivers
+ Xorg does not work in rootless mode on nvidia proprietary drivers
(GNOME)
Tim (darkxst) wrote :

I don't currently have access to my nvidia hardware and, to be honest, I'm not even sure whether rootless Xorg is supported yet on the nvidia drivers. However, I would have assumed that if Wayland is mostly working, then KMS support would be in good enough shape for this to work.

Without xserver-xorg-legacy, gdm should start in Wayland mode, so I'm not sure how that is related to rootless Xorg.

Igor, can you get a backtrace of the mutter crash? As far as I recall, most of the rootless Xorg handling sits at a lower level than mutter and is done by logind and Xorg. systemd-logind passes an fd for your DRM device over to Xorg; are there any log messages related to that?

Igor (invy) wrote :

Tim,

indeed, if KMS is enabled, gdm starts perfectly in Wayland mode and you can use the gnome-shell Wayland session (GLX is broken, however). But if you want to use the gnome-shell Xorg session, gnome-shell crashes, apparently because gdm starts Xorg in rootless mode (I presume with the UID of the user logging in, which makes logical sense).

This is exactly the problem: enabling KMS by default would break the gnome-shell Xorg session.

Regarding the logs: could you give me some hints about what I should look for? Nothing looks suspicious to me.

It's pretty clear that Xorg lacks permissions for some resources; the question is which ones exactly.

Igor (invy) wrote :

Here is another observation (my old workaround):

- KMS is enabled, gdm starts in wayland mode. Trying to start gnome-shell in Xorg mode fails (gnome-shell/mutter crash).

- Switch to a tty (ctrl+alt+f2), log in and start:
  - $ sudo lightdm --test-mode
  - lightdm starts, and the nvidia logo appears for a moment
- Switch back to the tty once again and kill lightdm (ctrl+c)
- Switch back to GDM (ctrl+alt+f1)
- Log in to the gnome-shell Xorg session: everything works fine at this point.

The question is: what does lightdm do that gdm doesn't?

New messages in logs during xorg startup after executing lightdm are:
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:3:0:0
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-0 (boot)
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-1
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-2
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-3
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-4
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-5
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-6
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(0): DFP-7

/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): connected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): 330.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-2: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-4: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-5: 165.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: disconnected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: Internal DisplayPort
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-6: 1440.0 MHz maximum pixel clock
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0):
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-7: disconnected
/usr/lib/gd...
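
A minimal sketch for checking whether a captured Xorg log contains these display-detection lines at all, which is the difference between the failing and the working session described above. The function names and the regular expression are illustrative assumptions based only on the line format quoted in this comment; other driver versions may log differently.

```python
import re

# Matches connection-status lines such as:
#   (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): connected
#   (--) NVIDIA(GPU-0): DFP-1: disconnected
# The pattern is an assumption derived from the log excerpt in this report.
STATUS_RE = re.compile(
    r"NVIDIA\(GPU-\d+\): (?:.*\()?(?P<port>DFP-\d+)\)?: "
    r"(?P<state>connected|disconnected)$"
)


def display_states(log_text):
    """Return {port: state} for every NVIDIA connection-status line."""
    states = {}
    for line in log_text.splitlines():
        match = STATUS_RE.search(line.strip())
        if match:
            states[match.group("port")] = match.group("state")
    return states


def has_connected_display(log_text):
    """True if the driver reported at least one connected display."""
    return "connected" in display_states(log_text).values()


SAMPLE = """\
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): connected
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DELL U2412M (DFP-0): Internal TMDS
/usr/lib/gdm3/gdm-x-session: (--) NVIDIA(GPU-0): DFP-1: disconnected
"""

if __name__ == "__main__":
    print(display_states(SAMPLE))
    print(has_connected_display(SAMPLE))
```

A log from the failing rootless start would be expected to yield an empty mapping here, since these lines only appeared after lightdm had been run once.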


Igor (invy) wrote :

A little bit of debugging and investigation confirms what is in the log above:

libmutter cannot get a monitor and crashes here:
https://github.com/GNOME/mutter/blob/master/src/backends/meta-backend.c#L128-L133

  primary =
    meta_monitor_manager_get_primary_logical_monitor (monitor_manager);

  meta_backend_warp_pointer (backend,
                             primary->rect.x + primary->rect.width / 2,
                             primary->rect.y + primary->rect.height / 2);

because 'primary' is not a valid pointer.

So,
1. This should be reported to the gnome/mutter developers, so that they check their pointers and terminate cleanly with meta_fatal("failed to get primary monitor"); or something like that.
2. We have to understand why libmutter fails to get the primary logical monitor. Does Xorg need some permissions?
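
The kernel segfault line quoted in the bug description can be decoded to cross-check this. The sketch below (function name and regular expression are illustrative assumptions) extracts the faulting address and the offset of the instruction pointer inside the mapped library. A very small faulting address is typical of reading a struct member through a NULL pointer, which would be consistent with 'primary' being NULL, although it does not prove it.

```python
import re

# Format of the kernel message as it appears in this report, e.g.
#   segfault at 28 ip 00007fedba8da7c4 sp ... error 4 in libmutter-0.so.0.0.0[7fedba893000+12f000]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return fault address and module-relative offset, or None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    addr = int(match.group("addr"), 16)
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    return {
        "module": match.group("module"),
        "fault_address": addr,
        "module_offset": offset,
        # Sanity check: the instruction pointer should lie inside the mapping.
        "inside_mapping": 0 <= offset < size,
    }


LINE = (
    "gnome-shell[2067]: segfault at 28 ip 00007fedba8da7c4 "
    "sp 00007ffd2fb5f5a0 error 4 in "
    "libmutter-0.so.0.0.0[7fedba893000+12f000]"
)

if __name__ == "__main__":
    info = decode_segfault(LINE)
    print(info["module"], hex(info["fault_address"]), hex(info["module_offset"]))
```

For the line from this report, the fault address is 0x28 and the offset into libmutter-0.so.0.0.0 is 0x477c4; with matching debug symbols, a tool such as addr2line could map that offset to a source line.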

Igor (invy) on 2017-03-19
summary: - Xorg does not work in rootless mode on nvidia proprietary drivers
- (GNOME)
+ Xorg does not detect displays in rootless mode on nvidia proprietary
+ drivers (GNOME)
Tim (darkxst) wrote :

What hardware are you running? I recall that some hybrid laptops fail to advertise a "primary" display. What is the output of "xrandr -q" in a working rootless X session?

Tim (darkxst) wrote :

Also, does lightdm work with KMS enabled for logging into a gnome-shell session?

Igor (invy) wrote :

I don't have hybrid graphics. It's a normal desktop.

Yes, lightdm works with KMS enabled, and it is possible to log in to the gnome-shell Xorg session.

But lightdm is started as root:
root 10867 0.0 0.0 361724 6620 ? SLsl 15:41 0:00 /usr/sbin/lightdm

Igor (invy) wrote :

xrandr output in rootless session:

igor:~% ps aux | grep Xorg
igor 3779 3.2 0.0 199348 51332 tty3 S+ 15:49 0:00 /usr/lib/xorg/Xorg vt3 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -background none -noreset -keeptty -verbose 3

igor:~% xrandr -q
Screen 0: minimum 8 x 8, current 1920 x 1200, maximum 32767 x 32767
DVI-D-0 connected primary 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
   1920x1200 59.95*+
   1920x1080 60.00
   1680x1050 59.95
   1600x1200 60.00
   1280x1024 60.02
   1280x960 60.00
   1024x768 60.00
   800x600 60.32
   640x480 59.94
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
DP-4 disconnected (normal left inverted right x axis y axis)
DP-5 disconnected (normal left inverted right x axis y axis)
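
As a rough way to answer the "does any output advertise itself as primary" question from saved output, the following sketch parses "xrandr -q" text. The helper names are hypothetical, and the pattern only covers the output-header line format shown above.

```python
import re

# Output header lines look like:
#   DVI-D-0 connected primary 1920x1200+0+0 (...)
#   HDMI-0 disconnected (...)
OUTPUT_RE = re.compile(
    r"^(?P<name>\S+) (?P<state>connected|disconnected)(?P<primary> primary)?"
)


def parse_outputs(xrandr_text):
    """Return a list of (name, is_connected, is_primary) tuples."""
    outputs = []
    for line in xrandr_text.splitlines():
        match = OUTPUT_RE.match(line)
        if match:
            outputs.append(
                (
                    match.group("name"),
                    match.group("state") == "connected",
                    match.group("primary") is not None,
                )
            )
    return outputs


def primary_output(xrandr_text):
    """Return the name of the connected primary output, or None."""
    for name, connected, primary in parse_outputs(xrandr_text):
        if connected and primary:
            return name
    return None


SAMPLE = """\
Screen 0: minimum 8 x 8, current 1920 x 1200, maximum 32767 x 32767
DVI-D-0 connected primary 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
   1920x1200 59.95*+
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-0 disconnected (normal left inverted right x axis y axis)
"""

if __name__ == "__main__":
    print(primary_output(SAMPLE))
```

A result of None would correspond to the situation in which mutter has no primary logical monitor to work with.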

lightdm works because it starts Xorg as root for itself, which triggers monitor initialization; as a result, a rootless session started afterwards works fine.

Tim (darkxst) wrote :

Igor, can you post the full logs from Xorg, and also any systemd-logind logs from journalctl?

Tim (darkxst) wrote :

(before your workaround)

Igor (invy) wrote :

Here you go:

Tim (darkxst) wrote :

The following line is a bit suspicious:
>Mär 21 20:10:20 arvlin /usr/lib/gdm3/gdm-x-session[3618]: (II) systemd-logind: releasing fd for 226:0

AFAIR that would mean the Xorg/nvidia driver no longer has access to the GPU at the point where NVIDIA is trying to load/detect monitors. Strangely, however, it also shows up in your other "working" log.
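
For reading these "releasing fd for MAJOR:MINOR" messages, here is a small sketch that maps a device number to its usual device node. It assumes the conventional Linux numbering (character major 226 for DRM, major 13 with minors 64 to 95 for evdev input devices); the function name is hypothetical, and the actual nodes on a given system should be confirmed with "ls -l /dev/dri /dev/input".

```python
DRM_MAJOR = 226    # conventional character major for DRM devices
INPUT_MAJOR = 13   # conventional character major for input devices


def describe_device(devnum):
    """Map a 'major:minor' string to its conventional device node, or None."""
    major, minor = (int(part) for part in devnum.split(":"))
    if major == DRM_MAJOR:
        if minor < 64:
            # Primary (modesetting-capable) nodes: card0, card1, ...
            return f"/dev/dri/card{minor}"
        if minor >= 128:
            # Render nodes are named after their minor number.
            return f"/dev/dri/renderD{minor}"
    if major == INPUT_MAJOR and 64 <= minor < 96:
        # Legacy evdev range: minor 64 is event0.
        return f"/dev/input/event{minor - 64}"
    return None


if __name__ == "__main__":
    for devnum in ("226:0", "13:66", "13:64"):
        print(devnum, "->", describe_device(devnum))
```

Under those assumptions, 226:0 is the GPU's primary DRM node (/dev/dri/card0), while the 13:64 to 13:67 entries in the shutdown log are input event devices.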

It might be worth filing an upstream bug against gdm.

Igor (invy) wrote :

Hm, this makes sense.

Maybe after the Xorg/nvidia driver detects monitors as root (calling "sudo startx" from the console has the same effect as running lightdm), they are cached somewhere, or maybe the Xorg/nvidia driver changes permissions, so that rootless Xorg can then detect monitors in subsequent sessions.

Igor (invy) wrote :

Jeremy, it's not related.

Also, this is how it is supposed to work. If gdm thinks it cannot start a Wayland session, it won't display the Wayland option.

Why does gdm think the Wayland session is not available, you ask? I presume you have modesetting disabled, because installing or updating the nvidia driver overwrites:

/etc/modprobe.d/nvidia-graphics-drivers.conf -> /etc/alternatives/x86_64-linux-gnu_nvidia_modconf

and puts by default:

options nvidia_381_drm modeset=0

which will prevent wayland from working.
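
A quick way to see which modeset value such a file sets is to parse its "options" lines. This is only a sketch with a hypothetical helper name; it looks at the text of one file and ignores kernel command-line parameters or other modprobe.d files that may override it.

```python
import re

# modprobe.d option lines have the form: options <module> <param>=<value> ...
OPTIONS_RE = re.compile(r"^\s*options\s+(?P<module>\S+)\s+(?P<params>.*)$")


def modeset_settings(conf_text):
    """Return {module_name: modeset_enabled} for each 'modeset=' option found."""
    settings = {}
    for line in conf_text.splitlines():
        match = OPTIONS_RE.match(line)
        if not match:
            continue  # comments, blacklist/alias lines, blank lines
        for param in match.group("params").split():
            key, _, value = param.partition("=")
            if key == "modeset":
                settings[match.group("module")] = value == "1"
    return settings


SAMPLE = """\
# written by the driver package
options nvidia_381_drm modeset=0
"""

if __name__ == "__main__":
    print(modeset_settings(SAMPLE))
```

To inspect the real file, its contents can be read and passed in, for example modeset_settings(open("/etc/modprobe.d/nvidia-graphics-drivers.conf").read()).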

Tim (darkxst) wrote :

[ 163.975845] Call Trace:
[ 163.975854] dump_stack+0x63/0x81
[ 163.975858] __warn+0xcb/0xf0
[ 163.975860] warn_slowpath_null+0x1d/0x20
[ 163.975866] drm_atomic_helper_commit_hw_done+0xab/0xb0 [drm_kms_helper]
[ 163.975869] nvidia_drm_atomic_helper_commit_tail+0x128/0x1d0 [nvidia_drm]
[ 163.975875] commit_tail+0x3f/0x80 [drm_kms_helper]
[ 163.975879] commit_work+0x12/0x20 [drm_kms_helper]
[ 163.975881] process_one_work+0x1fc/0x4b0
[ 163.975883] worker_thread+0x4b/0x500
[ 163.975886] kthread+0x101/0x140
[ 163.975888] ? process_one_work+0x4b0/0x4b0
[ 163.975890] ? kthread_create_on_node+0x60/0x60
[ 163.975893] ret_from_fork+0x2c/0x40
[ 163.975895] ---[ end trace 341cfa538776d33d ]---

Jeremy Bicha (jbicha) on 2017-05-05
tags: added: gnome-1710 wayland
Jeremy Bicha (jbicha) on 2017-05-06
tags: added: gnome-17.10
removed: gnome-1710
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mutter (Ubuntu):
status: New → Confirmed
Changed in xorg (Ubuntu):
status: New → Confirmed