gnome-shell crashes when monitor turned off or switch kvm [meta_monitor_manager_get_logical_monitor_from_number: assertion '(unsigned int) number < g_list_length (manager->logical_monitors)' failed]

Bug #1734044 reported by Prasanth Kumar
This bug affects 7 people
Affects               Status        Importance  Assigned to  Milestone
GNOME Shell           Fix Released  Medium
gnome-shell (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

GNOME Shell crashes every time I turn off the monitor or switch the KVM to my other system.
The following messages were captured in my journalctl log previously:

Nov 10 17:34:31 ubox gnome-shell[1235]: meta_monitor_manager_get_logical_monitor_from_number: assertion '(unsigned int) number < g_list_length (manager->logical_monitors)' failed
Nov 10 17:34:31 ubox gnome-shell[1235]: meta_workspace_get_work_area_for_monitor: assertion 'logical_monitor != NULL' failed
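
The two messages read like a chain: a lookup by monitor number fails its bounds check and returns nothing, and the caller then fails its own NULL check. Below is a minimal C sketch of that pattern, not mutter's actual code. The types `LogicalMonitor`, `MonitorManager` and `Rect` are hypothetical stand-ins, and a plain array replaces the `GList` named in the assertion text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT mutter's real types. */
typedef struct {
    int x, y, width, height;
} Rect;

typedef struct {
    Rect layout;
} LogicalMonitor;

typedef struct {
    LogicalMonitor *logical_monitors; /* plain array instead of a GList */
    size_t n_logical_monitors;        /* 0 while headless */
} MonitorManager;

/* Bounds-checked lookup: returns NULL rather than reading past the end. */
static LogicalMonitor *
get_logical_monitor_from_number(MonitorManager *manager, int number)
{
    assert(manager != NULL);
    if (number < 0 || (size_t) number >= manager->n_logical_monitors) {
        fprintf(stderr, "monitor %d out of range (have %zu)\n",
                number, manager->n_logical_monitors);
        return NULL;
    }
    return &manager->logical_monitors[number];
}

/* Caller that tolerates a missing monitor: reports failure to its own
 * caller instead of dereferencing NULL. */
static bool
get_work_area_for_monitor(MonitorManager *manager, int number, Rect *area)
{
    LogicalMonitor *monitor = get_logical_monitor_from_number(manager, number);

    if (monitor == NULL)
        return false;
    *area = monitor->layout;
    return true;
}
```

With `n_logical_monitors` at 0, which is the state assumed here for a system whose only monitor was unplugged, every monitor number is out of range, so any caller still holding an old monitor index has to handle the failure path.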

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: gnome-shell 3.26.1-0ubuntu5
ProcVersionSignature: Ubuntu 4.13.0-17.20-generic 4.13.8
Uname: Linux 4.13.0-17-generic x86_64
ApportVersion: 2.20.7-0ubuntu3.5
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Wed Nov 22 23:10:24 2017
DisplayManager: gdm3
InstallationDate: Installed on 2017-10-27 (27 days ago)
InstallationMedia: Ubuntu 17.10 "Artful Aardvark" - Release amd64 (20171018)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: gnome-shell
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361234
systemd journal excerpt including backtrace

I have a physical KVM switch that "unplugs" the monitor when switching devices (generally desired behavior for me). After upgrading mutter (3.24.4-1 -> 3.26.1-1) as packaged with ArchLinux, the following assertion is hit when switching screens on a machine that has only the KVM connected:

mutter:ERROR:backends/meta-monitor-manager.c:2267:meta_monitor_manager_get_logical_monitor_from_number: assertion failed: ((unsigned int) number < g_list_length (manager->logical_monitors))

A full excerpt from the systemd journal including a coredump is attached.

As a result, my GNOME session dies every time I switch screens. The assertion is not hit on a laptop (with an internal screen) that is connected to the same KVM. Based on the assertion's text, this seems logical, as it is likely only hit when the only screen is "unplugged".

Thank you for your time working on open-source software!

Revision history for this message
In , Jan Steffens (heftig) wrote :

Probably a duplicate of bug 788607.

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

I hit the same assertion when I switch my screen off/on (Dell P2415Q HiDPI, NVIDIA Quadro 2000 using Nouveau, on Wayland). I have applied the diff of commit 6eb7d13 referenced in bug 788607. Unfortunately, this did not resolve the problem.

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

Created attachment 361300
Backtrace

Revision history for this message
In , Gw-t (gw-t) wrote :

The problem persists with the following ArchLinux packages:

gnome-shell 3.26.1+3+g43ec5280b-1
mutter 3.26.1+7+g41f7a5fdf-1

Based on the git commit in the package versions, I'd assume the fixes from bug 788607 would be included.

However, one overlap I have identified with the comments from bug 788607 is that there is no problem when there are no open windows.

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361462
libmutter.so null pointer dereference backtrace and regs

Null pointer dereference that still occurs with the following packages on ArchLinux:

gnome-shell 3.26.1+3+g43ec5280b-1
mutter 3.26.1+7+g41f7a5fdf-1

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

The trace looks strikingly similar indeed, but where in the JavaScript it came from is harder to determine.

Any way you can attach a gdb to the process and reproduce? Then when you hit the assert, run:

print gjs_dumpstack()

and paste what is printed to stdout/stderr. (note that it might end up in the journal).

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361499
gdb log with JS backtrace

I've added a SIGSEGV catchpoint to invoke gjs_dumpstack(), but the JS stacktrace is empty (also note that there are no JS calls in the GDB stacktrace for the null pointer dereference).

Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: == Stack trace for context 0x55c289d52000 ==
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE)
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: Fatal server error:
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE) failed to read Wayland events: Broken pipe
Oct 13 11:57:43 hostname org.gnome.Shell.desktop[5852]: (EE)
Oct 13 11:58:25 hostname org.gnome.Shell.desktop[6236]: glamor: EGL version 1.4 (DRI2):

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

Georg: your second trace (attachment 361462) looks more similar to mine in bug 788788. I will attach it here as well, it contains more symbols than your trace.

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

Created attachment 361502
Trace of NULL pointer deref in meta_logical_monitor_get_scale

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

The latest changes in the gnome-3-26 branch solve these problems for me. I still have to apply the workaround in bug 788788 though. With mutter from gnome-3-26 and that workaround I can reliably switch my screen off and on without gnome-shell/mutter crashing.

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

*** Bug 789040 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

Created attachment 361654
window/wayland: Handle resizing when headless

We tried to get the geometry scale, which may depend on the main
logical monitor assigned to the window. To avoid dereferencing a NULL
logical monitor when headless, instead assume the geometry scale is 1.
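
The fallback described in this commit message can be sketched in plain C as follows. This is an illustration under stated assumptions, not the pushed patch: `Window`, `LogicalMonitor`, `window_get_geometry_scale` and `window_get_buffer_size` are hypothetical names, and the only behavior taken from the description is "no logical monitor means geometry scale 1".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT mutter's real types. */
typedef struct {
    int scale;
} LogicalMonitor;

typedef struct {
    LogicalMonitor *main_monitor; /* NULL while headless */
    int width;                    /* logical size */
    int height;
} Window;

/* Geometry scale with a headless fallback: when the window has no main
 * logical monitor, assume a scale of 1 instead of dereferencing NULL. */
static int
window_get_geometry_scale(const Window *window)
{
    assert(window != NULL);
    if (window->main_monitor == NULL)
        return 1;
    return window->main_monitor->scale;
}

/* Example consumer: converts the logical size to a scaled size. */
static void
window_get_buffer_size(const Window *window, int *width, int *height)
{
    int scale = window_get_geometry_scale(window);

    *width = window->width * scale;
    *height = window->height * scale;
}
```

In this sketch a resize that arrives while no monitor is connected is computed at scale 1 rather than crashing.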

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361657
ArchLinux PKGBUILD patch to include Jonas' working patch

Thank you, your proposed patch works and solves the issue for me.

I've included the PKGBUILD patch that I used to test this locally for Jan's convenience.

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

Thanks! I will try the patch as well tonight and report back.

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361681
systemd journal of crashing after applying the latest patch

I celebrated too soon!

Seems there is another bug, this one triggered when switching the screen back on. And not a helpful backtrace. Doesn't always reproduce.

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361682
backtrace of the last bug from gdb

Some more information to pinpoint the issue.

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

Could you install debug symbols, get a full backtrace, and attach that? Using gdb, that'd be the output of "backtrace full" on the core dump. FWIW, the new trace could also be bug 788627 or bug 788908.

Revision history for this message
In , Daniël de Kok (danieldk) wrote :

For me it works now: I can reliably turn off and on the screen without gnome-shell crashing. This increases usability significantly :).

There are some glitches though. After switching on the screen again, gnome-terminal windows are 1/4th of the size. Nautilus windows retain the same size but are lo-DPI and zoomed. After switching back and forth to another desktop all the windows are normal again.

Revision history for this message
In , Anatol Pomozov (anatol) wrote :

Jonas, I applied your patch from https://bugzilla.gnome.org/show_bug.cgi?id=788764#c13 and it fixed the crash. Thanks a lot!

But I see another issue, exactly the one described by Daniel. I have a HiDPI monitor with scaling coefficient "2". After the monitor resumes, the terminal becomes scaled down to "1" and Nautilus fonts become blurred. I need to press the "Super" button (show all windows on the current desktop) to return to the correct state.

Revision history for this message
In , Gw-t (gw-t) wrote :

Created attachment 361801
another backtrace of the latest bug

Same bug with different backtrace this time.

I tried building mutter with:
CFLAGS="-flto=no -O0 -ggdb" ./configure --enable-debug …
But still didn't get any useful debug symbols.

Revision history for this message
In , Rui Matos (tiagomatos) wrote :

Review of attachment 361654:

lgtm

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

Comment on attachment 361654
window/wayland: Handle resizing when headless

Attachment 361654 pushed as 3572502 - window/wayland: Handle resizing when headless

Revision history for this message
In , Jonas Ådahl (jadahl) wrote :

Pushed to master. Marking the ARCH patch as "rejected" as I can't find any better patch status for it.

Revision history for this message
Prasanth Kumar (pak314) wrote :

My guess is that this may be related to this fix?
https://bugzilla.gnome.org/show_bug.cgi?id=788764

description: updated
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gnome-shell (Ubuntu):
status: New → Confirmed
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sounds like bug 1721577 or bug 1724557. Someone please review those in detail because all three Launchpad bugs are pointing to the same upstream bug. Three in Launchpad sounds like too many.

Changed in gnome-shell (Ubuntu):
status: Confirmed → Incomplete
Changed in gnome-shell:
importance: Unknown → Medium
status: Unknown → Fix Released
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Also sounds like bug 1717170, which might have been closed too early.

summary: gnome-shell crashes when monitor turned off or switch kvm
+ [meta_monitor_manager_get_logical_monitor_from_number: assertion
+ '(unsigned int) number < g_list_length (manager->logical_monitors)'
+ failed]
Revision history for this message
pataluna (pantaluna) wrote :

This bug was reported for Ubuntu Desktop 17.10.

I would like to add that the problem when disconnecting the only monitor also occurs on Ubuntu Desktop 18.04 LTS (latest updates as of Jul 19, 2018). gnome-shell does not always crash, but the desktop becomes unusable when using x11vnc. The new report #1782573 is already marked as a duplicate.

https://bugs.launchpad.net/ubuntu/+source/gnome-shell/+bug/1782573

https://www.dropbox.com/s/a8jjnw14uhtibzk/ubuntu1804-x11vnc-disconnected-monitor-fatal.mp4?dl=0

tags: added: bionic
removed: artful
Revision history for this message
David White (cppege-david-9ei9ny) wrote :

Is anything happening on this issue? It's been a while ...

Revision history for this message
Prasanth Kumar (pak314) wrote :

Original submitter of the bug replying. I haven't seen this crash in 18.10 for a while now. I'd be okay with closing it.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

The fix for this bug was first released in mutter version 3.27.91-real. So yes Ubuntu 18.04 onward will have the fix.

Changed in gnome-shell:
importance: Medium → Unknown
status: Fix Released → Unknown
Changed in gnome-shell (Ubuntu):
status: Incomplete → Fix Released
Changed in gnome-shell:
importance: Unknown → Medium
status: Unknown → Fix Released