Activity log for bug #1930359

Date Who What changed Old value New value Message
2021-06-01 05:27:48 Matthew Ruffell bug added bug
2021-06-01 05:27:57 Matthew Ruffell nominated for series Ubuntu Focal
2021-06-01 05:27:57 Matthew Ruffell bug task added mutter (Ubuntu Focal)
2021-06-01 05:28:03 Matthew Ruffell mutter (Ubuntu Focal): status New In Progress
2021-06-01 05:28:06 Matthew Ruffell mutter (Ubuntu Focal): importance Undecided High
2021-06-01 05:28:08 Matthew Ruffell mutter (Ubuntu Focal): assignee Matthew Ruffell (mruffell)
2021-06-01 05:28:30 Matthew Ruffell tags regression-update sts
2021-06-01 05:28:44 Matthew Ruffell description [Impact] gdm fails to start in a VMware Horizon VDI environment, with Nvidia GRID gpus passed into the VDIs. Downgrading mutter from 3.36.9-0ubuntu0.20.04.1 to 3.36.1-3ubuntu3 in -release fixes the issue, and the issue does not occur with 3.36.7+git20201123-0.20.04.1. Currently looking into what landed in bug 1919143 and bug 1905825. [Testcase] [Where problems can occur] [Impact] gdm fails to start in a VMware Horizon VDI environment, with Nvidia GRID gpus passed into the VDIs. Downgrading mutter from 3.36.9-0ubuntu0.20.04.1 to 3.36.1-3ubuntu3 in -release fixes the issue, and the issue does not occur with 3.36.7+git20201123-0.20.04.1. Currently looking into what landed in bug 1919143 and bug 1905825. [Testcase] [Where problems can occur]
2021-06-01 05:52:47 Daniel van Vugt tags regression-update sts focal regression-update sts
2021-06-01 05:53:40 Daniel van Vugt bug added subscriber Daniel van Vugt
2021-06-01 07:16:22 Dominique Poulain bug added subscriber Dominique Poulain
2021-06-02 17:16:35 Marco Trevisan (Treviño) bug added subscriber Marco Trevisan (Treviño)
2021-06-18 05:10:11 Matthew Ruffell attachment added Output of systemctl status and nvidia smi https://bugs.launchpad.net/ubuntu/+source/mutter/+bug/1930359/+attachment/5505411/+files/aws-working-g4dn-xlarge.txt
2021-06-18 05:10:51 Matthew Ruffell attachment added screenshot of working gdm on AWS https://bugs.launchpad.net/ubuntu/+source/mutter/+bug/1930359/+attachment/5505412/+files/Screenshot%20from%202021-06-18%2013-40-15.png
2021-06-18 13:30:09 Kai-Heng Feng bug watch added https://gitlab.gnome.org/GNOME/mutter/-/issues/1619
2021-06-18 13:30:19 Kai-Heng Feng bug added subscriber Kai-Heng Feng
2021-06-21 06:03:57 Daniel van Vugt summary gdm fails to start with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates gdm fails to start in a VMware Horizon VDI environment with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates
2021-06-29 23:04:14 Matthew Ruffell mutter (Ubuntu): status New Invalid
2021-06-29 23:04:20 Matthew Ruffell mutter (Ubuntu Focal): importance High Undecided
2021-06-29 23:04:23 Matthew Ruffell mutter (Ubuntu Focal): status In Progress Invalid
2021-07-01 05:48:03 Matthew Ruffell mutter (Ubuntu Focal): importance Undecided High
2021-07-01 05:48:06 Matthew Ruffell mutter (Ubuntu Focal): status Invalid In Progress
2021-07-01 05:48:23 Matthew Ruffell mutter (Ubuntu): status Invalid New
2021-07-02 06:24:22 Daniel van Vugt bug task added glib2.0 (Ubuntu)
2021-07-02 06:24:44 Daniel van Vugt glib2.0 (Ubuntu): assignee Matthew Ruffell (mruffell)
2021-07-02 06:24:54 Daniel van Vugt mutter (Ubuntu): assignee Matthew Ruffell (mruffell)
2021-07-02 06:25:02 Daniel van Vugt glib2.0 (Ubuntu Focal): assignee Matthew Ruffell (mruffell)
2021-07-12 04:58:10 Matthew Ruffell bug task deleted mutter (Ubuntu)
2021-07-12 04:58:17 Matthew Ruffell bug task deleted mutter (Ubuntu Focal)
2021-07-12 04:58:24 Matthew Ruffell glib2.0 (Ubuntu): status New Fix Released
2021-07-12 04:58:27 Matthew Ruffell glib2.0 (Ubuntu Focal): status New In Progress
2021-07-12 04:58:32 Matthew Ruffell glib2.0 (Ubuntu Focal): importance Undecided High
2021-07-12 04:58:47 Matthew Ruffell summary gdm fails to start in a VMware Horizon VDI environment with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates glib2.0: Uninitialised memory is written to gschema.compiled, failure to parse this file leads to gdm, gnome-shell failing to start
2021-07-12 04:59:06 Matthew Ruffell description [Impact] gdm fails to start in a VMware Horizon VDI environment, with Nvidia GRID gpus passed into the VDIs. Downgrading mutter from 3.36.9-0ubuntu0.20.04.1 to 3.36.1-3ubuntu3 in -release fixes the issue, and the issue does not occur with 3.36.7+git20201123-0.20.04.1. Currently looking into what landed in bug 1919143 and bug 1905825. [Testcase] [Where problems can occur] [Impact] A recent SRU of mutter 3.36.9-0ubuntu0.20.04.1 caused an outage for a user with 300 VDIs running Focal, where GNOME applications would fail to start, and if you reboot, gdm and gnome-shell both fail to start, and you are left with a black screen and a blinking cursor. After much investigation, mutter was not at fault. Instead, mutter-common calls the libglib2.0-0 hook on upgrade: Processing triggers for libglib2.0-0:amd64 (2.64.6-1~ubuntu20.04.3) ... This in turn calls glib-compile-schemas to recompile the gsettings gschema cache, from the files in /usr/share/glib-2.0/schemas/. The result is a binary gschemas.compiled file, which is loaded by libglib2.0 on every invocation of a GNOME application, or gdm or gnome-shell to fetch application default settings. 
Now, glib2.0 2.64.6-1~ubuntu20.04.3 in Focal has some non-deterministic behaviour when calling glib-compile-schemas, causing generated gschemas.compiled files to have differing contents on each run: # glib-compile-schemas /usr/share/glib-2.0/schemas # cmp -l /home/ubuntu/schemas/gschemas.compiled /usr/share/glib-2.0/schemas/gschemas.compiled | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}' 0000376F E3 D0 00003771 A4 DB # glib-compile-schemas /usr/share/glib-2.0/schemas # cmp -l /home/ubuntu/schemas/gschemas.compiled /usr/share/glib-2.0/schemas/gschemas.compiled | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}' 0000376F E3 C3 00003771 A4 98 # glib-compile-schemas /usr/share/glib-2.0/schemas # cmp -l /home/ubuntu/schemas/gschemas.compiled /usr/share/glib-2.0/schemas/gschemas.compiled | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}' 0000376F E3 68 00003771 A4 30 00003772 55 56 The bytes on the left are from a corrupted gschemas.compiled provided by an affected user. The changing bytes on the right are non-deterministic. I ran valgrind over glib-compile-schemas, and found that we are writing uninitialised memory to the output file. https://paste.ubuntu.com/p/hvZccwdzxz/ What is happening is that a submodule of glib, gvdb, contains the logic for serialising the gschema data structures, and when it allocates a buffer to store the eventual gschemas.compiled file, it does not initialise it. When we populate the fields in the buffer, some bytes are never overwritten, and these junk bytes find themselves written to gschemas.compiled. On boot, when gdm and gnome-shell attempt to parse and load this corrupted gschemas.compiled file, the junk bytes can't be parsed, which raises an error that propagates up to a breakpoint in glib logging. No debugger is present, so the kernel traps the breakpoint and terminates the process that loaded the library, e.g. gdm. 
The result is that the user is left staring at a black screen with a blinking cursor. [Testcase] On a Focal system, simply run valgrind over glib-compile-schemas: # valgrind glib-compile-schemas /usr/share/glib-2.0/schemas You will get output like this, with the warning "Syscall param write(buf) points to uninitialised byte(s)": https://paste.ubuntu.com/p/hvZccwdzxz/ If you happen to have a large number of gschema overrides present on your system, like my affected user does, you can save a copy of a generated gschemas.compiled to your home directory and bindiff it against recompiles: # glib-compile-schemas /usr/share/glib-2.0/schemas # cp /usr/share/glib-2.0/schemas/gschemas.compiled /home/ubuntu/schemas/gschemas.compiled # glib-compile-schemas /usr/share/glib-2.0/schemas # cmp -l /home/ubuntu/schemas/gschemas.compiled /usr/share/glib-2.0/schemas/gschemas.compiled | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}' 0000376F E3 C3 00003771 A4 98 If you install the test package from the following ppa: https://launchpad.net/~mruffell/+archive/ubuntu/sf311791-test When you run valgrind, it will report a clean run with no uninitialised bytes being written, and all invocations of glib-compile-schemas will be deterministic, generating the same file with the same sha256 hash every time. If you do a bindiff from before and after, the previously unwritten bytes will all be set to zero. [Where problems can occur] I am doubtful that any programs are relying on buggy non-deterministic behaviour from random bytes found in uninitialised memory, so this should be a relatively safe change. Since we are updating glib, which all GNOME applications, gdm and gnome-shell link to, if we introduce an error, it could cause these applications to stop working, and in the worst case, cause the same symptoms this bug is trying to fix, which is a blinking cursor on a blank screen. 
Installing any updates to glib also causes the gsettings gschema cache to be re-generated, and from this bug, we know that libglib seems to trust the gschemas.compiled file and doesn't perform much validation. If the user has bad data in their gschema files, it could lead to their systems having issues on next boot. If a regression occurs, users should first attempt to re-generate their schemas like so: glib-compile-schemas /usr/share/glib-2.0/schemas and if that fails, then they should downgrade their libglib2.0-0 libglib2.0-bin libglib2.0-data packages. [Other info] This was fixed by the commit: commit ea64c739239faea463f3cb9154a12cc4532ba525 Author: Philip Withnall <withnall@endlessm.com> Date: Wed Mar 18 09:15:59 2020 +0000 Subject: gvdb-builder: Initialise some memory to zero in the bloom filter Link: https://github.com/GNOME/glib/commit/ea64c739239faea463f3cb9154a12cc4532ba525 Only Focal needs this patch, Groovy and up are unaffected.
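The determinism check from the test case in the description above can be wrapped in a small script. This is only a sketch and is not part of the original bug report: the function names hexdiff and check_deterministic and the COMPILER override are hypothetical, introduced here for illustration. It assumes glib-compile-schemas accepts --targetdir, so that the system copy of gschemas.compiled is never touched, and it uses the shell's printf for the octal-to-hex conversion in place of gawk's strtonum().

```shell
#!/bin/sh
# Sketch only: hexdiff, check_deterministic and COMPILER are hypothetical names.

# hexdiff FILE_A FILE_B
# Print every differing byte as "OFFSET A B" in hex, in the same layout as the
# cmp | gawk pipeline used in the bug description.
hexdiff() {
    cmp -l "$1" "$2" 2>/dev/null | while read -r offset left right; do
        # cmp -l prints a decimal 1-based offset and two octal byte values;
        # the leading 0 makes printf read the byte values as octal.
        printf '%08X %02X %02X\n' "$offset" "0$left" "0$right"
    done
}

# check_deterministic SCHEMA_DIR
# Compile SCHEMA_DIR twice into scratch directories and compare the outputs.
# Returns 0 if they are identical, 1 if they differ, 2 if compiling failed.
check_deterministic() {
    schema_dir=$1
    # COMPILER lets a stand-in be substituted for glib-compile-schemas.
    compiler=${COMPILER:-glib-compile-schemas}
    scratch=$(mktemp -d) || return 2
    mkdir "$scratch/run1" "$scratch/run2"

    if ! "$compiler" --targetdir="$scratch/run1" "$schema_dir" ||
       ! "$compiler" --targetdir="$scratch/run2" "$schema_dir"; then
        rm -rf "$scratch"
        return 2
    fi

    if cmp -s "$scratch/run1/gschemas.compiled" "$scratch/run2/gschemas.compiled"; then
        echo "deterministic"
        result=0
    else
        echo "non-deterministic, differing bytes:"
        hexdiff "$scratch/run1/gschemas.compiled" "$scratch/run2/gschemas.compiled"
        result=1
    fi

    rm -rf "$scratch"
    return "$result"
}
```

After sourcing the file, running check_deterministic /usr/share/glib-2.0/schemas would be expected to report differing bytes on an affected Focal system with many overrides, and "deterministic" once the fixed glib2.0 is installed. A system with few schemas may not show a difference even when affected, so valgrind remains the more reliable check.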
2021-07-12 05:06:25 Matthew Ruffell attachment added Debdiff for glib2.0 for Focal https://bugs.launchpad.net/ubuntu/+source/glib2.0/+bug/1930359/+attachment/5510466/+files/lp1930359_focal.debdiff
2021-07-12 05:06:45 Matthew Ruffell tags focal regression-update sts focal sts sts-sponsor
2021-07-12 05:39:47 Daniel van Vugt bug watch added https://gitlab.gnome.org/GNOME/gvdb/-/issues/2
2021-07-12 05:39:47 Daniel van Vugt bug task added glib
2021-07-12 05:42:14 Daniel van Vugt tags focal sts sts-sponsor fixed-in-2.65.0 fixed-upstream focal sts sts-sponsor
2021-07-13 21:02:56 Brian Murray glib2.0 (Ubuntu Focal): status In Progress Fix Committed
2021-07-13 21:02:58 Brian Murray bug added subscriber Ubuntu Stable Release Updates Team
2021-07-13 21:03:01 Brian Murray bug added subscriber SRU Verification
2021-07-13 21:03:06 Brian Murray tags fixed-in-2.65.0 fixed-upstream focal sts sts-sponsor fixed-in-2.65.0 fixed-upstream focal sts sts-sponsor verification-needed verification-needed-focal
2021-07-14 00:50:30 Matthew Ruffell tags fixed-in-2.65.0 fixed-upstream focal sts sts-sponsor verification-needed verification-needed-focal fixed-in-2.65.0 fixed-upstream focal sts verification-done verification-done-focal
2021-07-23 00:45:28 Chris Halse Rogers removed subscriber Ubuntu Stable Release Updates Team
2021-07-23 00:46:09 Launchpad Janitor glib2.0 (Ubuntu Focal): status Fix Committed Fix Released
2022-05-20 16:05:36 Bug Watch Updater glib: status Unknown Fix Released