glib2.0: Uninitialised memory is written to gschema.compiled, failure to parse this file leads to gdm, gnome-shell failing to start
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GLib | Fix Released | Unknown | |
glib2.0 (Ubuntu) | Fix Released | Undecided | Matthew Ruffell |
Focal | Fix Released | High | Matthew Ruffell |
Bug Description
[Impact]
A recent SRU of mutter 3.36.9-
After much investigation, mutter was not at fault. Instead, mutter-common calls the libglib2.0-0 hook on upgrade:
Processing triggers for libglib2.0-0:amd64 (2.64.6-
This in turn calls glib-compile-
Now, glib2.0 2.64.6-
# glib-compile-
# cmp -l /home/ubuntu/
0000376F E3 D0
00003771 A4 DB
# glib-compile-
# cmp -l /home/ubuntu/
0000376F E3 C3
00003771 A4 98
# glib-compile-
# cmp -l /home/ubuntu/
0000376F E3 68
00003771 A4 30
00003772 55 56
The bytes on the left are from a corrupted gschemas.compiled provided by an affected user. The changing bytes on the right are non-deterministic.
I ran valgrind over glib-compile-
https:/
What is happening is that a submodule of glib, gvdb, contains the logic for serialising the gschema data structures, and when it allocates a buffer to store the eventual gschemas.compiled file, it does not initialise it.
When we populate the fields in the buffer, some bytes are never overwritten, and these junk bytes find themselves written to gschemas.compiled.
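The mechanism can be illustrated with a small simulation. This is only a sketch, not the gvdb code: Python has no uninitialised memory, so random bytes from os.urandom stand in for whatever a fresh allocation happens to contain, and the record layout (a 1-byte tag, 3 padding bytes, then a 4-byte value) is a hypothetical one chosen so that some bytes are never written.

```python
import os
import struct

RECORD_SIZE = 8  # hypothetical layout: 1-byte tag, 3 padding bytes, 4-byte value


def serialise(records, zero_init):
    """Pack (tag, value) pairs into a buffer, writing only the real fields."""
    size = RECORD_SIZE * len(records)
    # Assumption: os.urandom() stands in for the junk left in unzeroed memory.
    buf = bytearray(size) if zero_init else bytearray(os.urandom(size))
    for i, (tag, value) in enumerate(records):
        base = i * RECORD_SIZE
        struct.pack_into("<B", buf, base, tag)        # byte 0 of the record
        struct.pack_into("<I", buf, base + 4, value)  # bytes 4..7 of the record
        # Bytes 1..3 are padding and are never written.
    return bytes(buf)


records = [(1, 100), (2, 200), (3, 300)]

# Zero-initialised buffers give the same output on every run.
stable = serialise(records, zero_init=True) == serialise(records, zero_init=True)

# Without initialisation the padding bytes keep whatever was there before,
# so repeated runs over the same input are very unlikely to match.
junk_a = serialise(records, zero_init=False)
junk_b = serialise(records, zero_init=False)
```

In both junk_a and junk_b the tag and value fields are identical; only the padding differs, which mirrors how a handful of bytes change between otherwise identical compiles.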
On boot, when gdm and gnome-shell attempt to parse and load this corrupted gschemas.compiled file, they cannot parse the junk bytes and raise an error, which propagates up to a breakpoint in glib logging. No debugger is present, so the kernel traps the breakpoint and terminates the library and its calling application, e.g. gdm.
The result is that the user is left staring at a black screen with a blinking pointer.
[Testcase]
On a Focal system, simply run valgrind over glib-compile-
# valgrind glib-compile-
You will get output like this, with the warning "Syscall param write(buf) points to uninitialised byte(s)":
https:/
If you happen to have a large number of gschema overrides present on your system, like my affected user does, you can save a copy of a generated gschemas.compiled to your home directory and bindiff it against recompiles:
# glib-compile-
# cp /usr/share/
# glib-compile-
# cmp -l /home/ubuntu/
0000376F E3 C3
00003771 A4 98
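The same byte-level comparison can be sketched in Python for cases where a scripted check is more convenient. The helper names (diff_bytes, diff_files, format_diffs) are hypothetical and the sample data is made up for illustration. Offsets here are zero-based, and bytes past the end of the shorter input are ignored.

```python
def diff_bytes(old, new):
    """Return (offset, old_byte, new_byte) for each position where the inputs differ."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


def diff_files(path_a, path_b):
    """Read two files fully and compare them byte by byte."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return diff_bytes(fa.read(), fb.read())


def format_diffs(diffs):
    """Render differences as 'OFFSET OLD NEW' in hexadecimal."""
    return [f"{offset:08X} {a:02X} {b:02X}" for offset, a, b in diffs]


# Made-up sample data: two 4-byte buffers differing at offsets 1 and 3.
sample_old = bytes([0x00, 0xE3, 0x10, 0xA4])
sample_new = bytes([0x00, 0xC3, 0x10, 0x98])
report = format_diffs(diff_bytes(sample_old, sample_new))
```

An empty list from diff_files on a saved copy and a fresh recompile would indicate that the two outputs are byte-for-byte identical over their common length.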
Install the test package from the following PPA:
https:/
When you run valgrind, it will report a clean run with no writing to uninitialised buffers, and all invocations of glib-compile-
[Where problems can occur]
I am doubtful that any programs are relying on buggy non-deterministic behaviour from random bytes found in uninitialised memory, so this should be a relatively safe change.
Since we are updating glib, which all GNOME applications, gdm and gnome-shell link to, an error introduced here could cause these applications to stop working and, in the worst case, reproduce the symptoms this bug is trying to fix: a blinking cursor on a blank screen.
Installing any update to glib also causes the gsettings gschema cache to be regenerated. From this bug, we know that libglib seems to trust the gschemas.compiled file and does not perform much validation, so if the user has bad data in their gschema files, it could lead to their system having issues on next boot.
If a regression occurs, users should first attempt to re-generate their schemas like so:
glib-compile-
and if that fails, then they should downgrade their libglib2.0-0, libglib2.0-bin and libglib2.0-data packages.
[Other info]
This was fixed by the commit:
commit ea64c739239faea
Author: Philip Withnall <email address hidden>
Date: Wed Mar 18 09:15:59 2020 +0000
Subject: gvdb-builder: Initialise some memory to zero in the bloom filter
Link: https:/
Only Focal needs this patch, Groovy and up are unaffected.
Changed in mutter (Ubuntu Focal): | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → Matthew Ruffell (mruffell) |
tags: | added: regression-update sts |
description: | updated |
tags: | added: focal |
summary: |
- gdm fails to start with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates
+ gdm fails to start in a VMware Horizon VDI environment with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates
Changed in glib2.0 (Ubuntu): | |
assignee: | nobody → Matthew Ruffell (mruffell) |
Changed in mutter (Ubuntu): | |
assignee: | nobody → Matthew Ruffell (mruffell) |
Changed in glib2.0 (Ubuntu Focal): | |
assignee: | nobody → Matthew Ruffell (mruffell) |
no longer affects: | mutter (Ubuntu) |
no longer affects: | mutter (Ubuntu Focal) |
Changed in glib2.0 (Ubuntu): | |
status: | New → Fix Released |
Changed in glib2.0 (Ubuntu Focal): | |
status: | New → In Progress |
importance: | Undecided → High |
summary: |
- gdm fails to start in a VMware Horizon VDI environment with latest mutter 3.36.9-0ubuntu0.20.04.1 in focal-updates
+ glib2.0: Uninitialised memory is written to gschema.compiled, failure to parse this file leads to gdm, gnome-shell failing to start
description: | updated |
tags: | added: fixed-in-2.65.0 fixed-upstream |
Changed in glib: | |
status: | Unknown → Fix Released |
I built a test package based on mutter 3.36.9-0ubuntu0.20.04.1, and reverted the three commits introduced by LP #1905825, namely:
commit: 92834d8feceeac538299a47a8c742e155de4e6e8
From: Kai-Heng Feng <email address hidden>
Date: Mon, 21 Dec 2020 14:34:43 +0800
Subject: renderer/native: Refactor modeset boilerplate into new helpers
Link: https://gitlab.gnome.org/GNOME/mutter/-/commit/92834d8feceeac538299a47a8c742e155de4e6e8
commit: 097af7ddb381606da74c737859cc92fff72ed417
From: Kai-Heng Feng <email address hidden>
Date: Mon, 21 Dec 2020 14:59:32 +0800
Subject: monitor-manager-kms: Disable CRTCs if there is no monitor
Link: https://gitlab.gnome.org/GNOME/mutter/-/commit/097af7ddb381606da74c737859cc92fff72ed417
commit: 93f3ce3c305571bfc39f6d9e5d221e1b60a920a4 Mon Sep 17 00:00:00 2001
From: Kai-Heng Feng <email address hidden>
Date: Fri, 13 Nov 2020 14:19:26 +0800
Subject: monitor-manager-xrandr: Disable CRTCs if there is no monitor
Link: https://gitlab.gnome.org/GNOME/mutter/-/commit/93f3ce3c305571bfc39f6d9e5d221e1b60a920a4
The test package is available in [1]:
[1] https://launchpad.net/~mruffell/+archive/ubuntu/sf311791-test
I provided the test package to the affected user, and they installed it in a test VDI on their VMware Horizon environment, and wrote back that it works.
So, it seems to be the CRTC changes from bug 1905825.
The affected user's environment uses Nvidia GRID GPUs, passed into the VDI. They are using the Nvidia GRID 450.51.05 driver.
I have been trying to reproduce this for a few days now. Google Cloud has an option to pass a Nvidia P4 GPU into the instance, and turn on Nvidia GRID, so I have been using that platform.
I have tried with and without a GPU, with multiple versions of mutter, and with and without VMware Horizon Viewagent 7.13, but in each of my runs, I see gdm3 running, and I see the usual processes that get spawned off of it. I have tried multiple versions of the Nvidia GRID driver, available from [2], but they appear to act the same.
[2] https://cloud.google.com/compute/docs/gpus/grid-drivers-table
I think I am missing something, but I'm not sure what. I have tried with and without the "virtual display" feature, and when it is enabled and I view the screenshot, I just see a black screen with a blinking cursor, no matter what mutter package I have installed.
It could be that a virtual hotplug is needed, but I'm not sure how that can be achieved on Google Cloud, or how it would happen on the affected user's VMware Horizon on VMware ESXi environment.
I will ask the affected user for some more logs, so we can better see what is going on.