gnome-shell segfault in g_strsplit() on RISC-V

Bug #2012068 reported by Heinrich Schuchardt
This bug affects 1 person
Affects               Status        Importance  Assigned to      Milestone
GNOME Shell           Fix Released  Unknown
gnome-shell (Ubuntu)  Fix Released  High        Daniel van Vugt

Bug Description

I have installed ubuntu-desktop on Ubuntu Lunar using a RISC-V system with Nvidia GT 710 as well as on a system with a Radeon 8450. In both cases gdm3 shows the message "Oh no! Something has gone wrong." instead of the login dialog. Kinetic and Jammy work fine.

Daniel van Vugt (vanvugt) asked me to create a new bug based on the crash file replacing LP #2011271.

ProblemType: Crash
DistroRelease: Ubuntu 23.04
Package: gnome-shell 44~rc-1ubuntu2
Uname: Linux 5.19.0-1004-generic riscv64
Architecture: riscv64
CurrentDesktop: GNOME-Greeter:GNOME
Date: Fri Mar 17 13:27:47 2023
ExecutablePath: /usr/bin/gnome-shell
ExecutableTimestamp: 1678922222
ProcCmdline: /usr/bin/gnome-shell
ProcCwd: /var/lib/gdm3
ProcEnviron:
 LANG=en_US.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/false
 XDG_RUNTIME_DIR=<set>
Signal: 11
SourcePackage: gnome-shell
UserGroups: N/A

Heinrich Schuchardt (xypron) wrote :
description: updated
Daniel van Vugt (vanvugt) wrote :

Waiting on the retracers... In the meantime, what does the journal say around the time of the crash?

Heinrich Schuchardt (xypron) wrote :

Here is the output of journalctl. The crash is at line 1807.

Daniel van Vugt (vanvugt) wrote :

Still waiting on the retracers to analyse CoreDump.gz

BTW, I would expect Wayland to work at least as easily as Xorg. Any reason why Wayland seems to be disabled?

Heinrich Schuchardt (xypron) wrote :

apport-retrace shows the following:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/gnome-shell'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __GI_strchr (s=s@entry=0x13ea <error: Cannot access memory at address 0x13ea>, c_in=c_in@entry=58) at ./string/strchr.c:47
47 ./string/strchr.c: No such file or directory.
[Current thread is 1 (Thread 0x3f9a6f7020 (LWP 1053))]
(gdb) up
#1 0x0000003f9dd4891e in __GI_strstr (haystack=0x13ea <error: Cannot access memory at address 0x13ea>, needle=0x2ae7f53538 ":") at ./string/strstr.c:84
84 ./string/strstr.c: No such file or directory.
(gdb) up
#2 0x0000003f9e213314 in g_strsplit () from /lib/riscv64-linux-gnu/libglib-2.0.so.0
(gdb) up
#3 0x0000002ae7f5303a in ?? ()
(gdb) up
Initial frame selected; you cannot go up.
(gdb)

Daniel van Vugt (vanvugt) wrote :

It looks like there is code calling g_strsplit(0x13ea, ":", ...). There are calls of that form in both mutter and gnome-shell, but only one in gnome-shell is suspicious:

  maybe_add_rpath_introspection_paths()

It's trying to do ELF parsing. To avoid that code path, try building gnome-shell without the option HAVE_EXE_INTROSPECTION.
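
For readers unfamiliar with the pattern, here is a minimal sketch of how a RUNPATH/RPATH string is typically recovered from a dynamic section. This is not the gnome-shell source: lookup_runpath is a hypothetical helper written for illustration, and it assumes that the DT_STRTAB entry already holds a valid run-time address. If that assumption does not hold, the returned pointer is garbage, which would be consistent with the bad 0x13ea argument in the backtrace.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not gnome-shell code: return the RUNPATH (or RPATH)
 * string described by a dynamic array, or NULL if there is none.
 * Assumes DT_STRTAB's d_ptr is already a valid run-time address. */
static const char *
lookup_runpath (const ElfW (Dyn) *dynamic)
{
  const char *strtab = NULL;
  const ElfW (Dyn) *path = NULL;
  const ElfW (Dyn) *dyn;

  assert (dynamic != NULL);

  for (dyn = dynamic; dyn->d_tag != DT_NULL; dyn++)
    {
      if (dyn->d_tag == DT_STRTAB)
        strtab = (const char *) (uintptr_t) dyn->d_un.d_ptr;
      else if (dyn->d_tag == DT_RUNPATH)
        path = dyn;                     /* RUNPATH wins over RPATH */
      else if (dyn->d_tag == DT_RPATH && path == NULL)
        path = dyn;
    }

  if (strtab == NULL || path == NULL)
    return NULL;

  /* d_val is an offset into the string table. If strtab is not a valid
   * run-time address, this sum is a wild pointer like the 0x13ea above. */
  return strtab + path->d_un.d_val;
}
```

The result is a colon-separated list, which is why a caller would pass it to g_strsplit() with ":" as the delimiter.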

Daniel van Vugt (vanvugt) wrote :

Also try running 'mutter --wayland' in a VT so we can rule out mutter.

Heinrich Schuchardt (xypron) wrote :

I rebuilt the gnome-shell package with this patch:

Index: gnome-shell-44.0/meson.build
===================================================================
--- gnome-shell-44.0.orig/meson.build
+++ gnome-shell-44.0/meson.build
@@ -148,7 +148,7 @@ cdata.set('HAVE_FDWALK', cc.has_function
 cdata.set('HAVE_MALLINFO', cc.has_function('mallinfo'))
 cdata.set('HAVE_MALLINFO2', cc.has_function('mallinfo2'))
 cdata.set('HAVE_SYS_RESOURCE_H', cc.has_header('sys/resource.h'))
-cdata.set('HAVE_EXE_INTROSPECTION',
+cdata.set('HAVE_EXE_INTROSPECTION', false and
   cc.has_header('elf.h') and cc.has_header('link.h'))
 cdata.set('HAVE__NL_TIME_FIRST_WEEKDAY',
   cc.has_header_symbol('langinfo.h', '_NL_TIME_FIRST_WEEKDAY')

With this change gdm3 works as expected using X11.

Heinrich Schuchardt (xypron) wrote :

Wayland works fine too with the patch.

Daniel van Vugt (vanvugt) wrote :

\o/

Heinrich Schuchardt (xypron) wrote :

Should we check CMAKE_SYSTEM_PROCESSOR here and disable introspection on riscv64 only?

Daniel van Vugt (vanvugt) wrote :

Before doing a workaround I want to see if upstream have some ideas:
https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/6528

We could also investigate the offending function to see if it is sane for all architectures.

Heinrich Schuchardt (xypron) wrote :

I have been single stepping through maybe_add_rpath_introspection_paths() on riscv64.

strtab is set from this value of dyn:

dyn = {
 .d_tag = 0x5,
 .d_un = { .d_val = 0xbe8, .d_ptr = 0xbe8},
}

dyn.d_un.d_ptr is not a valid pointer in memory.

objdump -S -D /usr/bin/gnome-shell shows

* section .dynstr starting at 0xbe8
* section .dynamic starting at 0x4d50

But this does not imply that the sections are loaded to these addresses.
The value of _DYNAMIC on my system was 0x2aaaaed50 and not 0x4d50.
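
The arithmetic behind these numbers can be written out as a small sketch. The helper names load_bias and relocate are hypothetical, and the only inputs are the values quoted above. The offset 0x802 in the usage note is an inference from the backtrace address 0x13ea, not something read from the binary.

```c
#include <assert.h>
#include <stdint.h>

/* Load bias: where a section really is at run time minus the address
 * recorded for it in the ELF file. */
static uint64_t
load_bias (uint64_t runtime_addr, uint64_t link_addr)
{
  assert (runtime_addr >= link_addr);
  return runtime_addr - link_addr;
}

/* Turn a link-time address from the file into a run-time address. */
static uint64_t
relocate (uint64_t link_addr, uint64_t bias)
{
  return link_addr + bias;
}
```

With _DYNAMIC at 0x2aaaaed50 and .dynamic at 0x4d50 in the file, load_bias() gives 0x2aaaaa000, so the string table would really live at relocate(0xbe8, 0x2aaaaa000) = 0x2aaaaabe8. Using the raw 0xbe8 instead and adding a string offset of 0x802 yields 0x13ea, the address that faulted in strchr().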

Heinrich Schuchardt (xypron) wrote :

On arm64 the value of dyn.d_un.d_ptr is a relocated value and not the one from the binary before relocation.

Heinrich Schuchardt (xypron) wrote :

For RISC-V and MIPS DL_RO_DYN_SECTION is defined as 1. Hence the .dynamic section is not relocated in glibc's function elf_get_dynamic_info().

See

sysdeps/riscv/dl-relocate-ld.h:23:
#define DL_RO_DYN_SECTION 1

sysdeps/mips/dl-relocate-ld.h:23:
#define DL_RO_DYN_SECTION 1
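
One way a program could cope with both behaviours is sketched below. This is only an illustration under stated assumptions, not the patch that was eventually applied: find_main_strtab, main_program_strtab and struct strtab_query are hypothetical names, and the "address below the load base means unrelocated" test is a heuristic. dl_iterate_phdr() is the real glibc interface; its first callback describes the main program.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for dl_iterate_phdr() */
#endif
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct strtab_query
{
  const char *strtab;           /* run-time address of .dynstr, or NULL */
};

/* Callback for dl_iterate_phdr(): inspect the first object only. */
static int
find_main_strtab (struct dl_phdr_info *info, size_t size, void *data)
{
  struct strtab_query *query = data;
  ElfW (Addr) base = info->dlpi_addr;
  size_t i;

  (void) size;
  assert (query != NULL);

  for (i = 0; i < info->dlpi_phnum; i++)
    {
      const ElfW (Phdr) *phdr = &info->dlpi_phdr[i];
      const ElfW (Dyn) *dyn;

      if (phdr->p_type != PT_DYNAMIC)
        continue;

      dyn = (const ElfW (Dyn) *) (uintptr_t) (base + phdr->p_vaddr);
      for (; dyn->d_tag != DT_NULL; dyn++)
        {
          if (dyn->d_tag == DT_STRTAB)
            {
              ElfW (Addr) addr = dyn->d_un.d_ptr;

              /* Heuristic (assumption): a value below the load base was
               * left unrelocated, as on RISC-V and MIPS. */
              if (addr < base)
                addr += base;
              query->strtab = (const char *) (uintptr_t) addr;
            }
        }
    }

  return 1;                     /* non-zero stops after the main program */
}

static const char *
main_program_strtab (void)
{
  struct strtab_query query = { NULL };

  dl_iterate_phdr (find_main_strtab, &query);
  return query.strtab;
}
```

On architectures where glibc relocates the dynamic section the address is already at or above the load base and is returned unchanged, so the same code path serves both cases. A statically linked program has no PT_DYNAMIC segment, in which case the function returns NULL.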

Heinrich Schuchardt (xypron) wrote :

According to the code comment "The dynamic section is readonly for ABI compatibility" we cannot expect glibc to change the behavior in future.

Cf. https://patchwork.ozlabs<email address hidden>/

Changed in gnome-shell (Ubuntu):
assignee: nobody → Daniel van Vugt (vanvugt)
status: New → In Progress
importance: Undecided → High
summary: - gdm3: "Oh no! Something has gone wrong."
+ gnome-shell segfault in g_strsplit() on RISC-V
Daniel van Vugt (vanvugt) wrote :
tags: added: riscv
Changed in gnome-shell:
status: Unknown → New
information type: Private → Public
Brian Murray (brian-murray) wrote :

For the record there are no retracers for riscv64 yet.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gnome-shell - 44.0-2ubuntu3

---------------
gnome-shell (44.0-2ubuntu3) lunar; urgency=medium

  * Cherry-pick fix for installing extensions (LP: #2013073)
  * Add proposed patch to fix running gnome-shell on RISC-V (LP: #2012068)

 -- Jeremy Bicha <email address hidden> Fri, 31 Mar 2023 10:18:23 -0400

Changed in gnome-shell (Ubuntu):
status: In Progress → Fix Released
Changed in gnome-shell:
status: New → Fix Released
tags: added: fixed-in-gnome-shell-45.beta fixed-upstream