Snap applications segfault with new core20 (rev: 1015+)

Bug #1926355 reported by Łukasz Zemczak on 2021-04-27
This bug affects 1 person
Affects | Importance | Assigned to
Snapcraft | Undecided | Unassigned
snap-core20 | Critical | Unassigned
glibc (Ubuntu) | Undecided | Balint Reczey
glibc (Ubuntu Focal) | Undecided | Unassigned
glibc (Ubuntu Groovy) | Undecided | Unassigned
glibc (Ubuntu Hirsute) | Undecided | Unassigned

Bug Description

[Impact]

* Core20 snap built with updated glibc crashes snaps also bundling glibc.

[Test Plan]

* Build core20 snap with glibc in focal-proposed. Test a snap (which is not core20) bundling glibc:

TODO: install locally built core20?
snap install test-snapd-rsync-core20 --edge
snap run test-snapd-rsync-core20.rsync

[Where problems could occur]

* The previous glibc update (2.31-0ubuntu9.3) had a fix (LP: #1914044) that broke snaps bundling a previous version of glibc (2.31-0ubuntu9.2) due to them being incompatible. The fix of LP: #1914044 is reverted and 2.31-0ubuntu9.4 does not include changes incompatible with 2.31-0ubuntu9.2, thus the crash should not occur. No problems are expected.

[Original Bug Text]
With our new core20 in the beta channel, all snaps seem to be segfaulting. A new glibc recently landed in focal-updates, which might be related.

Steve Langasek (vorlon) wrote :

backtrace needed

Changed in glibc (Ubuntu):
status: New → Incomplete
Ian Johnson (anonymouse67) wrote :

With the test snap that uses core20 as its base, test-snapd-rsync-core20 (installable from the edge channel), I see a segfault when running the snap both on a UC20 system with the core20 snap as a base snap and on my groovy desktop:

https://pastebin.ubuntu.com/p/qbq86DYw5Q/

You can reproduce this with:
```
snap install core20 --beta || snap refresh core20 --beta
snap install test-snapd-rsync-core20 --edge
snap run --experimental-gdbserver test-snapd-rsync-core20.rsync

# ... you will see the gdb command to use to connect to the gdbserver running inside the snap's mount namespace
gdb -ex="target remote :33021" -ex=continue -ex="signal SIGCONT" # in another window
```

My gdb output seems to imply the segfault is coming from 0x000056287d57a2d4 in time@plt ? I haven't been able to load debug symbols for this gdb version into the snap mount namespace yet so I don't have more info, but presumably you could copy debug symbols into the snap's dir somewhere like $HOME/snap/test-snapd-rsync-core20/current/debug.sym and then load it from the gdb shell.

For reference, I've also attached the output of strace too in case that's more useful: https://pastebin.ubuntu.com/p/rpqKrnBHrg/

Changed in glibc (Ubuntu):
status: Incomplete → New
Łukasz Zemczak (sil2100) wrote :

Balint will be looking into it. For now we decided to pull the latest glibc update from focal-updates back to focal-proposed.

Balint Reczey (rbalint) wrote :

Thank you for the bug report.

The update has been reverted. Please downgrade the glibc binary packages to 2.31-0ubuntu9.2 until the new update becomes available.

The problem seems to be caused by the fix for LP: #1914044.

Balint Reczey (rbalint) on 2021-04-28
Changed in glibc (Ubuntu):
assignee: nobody → Balint Reczey (rbalint)
Ioanna Alifieraki (joalif) wrote :

@Balint not sure if you're already aware, but the regression caused by LP: #1914044 may be causing the problem in LP: #1867502.
Earlier today people reported failed deployments with netinstall, autoinstalls, etc., which are now working again (I guess because 2.31-0ubuntu9.3 was pulled out of -updates).

Balint Reczey (rbalint) wrote :

@joalif Thanks, I've marked it as a duplicate of another similar issue with the installer that I've already commented on.
BTW, test-snapd-rsync-core20 does not work on Groovy or Hirsute either, and I'm surprised that no one has reported that yet.

tags: added: regression-update
Balint Reczey (rbalint) wrote :

Strangely I can reproduce the crash in a newly created multipass focal VM with the old glibc (2.31-0ubuntu9.2) from focal-updates.

Balint Reczey (rbalint) wrote :

OK, I've tested it in clean multipass VMs and test-snapd-rsync-core20.rsync does not work on Bionic, Focal and later. Interestingly, it ships a local copy of libc6, which could be the problem; it worked on Focal for some time because the bundled copy happened to match the host's libc6.

...
Reading target:/usr/lib/debug/snap/test-snapd-rsync-core20/11/lib/x86_64-linux-gnu//libc-2.31.so from remote target...

Program received signal SIGSEGV, Segmentation fault.
0x000055eb84b7b2d4 in time@plt ()
(gdb) info sharedlibrary
From                To                  Syms Read   Shared Object Library
0x00007f070255d100  0x00007f070257f7c4  Yes (*)     target:/lib64/ld-linux-x86-64.so.2
                                        No          linux-vdso.so.1
0x00007f07025514f0  0x00007f07025557e8  Yes (*)     target:/snap/test-snapd-rsync-core20/11/usr/lib/x86_64-linux-gnu/libacl.so.1
0x00007f0702543720  0x00007f070254a92d  Yes (*)     target:/snap/test-snapd-rsync-core20/11/usr/lib/x86_64-linux-gnu/libpopt.so.0
0x00007f0702374630  0x00007f07024e908f  Yes (*)     target:/snap/test-snapd-rsync-core20/11/lib/x86_64-linux-gnu/libc.so.6
(*): Shared library is missing debugging information.

For the record, test-snapd-rsync-core18.rsync does not ship an internal libc copy and does work on all releases I tried (Bionic, Focal).
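
The `info sharedlibrary` output above is what gives the bundled copy away: libc.so.6 resolves from the snap's own tree, while the loader comes from the base. A minimal sketch for spotting this in saved gdb output follows. The function name is hypothetical, and the pattern assumes the /snap/<name>/<revision>/ layout seen in that output.

```shell
# Hypothetical helper: read `info sharedlibrary` output on stdin and print
# the path of any libc.so.6 that was loaded from inside a snap's own tree
# (/snap/<name>/<revision>/...). Prints nothing if libc came from elsewhere.
bundled_libc_from_gdb() {
    grep -oE '/snap/[^/ ]+/[^/ ]+/([^ ]*/)?libc\.so\.6' || true
}
```

It could be fed a saved session log, for example `bundled_libc_from_gdb < gdb-session.txt`, where the log file name is only a placeholder.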

Changed in glibc (Ubuntu):
status: New → Invalid
Changed in glibc (Ubuntu Groovy):
status: New → Invalid
Changed in glibc (Ubuntu Hirsute):
status: New → Invalid
Balint Reczey (rbalint) on 2021-05-05
Changed in glibc (Ubuntu):
status: Invalid → New
Changed in glibc (Ubuntu Groovy):
status: Invalid → New
Changed in glibc (Ubuntu Hirsute):
status: Invalid → New
Balint Reczey (rbalint) wrote :

OK, so core20 (1015) bundles libc6 2.31-0ubuntu9.3 which has been removed from updates. Please build a new core20 with libc6 2.31-0ubuntu9.2 which is currently in focal-updates.

Balint Reczey (rbalint) wrote :

Core20 (1026) now works, but I believe shipping libc in test-snapd-rsync-core20, too, is not healthy and will break again when core20's glibc gets upgraded.

Ian Johnson (anonymouse67) wrote :

Sure, I was not aware test-snapd-rsync-core20 was shipping glibc, that is indeed not a good idea.

I went looking on my system for other snaps which experienced the crash, and it seems that every snap that ships glibc in it crashes with the beta channel of core20, but snaps that (properly) do not ship libc6 in them do not crash. For example these other well known snaps ship glibc in them:

* matterhorn
* okular
* htop

and some others that are perhaps less well known. So I think it is unfortunately a bit common to do this even though it is not advisable.

Sergio, do you know why these snaps would have libc6 staged in them? Matterhorn for example does not declare libc6 as a stage-package, yet it is listed as a primed-stage-packages in the manifest.yaml:

```snapcraft.yaml
    stage-packages:
      - libatomic1
      - libsecret-tools
      - libnotify-bin
      - xclip
```

```manifest.yaml
primed-stage-packages:
- libc6=2.31-0ubuntu9.2
```
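
To find other affected snaps on a system, one option is to look for a libc.so.6 inside each installed snap's current revision. This is only a sketch: the function name is hypothetical, and it assumes snaps are mounted under /snap/<name>/current. Base snaps such as core20 legitimately ship libc, so they will be listed too.

```shell
# Hypothetical helper: list snaps under a snap mount root (default /snap)
# whose current revision contains its own libc.so.6.
# The body runs in a subshell so its variables do not leak to the caller.
snaps_bundling_libc() (
    root="${1:-/snap}"
    for dir in "$root"/*/current; do
        # Skip entries without a "current" revision (e.g. /snap/bin).
        [ -d "$dir" ] || continue
        # The trailing slash makes find follow the "current" symlink.
        if find "$dir/" -name 'libc.so.6' 2>/dev/null | grep -q .; then
            basename "$(dirname "$dir")"
        fi
    done
)
```

Running `snaps_bundling_libc` with no argument scans /snap; passing another directory is useful for checking an unpacked tree.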

Sergio Schvezov (sergiusens) wrote :

Hi Ian, thanks for raising this. Those would need a rebuild to be mostly OK. We had a release-time bug which we have since fixed: https://github.com/snapcore/snapcraft/commit/0bf7a2e6619b0037a50caeb49d28788c021d0921

If using Snapcraft 4.6.1, this should no longer be an issue for core20.

Balint Reczey (rbalint) wrote :

@anonymouse67 With glibc removed from the snaps other than core20 they should be working OK with core20 1016 shipping 2.31-0ubuntu9.3. Could you please confirm that?

Ian Johnson (anonymouse67) wrote :

Unfortunately I don't know how to easily remove glibc from the snaps in a way that would confirm that they work, and I don't have time to manually build all of these broken snaps. I tried the basic thing of unpacking the snap, running `rm -rf ./lib/x86_64-linux-gnu/libc-2.31.so ./lib/x86_64-linux-gnu/libc.so.6`, and then repacking and installing these snaps, but they still segfault and fail with:

$ snap run htop
*** stack smashing detected ***: terminated
Aborted (core dumped)

I don't know whether that's because I didn't fully remove all traces of glibc from the snap or because the beta version of core20 (snap revision 1015) is still broken.
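
One way to narrow down the first possibility: the libc6 package ships more than libc.so.6. In glibc 2.31, libpthread, libm, libdl, librt and the dynamic loader are separate files from the same package, so removing only the two libc files can leave a mixed set behind. The sketch below lists such leftovers in an unpacked snap tree. The function name is hypothetical and the file list is illustrative rather than exhaustive; whether such leftovers explain the stack smashing error is not confirmed here.

```shell
# Hypothetical helper: list files in an unpacked snap tree (default: the
# current directory) that normally come from the libc6 package.
# The name list is a partial, illustrative selection.
remaining_glibc_files() (
    tree="${1:-.}"
    find "$tree" \( -name 'libc.so.6' -o -name 'libc-2.*.so' \
        -o -name 'libpthread.so.0' -o -name 'libm.so.6' \
        -o -name 'libdl.so.2' -o -name 'librt.so.1' \
        -o -name 'ld-linux-x86-64.so.2' \) -print | sort
)
```

An empty result for the unpacked directory would suggest that none of the listed glibc files remain in the snap.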

I did try building the matterhorn snap since I found the source for it at https://github.com/popey/matterhorn-snap.git, but that doesn't seem to build at all.

Perhaps Sergio can help confirm if these snaps work if rebuilt without libc6 getting staged into the snap?

Alkis Georgopoulos (alkisg) wrote :

Hi, the libc update + removal caused the following issue:

1) On 2021-04-26, libc version 2.31-0ubuntu9.3 got uploaded to Ubuntu Focal.
2) Many people updated to it.
3) Two days later, on 2021-04-28, it got removed because it was causing the issues described in this bug report.
4) Anyone who updated and now tries to install libc6-dev is told that it is not installable, because it depends on libc6=2.31-0ubuntu9.2 while 2.31-0ubuntu9.3 is installed.

In some cases they can downgrade with `apt install libc6=2.31-0ubuntu9.2`, but that's not always easy when additional dependencies are involved.

I believe a solution would be to re-upload 2.31-0ubuntu9.2 as 2.31-0ubuntu9.4. This would then not break snap, while allowing people to install libc6-dev, as it wouldn't have a lower version than in the archives anymore.
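
The proposal relies on version ordering: a re-upload numbered 2.31-0ubuntu9.4 sorts above the withdrawn 2.31-0ubuntu9.3, so it is offered as a normal upgrade rather than requiring a downgrade. A rough sketch of that ordering follows, using `sort -V` as an approximation. The function name is hypothetical, and on a Debian or Ubuntu system `dpkg --compare-versions` is the authoritative comparison.

```shell
# Hypothetical helper: succeed if version $1 sorts strictly before $2.
# sort -V only approximates Debian version ordering; it is enough for
# these simple version strings but not for every Debian version.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: 9.2 < 9.3 < 9.4, so 9.4 would supersede an installed 9.3.
if version_lt 2.31-0ubuntu9.3 2.31-0ubuntu9.4; then
    echo "2.31-0ubuntu9.4 is newer than 2.31-0ubuntu9.3"
fi
```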

Balint Reczey (rbalint) on 2021-06-01
description: updated
Changed in glibc (Ubuntu Groovy):
status: New → Fix Released
Changed in glibc (Ubuntu Hirsute):
status: New → Fix Released
Changed in glibc (Ubuntu):
status: New → Fix Released
Changed in glibc (Ubuntu Focal):
status: New → In Progress