dotnet build intermittently crashes with segfault on Ubuntu 18.04

Bug #1983100 reported by Nicolas Bock
20
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Ubuntu Pro
Status tracked in 18.04
18.04
Fix Released
Medium
Tom Moyer
openssl (Ubuntu)
Fix Released
Undecided
Unassigned
Bionic
Won't Fix
Undecided
Unassigned

Bug Description

[Impact]

Bionic's OpenSSL 1.1.1 package (https://launchpad.net/ubuntu/bionic/+source/openssl) is the only version of openssl 1.1.1 on any distro that we've encountered that does not have support for the OPENSSL_NO_ATEXIT functionality from 1.1.1b (openssl/openssl@c2b3db2).

The threading model in .NET has the possibility that background threads are still running when exit() is called, which can cause SIGSEGV if a background thread interacts with OpenSSL after/while it has unloaded. For that reason, we always initialize OpenSSL 1.1.1 with the OPENSSL_NO_ATEXIT flag (which, of all the distros we run on only has no effect on Bionic).

We feel that the stability of applications on Ubuntu 18.04 would be improved if the functionality of OPENSSL_NO_ATEXIT was merged into the bionic openssl 1.1.1 package, even if the constant isn't published into the header for the dev package.

Context: https://github.com/dotnet/runtime/issues/48411#issuecomment-1178405101

[Test Plan]

The described behavior can be reproduced by passing the OPENSSL_NO_ATEXIT to the OPENSSL_init_ssl() call. The application will terminate with a SEGFAULT. More concretely, a minimal reproducer is:

#include <stdio.h>
#include <openssl/err.h>
#include <openssl/ssl.h>

#ifndef OPENSSL_INIT_NO_ATEXIT
#define OPENSSL_INIT_NO_ATEXIT 0x00080000L
#endif

static void print_error_string()
{
    printf("print_error_string:\n");
    printf("ERR_reason_error_string(0) => %s\n", ERR_reason_error_string(0));
}

int main(int argc, char* argv[])
{
    // register this handler first, so it runs last.
    atexit(print_error_string);

    OPENSSL_init_ssl(
            OPENSSL_INIT_ADD_ALL_CIPHERS |
            OPENSSL_INIT_ADD_ALL_DIGESTS |
            OPENSSL_INIT_LOAD_CONFIG |
            OPENSSL_INIT_NO_ATEXIT |
            OPENSSL_INIT_LOAD_CRYPTO_STRINGS |
            OPENSSL_INIT_LOAD_SSL_STRINGS,
        NULL);

    print_error_string();

    return 0;
}

Building

$ sudo apt install libssl-dev
$ gcc test.c -lssl -lcrypto
$ ./a.out
print_error_string:
ERR_reason_error_string(0) => (null)
print_error_string:
Segmentation fault (core dumped)

[Other Info]
All of these patches are included in upstream release 1.1.1b
- lp1983100-0001-Fix-shlibloadtest-to-properly-execute-the-dso_ref-te.patch
  Fixes the shlibloadtest that was updated as part of #0005

- lp1983100-0002-Implement-OPENSSL_INIT_NO_ATEXIT.patch
  Patch adds the OPENSSL_INIT_NO_ATEXIT option

- lp1983100-0003-Don-t-link-shlibloadtest-against-libcrypto.patch
  Additional fixes for shlibloadtest

- lp1983100-0004-Fix-rpath-related-Linux-test_shlibload-failure.patch
  Additional fixes for shlibloadtest

- lp1983100-0005-Test-atexit-handlers.patch
  Adds test for OPENSSL_INIT_NO_ATEXIT option and updates the shlibloadtest

- lp1983100-0006-Introduce-a-no-pinshared-option.patch
  This patch includes tests to ensure that if OPENSSL_INIT_NO_ATEXIT is not defined then the atexit() handler is run

- lp1983100-0007-Support-_onexit-in-preference-to-atexit-on-Windows.patch
  This patch ensures that atexit() is only called when on non-Windows systems as Windows uses _onexit() during library unloading

All seven patches are required to ensure the correct logic and operation of the OPENSSL_INIT_NO_ATEXIT option.

[Where problems could occur]

The patches adds an option to the OPENSSL_init_crypto() function to disable the default behavior of calling of a cleanup function on application exit. The patch also includes a few bug fixes around various initializations that were supposed to happen once when running threaded but were not.

These changes have the potential for regressions and it is conceivable that they lead to incorrect behavior. However, I have also backported and included all new testing functions in the hope that the changed behavior will get appropriate testing.

Revision history for this message
Nicolas Bock (nicolasbock) wrote :
description: updated
description: updated
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Nicolas and I talked about a brief review of the debdiff,
and he kindly agreed to provide an update/v2 to address:
- changelog version and description line
- patches dep3 headers and lpnumber- prefix

... while patches are reviewed for feedback (not yet done).

Thanks, Nicolas!

tags: added: se-sponsor-halves
Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

Hi Nicolas,

Thank you for the patches! I've taken a quick look at the contents, in hopes of giving you some more feedback for the v2 (in addition to Mauricio's already suggested changes).

Going through the list, I noticed that there are quite a few patches for a single bug fix, and that some of them have a substantial diffstat. There's an impression that these patches are more targeted towards including new features rather than fixing specific bugs, given their complexity and level of
changes. From the description, I understand most of these changes are probably required in order to fix faulty behavior, so we should try to get more information about each patch and what they intend to fix.

Without further context, it's not straightforward to understand why some of the patches are necessary. As an example, patch 6/7 seems to introduce a new option related to memory pinning, and patch 7/7 seems to be a Windows-specific change. It would be very helpful if you could describe your reasoning for each patch in more detail (e.g. with the "[Other Info]" section of the SRU Bug
Template), something like below:
- patch 0001 is a hard dependency of OPENSSL_INIT_NO_ATEXIT
- patch 0002 implements OPENSSL_INIT_NO_ATEXIT
- patch 0003-0004 add OPENSSL_INIT_NO_ATEXIT to the openssl test suite
...
This would make it easier to understand why each patch included, and help deciding on whether they would be appropriate for the SRU scenario.

In addition to that, it would be great to have more concrete examples in the "[Where problems could occur]" section. You mentioned that the patches could lead to incorrect behavior, so having examples of that would be helpful in spotting them once the changes go through. For instance, are these problems expected to show up in openssl itself? Would an external library or program
using openssl start exhibiting different behavior or crashes due to these changes?

I don't want to come off as excessive, but given that openssl is a somewhat 'hot' package with potentially big impact on systems, having good justification and proper context on the patches would be very beneficial when evaluating the changes.

Thanks again for the help on fixing this bug, and let us know if you have any questions!

Changed in openssl (Ubuntu Bionic):
status: New → Incomplete
Changed in openssl (Ubuntu):
status: New → Fix Released
Changed in openssl (Ubuntu Bionic):
importance: Undecided → Medium
assignee: nobody → Nicolas Bock (nicolasbock)
Tom Moyer (tom-tom)
Changed in openssl (Ubuntu Bionic):
assignee: Nicolas Bock (nicolasbock) → Tom Moyer (tom-tom)
Revision history for this message
Tom Moyer (tom-tom) wrote :

This updated debdiff (openssl-atexit-v2.debdiff) includes the DEP-3 headers, cleans up the patch names by prepending lp1983100- to each patch, and is rebased on the latest release in bionic-security.

description: updated
Changed in openssl (Ubuntu Bionic):
status: Incomplete → In Progress
Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

Thanks for the revised debdiff, Tom! I appreciate that you added a short description of each patch to the bug, it's really helpful!

We'll need to adjust a few things before proceeding, a few comments below.

On the debian changelog:
- Since we have many patch files, it would be good to pull the descriptive entry to the top, and list the files underneath
- We should include the LP bug number in the new changelog entry. This should be in the format "(LP: #1983100)", preferably at the end of your topmost entry

On the DEP-3 headers:
- let's point the Origin tags to the commit id of each change, so that a reviewer will be able to find the original commits in a local repo (without going through a github pull request), e.g.:
Origin: upstream, https://github.com/openssl/openssl/commit/56806f432b6c
- for the Bug-Ubuntu tag, we should use the short-link version:
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1983100

On the patches themselves:

I've noticed that patch 4/7 isn't from the same PR series as the rest (it's from PR 7626 [0]), so we should correct the Origin header in that patch file. PR 7626 also has another commit that doesn't seem to be included in the debdiff, did you run tests to ensure it isn't needed for them to work properly?
[0] https://github.com/openssl/openssl/pull/7626

I'm also not fully convinced patch 7/7 is required for this SRU, as it's specifically targeted to Windows systems. I don't think our builds define _WIN32, so I'd imagine the patch would only introduce dead code. Could we drop it from the next version?

Finally, I'd like to double check the patches that are focused on shlibloadtest (patches 1, 3-7). I understand some of them aren't related to the new NO_ATEXIT change, but are fixing the underlying test that's used by that specific patch. Does this mean that shlibloadtest is currently broken for bionic? Or are these patches needed for the additional tests introduced by patch 5?

Ultimately, I wonder if it would be appropriate to e.g. spin the shlibloadtest fixes into a separate LP bug, or if the NO_ATEXIT tests will just outright break without them. Could you please confirm whether this is the case?

Thanks again for all the work on this, Tom! Given we're touching a critical component (openssl) of an LTS release that's soon to move out of standard support, we'll have to be a bit more strict than usual and this SRU is likely to receive extra attention, so thanks for being understanding about it.

Cheers,
Heitor

Changed in openssl (Ubuntu Bionic):
status: In Progress → Incomplete
Tom Moyer (tom-tom)
Changed in ubuntu-pro:
assignee: nobody → Tom Moyer (tom-tom)
Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

Marking Bionic as "Won't Fix", as it's out of standard support.

This issue is now being tracked for Ubuntu Pro 18.04.

Changed in openssl (Ubuntu Bionic):
status: Incomplete → Won't Fix
Revision history for this message
Heitor Alves de Siqueira (halves) wrote :

This bug was fixed in the package openssl - 1.1.1-1ubuntu2.1~18.04.23+esm1

openssl (1.1.1-1ubuntu2.1~18.04.23+esm1) bionic; urgency=medium

  * Include support for OPENSSL_NO_ATEXIT functionality introduced in
    OpenSSL 1.1.1b which prevents OpenSSL from being cleaned up when exit() is
    called. This prevents .NET applications from segfaulting
    - d/p/lp1983100-0001-Implement-OPENSSL_INIT_NO_ATEXIT.patch
    (LP: #1983100)

 -- Tom Moyer <email address hidden> Wed, 05 Jul 2023 16:10:39 +0000

Changed in openssl (Ubuntu Bionic):
importance: Medium → Undecided
assignee: Tom Moyer (tom-tom) → nobody
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.