libhwloc.so will segmentation fault when attempting to modify static string in environ

Bug #1968742 reported by Mark A. Grondona
This bug affects 3 people
Affects          Status         Importance   Assigned to       Milestone
hwloc (Ubuntu)   Fix Released   Undecided    Unassigned
  Jammy          Fix Released   Undecided    Matthew Ruffell
  Kinetic        Fix Released   Undecided    Unassigned

Bug Description

[Impact]

hwloc sets the ZES_ENABLE_SYSMAN environment variable using putenv(). Unlike setenv(), putenv() does not copy its argument: the pointer to the string itself is placed in environ. Here that string is a static string inside libhwloc.so, so any application that pulls in libhwloc.so and then attempts to modify environ will crash with a segmentation fault. The fix is straightforward: use setenv(), which copies the string, instead of putenv().
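
The difference between the two calls can be illustrated with a minimal sketch. It assumes a POSIX system; the helper names (value_points_into, demo_putenv_aliases, demo_setenv_copies) and the DEMO_* variable names are made up for illustration and are not part of hwloc. The sketch checks where the pointer returned by getenv() lands: inside the caller's buffer after putenv(), and outside it after setenv().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: 1 if p points inside buf[0..len), else 0. */
int value_points_into(const char *p, const char *buf, size_t len)
{
        uintptr_t a = (uintptr_t)p, lo = (uintptr_t)buf;

        assert(buf != NULL);
        return p != NULL && a >= lo && a < lo + len;
}

/* putenv() places the caller's string itself in environ, so the value
 * returned by getenv() lives inside our buffer. Returns 1 when it does. */
int demo_putenv_aliases(void)
{
        /* Static storage: the entry must stay valid while environ uses it. */
        static char entry[] = "DEMO_PUTENV_VAR=1";

        if (putenv(entry) != 0)
                return -1;
        return value_points_into(getenv("DEMO_PUTENV_VAR"), entry, sizeof entry);
}

/* setenv() copies name and value, so getenv() returns a pointer that is
 * independent of our buffer. Returns 0 when the value is a separate copy. */
int demo_setenv_copies(void)
{
        char value[] = "1";

        if (setenv("DEMO_SETENV_VAR", value, 1) != 0)
                return -1;
        return value_points_into(getenv("DEMO_SETENV_VAR"), value, sizeof value);
}
```

On a POSIX-conforming libc, demo_putenv_aliases() is expected to return 1 and demo_setenv_copies() to return 0. When the buffer handed to putenv() belongs to a shared library, unloading that library leaves environ holding a pointer into memory that may no longer be mapped.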

A workaround is to export ZES_ENABLE_SYSMAN=1 before running processes that use libhwloc.
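
Why exporting the variable helps can be sketched as follows, under the assumption (not confirmed in this report) that the library only sets ZES_ENABLE_SYSMAN when it is not already present in the environment. ensure_env_default is a hypothetical helper written for this illustration, not hwloc's code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: give `name` a default value only when the user has
 * not already set it. Returns 1 if the default was installed, 0 if the
 * variable was already present (environ is left untouched), -1 on error. */
int ensure_env_default(const char *name, const char *value)
{
        assert(name != NULL && value != NULL);

        if (getenv(name) != NULL)
                return 0;       /* pre-exported by the user: nothing to do */

        /* setenv() copies the strings, so no pointer owned by the caller
         * (or by a shared library) ends up in environ. */
        return setenv(name, value, 0) == 0 ? 1 : -1;
}
```

With the variable exported beforehand, a guard of this shape takes the early return and never adds an entry of its own to environ.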

[Testcase]

A minimal reproducer is readily available for this issue.

$ sudo apt install hwloc build-essential

Place the following C program into reproducer.c

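#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
        /* Loading the library is enough; no hwloc function is called. */
        void *lib = dlopen("libhwloc.so.15", RTLD_NOW);
        printf("dlopen: %p\n", lib);
        if (lib == NULL) {
                fprintf(stderr, "dlopen failed: %s\n", dlerror());
                return 1;
        }

        /* Look up a variable that is not set, so getenv() scans every
         * entry in environ. */
        printf("getenv: %p\n", getenv("GNUTLS_NO_IMPLICIT_INIT"));

        printf("dlclose: %d\n", dlclose(lib));

        /* On affected versions, this lookup after dlclose() is where the
         * segmentation fault occurs. */
        printf("getenv: %p\n", getenv("GNUTLS_NO_IMPLICIT_INIT"));
        return 0;
}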

$ gcc -o reproducer reproducer.c

On a system affected by the issue, running the reproducer ends in a segmentation fault:

$ ./reproducer
dlopen: 0x561a33e2f2d0
getenv: (nil)
dlclose: 0
Segmentation fault (core dumped)

Test packages are available in the following ppa:

https://launchpad.net/~mruffell/+archive/ubuntu/sf339146-test

If you install the test package, the output you should see is below, with no segmentation faults:

$ ./reproducer
dlopen: 0x5589ef8572d0
getenv: (nil)
dlclose: 0
getenv: (nil)

[Where problems could occur]

The change is small, as we swap from putenv to setenv to set a single environment variable, ZES_ENABLE_SYSMAN.

If a regression were to occur, affected users would be limited to those whose applications dynamically link against libhwloc.so (libhwloc.so.15).

A workaround for a regression would be to export ZES_ENABLE_SYSMAN=1 before running processes that use libhwloc.

[Other info]

Minimal reproducer and analysis are from:
https://bugs.schedmd.com/show_bug.cgi?id=14276

Upstream Bug report:
https://github.com/open-mpi/hwloc/pull/514

Commit which fixed the issue:

commit 91b9e44910f4fe4fb420a4064a646e2247c6de0e 2.7
From: Joshua Hursey <email address hidden>
Date: Sat, 29 Jan 2022 11:08:30 +0100
Subject: core+levelzero: Set ZES_ENABLE_SYSMAN via setenv instead of putenv
Link: https://github.com/open-mpi/hwloc/commit/91b9e44910f4fe4fb420a4064a646e2247c6de0e

Felix Abecassis (flx42) wrote :

+1 for this, I faced this issue while running the Slurm workload manager:
https://bugs.schedmd.com/show_bug.cgi?id=14276

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in hwloc (Ubuntu):
status: New → Confirmed
Mark A. Grondona (mgrondona) wrote :

FYI: A workaround for this issue is to export ZES_ENABLE_SYSMAN=1 before running processes that use libhwloc.

Felix Abecassis (flx42) wrote :

Yup, I mentioned it here: https://github.com/open-mpi/hwloc/pull/514#issuecomment-1152675130
This will be our workaround too, until the upgrade to hwloc 2.7.1

Changed in hwloc (Ubuntu Jammy):
status: New → Fix Released
Changed in hwloc (Ubuntu Kinetic):
status: Confirmed → In Progress
importance: Undecided → Medium
assignee: nobody → Matthew Ruffell (mruffell)
tags: added: sts
summary: - hwloc 2.7.0 putenv of static string
+ libhwloc.so will segmentation fault when attempting to modify static
+ string in environ
description: updated
Matthew Ruffell (mruffell) wrote :

Attached is a debdiff for hwloc on Jammy which fixes this issue.

description: updated
tags: added: sts-sponsor
Changed in hwloc (Ubuntu Kinetic):
status: In Progress → Fix Released
Changed in hwloc (Ubuntu Jammy):
status: Fix Released → In Progress
Changed in hwloc (Ubuntu Kinetic):
importance: Medium → Undecided
assignee: Matthew Ruffell (mruffell) → nobody
Changed in hwloc (Ubuntu Jammy):
assignee: nobody → Matthew Ruffell (mruffell)
Dan Streetman (ddstreet) wrote :

uploaded to jammy queue, thanks!

tags: added: sts-sponsor-ddstreet
Dan Streetman (ddstreet) wrote :

Matthew, the patch looked fine to me so I uploaded it, but I just had a quick look at the wider package code and it looks like putenv() is used in many other places. Do *all* of those need to be replaced with setenv(), or is there something special about this particular env var that needed fixing while the other uses of putenv() are ok?

Matthew Ruffell (mruffell) wrote :

Hi Dan,

We don't need to change all instances of putenv() to setenv(); only the one we are changing needs it. It is the only putenv() call in libhwloc itself. All other instances are in tests and utils / tools.

As per the maintainer in https://github.com/open-mpi/hwloc/pull/514#issuecomment-1024642432:

>> You might want to scrub the HWLOC code and replace all instances of "putenv" with "setenv".
> Don't worry, there are no putenv in libhwloc except this new one for L0. Others are in tools and tests, where putenv() is easier because it works on Windows.

We will stick with upstream's decision for this bug.

Thanks,
Matthew

Dan Streetman (ddstreet) wrote :

Sounds good thanks!

Robie Basak (racb) wrote : Please test proposed package

Hello Mark, or anyone else affected,

Accepted hwloc into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/hwloc/2.7.0-2ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in hwloc (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Matthew Ruffell (mruffell) wrote :

Performing verification on Jammy.

I installed hwloc 2.7.0-2 from -updates, along with build-essential, and compiled the minimal reproducer C program from the testcase.

Upon running it, we get a segmentation fault:

$ ./reproducer
dlopen: 0x557f2027b2d0
getenv: (nil)
dlclose: 0
Segmentation fault (core dumped)

I then enabled -proposed, and installed hwloc 2.7.0-2ubuntu1. Running the same reproducer:

$ ./reproducer
dlopen: 0x56423a95c2d0
getenv: (nil)
dlclose: 0
getenv: (nil)

No more segmentation faults. Additionally, a user has tried this out in production and found it fixes their workloads that rely on hwloc libraries.

Happy to mark as verified for Jammy.

tags: added: verification-done-jammy
removed: verification-needed verification-needed-jammy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hwloc - 2.7.0-2ubuntu1

---------------
hwloc (2.7.0-2ubuntu1) jammy; urgency=medium

  * Set ZES_ENABLE_SYSMAN with setenv instead of putenv. When using
    putenv, a constant string was added to environ, and attempting to
    manipulate environ would result in a segmentation fault. (LP: #1968742)
    - d/p/lp1968742-core-levelzero-Set-ZES_ENABLE_SYSMAN-via-setenv.patch

 -- Matthew Ruffell <email address hidden> Thu, 16 Jun 2022 15:43:13 +1200

Changed in hwloc (Ubuntu Jammy):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for hwloc has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
