openvdb 6.2 is non-functional on Focal due to jemalloc

Bug #1882998 reported by Guillaume Doisy on 2020-06-10
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
jemalloc (Debian) | New | Unknown | |
jemalloc (Ubuntu) | | Medium | Unassigned |
  Focal | | Medium | Unassigned |
openvdb (Ubuntu) | | Medium | Unassigned |
  Focal | | Medium | Unassigned |

Bug Description

See here : https://github.com/AcademySoftwareFoundation/openvdb/issues/732
It seems the issue is due to jemalloc.
Using openvdb7.0 the problem disappears, probably thanks to the following:

* d/control: Remove dependency to jemalloc. See #951704 for details

Would it be possible to remove the dependency on jemalloc from the focal version (6.2), or to backport version 7.0 from groovy to focal?

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: libopenvdb-dev 6.2.1-8ubuntu1
ProcVersionSignature: Ubuntu 5.4.0-33.37-generic 5.4.34
Uname: Linux 5.4.0-33-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.2
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Wed Jun 10 19:49:55 2020
InstallationDate: Installed on 2020-05-15 (26 days ago)
InstallationMedia: Ubuntu 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
SourcePackage: openvdb
UpgradeStatus: No upgrade log present (probably fresh install)

Guillaume Doisy (doisyg) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in openvdb (Ubuntu):
status: New → Confirmed
Changed in jemalloc (Ubuntu):
status: New → Confirmed
Changed in jemalloc (Debian):
status: Unknown → New
Richard Viney (richard-nz) wrote :

AFAICT this issue was introduced in OpenVDB 6.2.0 when it started linking in jemalloc all the time, including when being built as a shared library.

My understanding is that shared libraries should absolutely not be doing this, as safely linking allocators such as jemalloc into a shared library is very tricky to do correctly, and can also make it impossible to use the shared library dynamically via `dlopen`.

In my case I ran into this via OpenImageIO which had a dependency on libopenvdb.so.6.2, and I was getting the error "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block" when doing a `dlopen`.

OpenVDB addresses this as of v7.1.0, though that version hasn't actually been released yet. The changelog [1] reads:

    - jemalloc/tbbmalloc are no longer linked into library artifacts of the
      OpenVDB CMake build. The CONCURRENT_MALLOC CMake option now only applies
      to the executables.

My understanding is that this is fixed in the proposed libopenvdb package for groovy which uses OpenVDB v7, but getting that working as a dependency of OpenImageIO on a focal system might be tricky. For my situation I'm intending to build OpenImageIO from source, link it statically, and exclude OpenVDB at build time because I don't actually need it anyway.

Hopefully there's a backport of the fix removing jemalloc to the OpenVDB 6.x series that can become part of an update to focal, or perhaps focal could use the proposed package for groovy.

[1] https://github.com/AcademySoftwareFoundation/openvdb/blob/master/CHANGES

Kai Kasurinen (kai-kasurinen) wrote :

openvdb (7.0.0-1) experimental; urgency=medium
...
  * d/control: Remove dependency to jemalloc. See #951704 for details
...

https://salsa.debian.org/multimedia-team/openvdb/-/commit/ac393d95aa19d29c23a97dca1ace23061ebe5c17

Kai Kasurinen (kai-kasurinen) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. However, I am closing it because the bug has been fixed in the latest development version of Ubuntu.

If you need a fix for the bug in previous versions of Ubuntu, please perform as much as possible of the SRU Procedure [1] to bring the need to a developer's attention.

[1]: https://wiki.ubuntu.com/StableReleaseUpdates#Procedure

Changed in openvdb (Ubuntu):
status: Confirmed → Fix Released
Richard Viney (richard-nz) wrote :

Thanks.

In my opinion this could warrant fixing in 20.04 LTS because:

- it's a regression from 18.04 LTS, so certain applications and usage patterns
  will break as users try to upgrade (which is what happened to me).
- it affects several packages, specifically: libopenimageio, libopenvdb and
  python3-openvdb. libopenimageio is widely used, and python3-openvdb is
  currently not able to be imported because of this issue. Any other libraries
  that depend on openvdb will also be affected.
- there's no need to upgrade to openvdb 7.x, and no code changes are required
  to the current 6.2.1 release either. It needs to be compiled with
  -DCONCURRENT_MALLOC=Tbbmalloc added to the CMake command line, and there's a
  patch available for this small change already.
- this wouldn't be a high-risk update.

I ended up working around my problem by setting the following environment
variable that causes jemalloc to be loaded at process start, so it doesn't try
to load itself later on in response to a dlopen() call.

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

I've never worked directly with Ubuntu packages, but will take a look at the
SRU Procedure and see if I can put something together.

Richard Viney (richard-nz) wrote :

[Impact]

 * This issue causes apps to be unable to dlopen libopenvdb6.2 or any shared libraries that depend directly or indirectly on libopenvdb6.2. Notably this includes libopenimageio2.1 and anything that depends on it. The python3-openvdb package also doesn't work in focal at present because of this issue.

 * This issue is a good candidate for backporting to focal because (1) the fix is a very modest change to the build of the libopenvdb6.2 package; (2) this is a regression from the previous stable release of Ubuntu (bionic); (3) several packages are affected by it.

 * The change already in groovy fixes this issue by not building libopenvdb with jemalloc.

[Test Case]

 * `apt install python3-openvdb && python3 -c "import pyopenvdb"` should succeed, but because of this issue it currently fails with the error "cannot allocate memory in static TLS block".

 * Also, the following C program should print a non-nil value to stdout:

```c
#include <dlfcn.h>
#include <stdio.h>

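int main(void) {
  /* dlopen() returns NULL on failure, e.g. the static TLS error above. */
  void *handle = dlopen("/usr/lib/x86_64-linux-gnu/libopenvdb.so.6.2", RTLD_NOW);
  printf("%p\n", handle);
  return handle == NULL;
}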
```
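
To see the loader's own error message rather than just a null pointer, and to try other affected libraries such as libopenimageio, a variant of the program above could look roughly like this. It is a sketch, not part of the proposed fix: the helper name `try_dlopen` is invented for illustration, and on older glibc versions it needs to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 0 if the library at `path` could be
 * loaded, 1 otherwise. On failure the dlerror() text is printed, which
 * is where "cannot allocate memory in static TLS block" would show up. */
int try_dlopen(const char *path) {
  dlerror(); /* clear any stale error state */
  void *handle = dlopen(path, RTLD_NOW);
  if (handle == NULL) {
    const char *msg = dlerror();
    fprintf(stderr, "dlopen(%s) failed: %s\n", path, msg ? msg : "unknown error");
    return 1;
  }
  printf("dlopen(%s) ok: %p\n", path, handle);
  dlclose(handle);
  return 0;
}
```

Calling `try_dlopen("/usr/lib/x86_64-linux-gnu/libopenvdb.so.6.2")` from `main()` should return 1 and print the TLS error on an affected focal system, and return 0 once the library is no longer linked against jemalloc.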

[Regression Potential]

 * Regressions are unlikely because the only change is to the choice of allocator for libopenvdb6.2 when configuring the build with CMake.

 * There is no need for a patch to the libopenvdb source code; only the build configuration changes.

 * The patch applied in groovy is https://salsa.debian.org/multimedia-team/openvdb/-/commit/ac393d95aa19d29c23a97dca1ace23061ebe5c17

[Other Info]

 * There are workarounds for this issue, but each has its own drawbacks and may not be possible in every situation where this issue could occur.

Richard Viney (richard-nz) wrote :

Please find attached a debdiff for focal

Mathew Hodson (mhodson) on 2020-09-25
Changed in jemalloc (Ubuntu):
importance: Undecided → Medium
Changed in jemalloc (Ubuntu Focal):
importance: Undecided → Medium
Changed in openvdb (Ubuntu):
importance: Undecided → Medium
Changed in openvdb (Ubuntu Focal):
importance: Undecided → Medium