Ubuntu
mumax3 package

mumax3 test suite fails against glibc 2.38

Bug #2032624 reported by Simon Chopin on 2023-08-22

This bug affects 1 person

Affects	Status	Importance	Assigned to
GLibC	New	Medium	sourceware-bugs #30909
Ubuntu	Fix Released	Undecided	Unassigned
aspectc++ (Debian)	New	Unknown	debbugs #1070443
aspectc++ (Ubuntu)	New	Undecided	Unassigned
cbmc (Debian)	Confirmed	Unknown	debbugs #1070441
cbmc (Ubuntu)	Fix Released	Undecided	Unassigned
cxref (Debian)	Fix Released	Unknown	debbugs #1070444
cxref (Ubuntu)	Fix Released	Undecided	Unassigned
gauche-c-wrapper (Ubuntu)	New	Undecided	Unassigned
glibc (Ubuntu)	Won't Fix	Medium	Unassigned
mumax3 (Ubuntu)	Fix Released	Critical	Unassigned
nvidia-nccl (Ubuntu)	Fix Released	Undecided	Unassigned
pyvkfft (Ubuntu)	Fix Released	Undecided	Unassigned
rocm-hipamd (Debian)	New	Unknown	debbugs #1070446
rocm-hipamd (Ubuntu)	Fix Released	Undecided	Unassigned
stdgpu-contrib (Ubuntu)	New	Undecided	Unassigned

Bug Description

The autopkgtests fail with the following error:

921s nvcc -std=c++03 -ccbin=/usr/bin/cuda-gcc --compiler-options -Werror --compiler-options -Wall -Xptxas -O3 -ptx -arch=compute_50 -code=sm_50 copypadmul2.cu -o copypadmul2_50.ptx
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(30): error: identifier "__Float32x4_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(31): error: identifier "__Float64x2_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(40): error: identifier "__SVFloat32_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(41): error: identifier "__SVFloat64_t" is undefined
922s
922s /usr/include/aarch64-linux-gnu/bits/math-vector.h(42): error: identifier "__SVBool_t" is undefined

Marking as critical as this blocks the glibc transition.

Related branches

~slavik81/ubuntu/+source/glibc:fix-rocm-hipamd-ftbfs

Ready for review for merging into ubuntu/+source/glibc:ubuntu/devel

Graham Inggs (community): Needs Information on 2024-05-06

git-ubuntu import: Pending requested 2024-04-10

Simon Chopin (schopin) on 2023-08-22

Changed in glibc (Ubuntu):
importance:	Undecided → Critical
tags:	added: update-excuse
tags:	added: foundations-todo

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

This commit https://sourceware.org/git/?p=glibc.git;a=commit;h=cd94326a1326c4e3f1ee7a8d0a161cc0bdcaf07e added the file `sysdeps/aarch64/fpu/bits/math-vector.h`.

On a mantic system, the header is installed at /usr/include/aarch64-linux-gnu/bits/math-vector.h. Before the commit, it did only one thing on aarch64:
#include <bits/libm-simd-decl-stubs.h>

After the commit, it also defines a few types, such as:

#if __GNUC_PREREQ(9, 0)
# define __ADVSIMD_VEC_MATH_SUPPORTED
typedef __Float32x4_t __f32x4_t;
typedef __Float64x2_t __f64x2_t;
...

Simply commenting out the new types is enough to fix this issue, but completely removing the newly added support for libmvec is not a great idea.

Perhaps nvidia-cuda-toolkit-gcc needs to be rebuilt with support for these types?
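
To make the version gate quoted above concrete, here is a minimal sketch in C of the comparison that `__GNUC_PREREQ(9, 0)` performs. It assumes the macro compares `(major << 16) + minor` values, as glibc's `features.h` does. The function names `claims_gnuc_prereq` and `advsimd_typedefs_emitted` are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Mirrors the arithmetic of glibc's __GNUC_PREREQ(maj, min), but takes the
   advertised compiler version as arguments so it can be evaluated for any
   front end. Hypothetical helper, not a glibc function. */
int claims_gnuc_prereq(int gnuc, int gnuc_minor, int maj, int min)
{
    assert(gnuc >= 0 && gnuc_minor >= 0 && maj >= 0 && min >= 0);
    return ((gnuc << 16) + gnuc_minor) >= ((maj << 16) + min);
}

/* The gate quoted above: the AdvSIMD typedefs are emitted when the front end
   advertises GCC 9.0 or later. */
int advsimd_typedefs_emitted(int gnuc, int gnuc_minor)
{
    return claims_gnuc_prereq(gnuc, gnuc_minor, 9, 0);
}
```

The gate only looks at the version a front end advertises. For any front end that reports itself as GCC 12, `advsimd_typedefs_emitted(12, 0)` is true, whether or not that front end actually knows `__Float32x4_t`.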

Graham Inggs (ginggs) wrote on 2023-08-23:

The nvidia-cuda-toolkit-gcc package only contains the /usr/bin/cuda-g++ and /usr/bin/cuda-gcc wrappers and has a dependency on the highest supported g++, currently g++-12.

See: https://packages.ubuntu.com/mantic/devel/nvidia-cuda-toolkit-gcc

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

Tried a no-change rebuild of nvidia-cuda-toolkit (https://launchpad.net/~mitchdz/+archive/ubuntu/nvidia-cuda-toolkit-mantic-merge) using the proposed archive and that did not solve the problem.

Mitchell Dzurick (mitchdz) wrote on 2023-08-23:

Ah, I posted my comment right after yours, ginggs. Thanks for the pointer! You're right, on my system cuda-gcc just points to gcc-12:

$ ll $(which /usr/bin/cuda-gcc)
lrwxrwxrwx 1 root root 6 Aug 23 14:17 /usr/bin/cuda-gcc -> gcc-12*

I tried gcc-13 instead, hoping that version would recognize these new types, but __Float32x4_t is still undefined, and several additional types are now undefined as well:

nvcc -std=c++11 -ccbin=/usr/bin/g++-13 --allow-unsupported-compiler --compiler-options -Werror --compiler-options -Wall -Xptxas -O3 -ptx -arch=compute_50 -code=sm_50 copypadmul2.cu -o copypadmul2_50.ptx
...
/usr/include/stdlib.h(147): error: identifier "_Float64" is undefined
/usr/include/stdlib.h(153): error: identifier "_Float128" is undefined
/usr/include/stdlib.h(159): error: identifier "_Float32x" is undefined
/usr/include/stdlib.h(165): error: identifier "_Float64x" is undefined
...

One more note: these particular CUDA code snippets don't actually need these types, so finding a way to avoid including them would work (for example, patching libc6-dev to add another preprocessor guard). I think that's ultimately a bad idea, though, because someone could want a .cu file that uses Arm SIMD extensions alongside the CUDA code.

Graham Inggs (ginggs) wrote on 2023-08-25:

Some similar reports I found (although from some years ago):

https://forums.developer.nvidia.com/t/nvcc-compilation-errors-on-24-2-l4t-platform-tx1/45937

https://github.com/InsightSoftwareConsortium/ITK/issues/1959

"The user space in R23.x is 32-bit. NEON is also from the 32-bit compatibility mode that makes ARMv8 able to execute armhf. The errors tend to imply that some 32-bit compatibility mode library for NEON is missing."

Seems to imply some mismatch between NEON (32-bit) and arm64?

Graham Inggs (ginggs) wrote on 2023-08-25:

We'll ignore this failure and allow glibc to migrate, and that does not preclude further investigation.

Note that mumax3/arm64 is not built in Debian, and did not build in jammy, so we may end up removing the arm64 binary.

Matthias Klose (doko) wrote on 2023-08-28:

Removing packages from mantic:
mumax3 3.10-8 in mantic arm64
Comment: LP: #2032624, remove mumax3 binary on arm64
1 package successfully removed.

Daniel van Vugt (vanvugt) wrote on 2023-09-04:

This seems to be causing bug 2033747 too.

Graham Inggs (ginggs) wrote on 2023-09-20:

This also seems to cause nvidia-nccl to FTBFS on arm64 in the test rebuild

https://people.canonical.com/~ginggs/ftbfs-report/test-rebuild-20230830-mantic-mantic.html

tags:

added: ftbfs

Graham Inggs (ginggs) wrote on 2023-09-20:

#10

cxref also FTBFS on arm64 in the test rebuild

Graham Inggs (ginggs) wrote on 2023-09-20:

#11

Also gauche-c-wrapper, rocm-hipamd and stdgpu-contrib

Heinrich Schuchardt (xypron) wrote on 2023-09-21:

#12

cbmc fails to build from source on arm64 with LTO disabled as reported in LP 2036745:

Failed test: fmod1
CBMC version 5.89.0 (cbmc-5.89.0) 64-bit arm64 linux
Parsing main.c
file /usr/include/aarch64-linux-gnu/bits/math-vector.h line 30: syntax error before '__f32x4_t'
PARSING ERROR

https://launchpadlibrarian.net/688275364/buildlog_ubuntu-mantic-arm64.cbmc_5.89.0-2ubuntu1~ppa1_BUILDING.txt.gz

In Sourceware.org Bugzilla #30909, Simon Chopin (schopin) wrote on 2023-09-27:

#14

The use of vector types such as __Float32x4_t in the aarch64 math-vector.h header breaks quite a few programs that parse C code themselves but use GCC as their preprocessor. The preprocessed output takes the GCC-specific code paths, which use GCC's own intrinsic types, and the consuming programs do not implement those types.

I'm not sure if this qualifies as a bug in glibc, as it seems reasonable to rely on those types, but we've seen this happen in quite a few instances in Ubuntu:

https://bugs.launchpad.net/ubuntu/+source/mumax3/+bug/2032624
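
One way to see which lines trip up such tools is to scan the preprocessed output for the builtin type names from the error log in this report. The following is a rough sketch in C, not part of any of the affected packages. `mentions_aarch64_builtin_type` is a hypothetical helper that does a plain substring match rather than real parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compiler-builtin type names taken from the error log in this report. */
static const char *const aarch64_builtin_types[] = {
    "__Float32x4_t", "__Float64x2_t",
    "__SVFloat32_t", "__SVFloat64_t", "__SVBool_t",
};

/* Hypothetical helper: returns 1 if a line of preprocessed output mentions
   one of the builtin types above, 0 otherwise. */
int mentions_aarch64_builtin_type(const char *line)
{
    size_t n = sizeof aarch64_builtin_types / sizeof aarch64_builtin_types[0];

    assert(line != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strstr(line, aarch64_builtin_types[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Feeding it each line of the preprocessor output would flag the typedefs in math-vector.h that a parser without those builtin types cannot handle.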

Simon Chopin (schopin) wrote on 2023-09-27 (last edit on 2023-09-27):

#13

Reported upstream at https://sourceware.org/bugzilla/show_bug.cgi?id=30909

Bug Watch Updater (bug-watch-updater) on 2023-09-27

Changed in glibc:
importance:	Unknown → Medium
status:	Unknown → New

In Sourceware.org Bugzilla #30909, Simon Chopin (schopin) wrote on 2023-09-27:

#16

I posted a tentative patch adding a way to work around those types at https://sourceware.org/pipermail/libc-alpha/2023-September/151770.html

I'll ship it in my next Ubuntu upload for Mantic as a way to unblock us due to our fairly tight schedule, but I'm hoping we can come up with a better long-term solution.

Simon Chopin (schopin) wrote on 2023-09-27:

#15

I'll be shipping a temporary workaround patch that disables the vec types if __ARM_VEC_MATH_DISABLED is defined. We still need to patch each failure individually to add that flag to the preprocessor step (not at build time but at runtime!), but at least the patching should be easier and quicker than providing proper support for the various vector types.

We shouldn't bother upstreaming those fixes to Debian, as I'm pretty sure the final glibc part of the solution will look fairly different from my current patch, but at least we can get those packages working in the meantime.
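
The shape of the escape hatch described above might look roughly like this. This is a sketch under assumptions, not the actual Ubuntu patch: only the macro name `__ARM_VEC_MATH_DISABLED` comes from this report, while `sketch_f32x4_t`, `SKETCH_VEC_TYPES_ENABLED`, `sketch_vec_types_enabled` and the exact condition are made up for illustration.

```c
#include <assert.h>

/* Consumer side: this would normally be passed as -D__ARM_VEC_MATH_DISABLED
   to the preprocessing step. It is defined inline here only to keep the
   sketch in a single file. */
#define __ARM_VEC_MATH_DISABLED 1

/* Header side: hypothetical stand-in for a patched math-vector.h. The vector
   typedef is only emitted when the opt-out macro is absent. */
#if defined(__aarch64__) && defined(__GNUC__) \
    && !defined(__ARM_VEC_MATH_DISABLED)
typedef __Float32x4_t sketch_f32x4_t;
# define SKETCH_VEC_TYPES_ENABLED 1
#else
# define SKETCH_VEC_TYPES_ENABLED 0
#endif

/* With the opt-out macro defined, the typedef branch is never taken. */
static_assert(SKETCH_VEC_TYPES_ENABLED == 0, "vector types should be disabled");

int sketch_vec_types_enabled(void)
{
    return SKETCH_VEC_TYPES_ENABLED;
}
```

Because the opt-out is a plain macro, each affected tool still has to add it wherever it invokes the preprocessor, which is why every failure needs its own small patch.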

Launchpad Janitor (janitor) wrote on 2023-09-30:

#17

This bug was fixed in the package glibc - 2.38-1ubuntu5

---------------
glibc (2.38-1ubuntu5) mantic; urgency=medium

  * Update from upstream release branche:
    - CVE-2023-4527: Stack read overflow with large TCP responses in
      no-aaaa mode
    - CVE-2023-4806: use after free in getcanonname
    - LP: #2031909: Fix oversized __io_vtables
  * d/p/u/0001-Fix-leak-in-getaddrinfo-introduced-by-the-fix-for-CV:
    Cherry-picked to fix a regression in one of the previous CVE fixes
    (LP: #2037516, CVE-2023-5156)
  * d/p/lp2032624.patch: add an escape hatch in arm64 math-vector.h.
    This should help fixing multiple FTBFS (LP: #2032624)

-- Simon Chopin <email address hidden> Wed, 27 Sep 2023 16:38:18 +0200

Changed in glibc (Ubuntu):
status:	New → Fix Released

Simon Chopin (schopin) wrote on 2023-10-05:

#18

Reopening in glibc as I had some upstream feedback that basically means my workaround is not a good idea. I agree with them, and thus we should drop it, both in upcoming releases and in the upcoming Mantic SRU, to avoid users starting to depend on it, however unlikely that would be.

Changed in glibc (Ubuntu):
importance:	Critical → Medium
status:	Fix Released → In Progress

In Sourceware.org Bugzilla #30909, Connor-baker (connor-baker) wrote on 2023-11-02:

#19

Adding some additional context:

We're running into this issue in Nixpkgs: https://github.com/NixOS/nixpkgs/pull/264599#pullrequestreview-1707381631.

The GLIBC 2.38 update introduces intrinsics for `aarch64-linux` in `math.h`.

NVCC (NVIDIA's CUDA Compiler) declares itself to be the same compiler as its host compiler. This causes inclusion of unsupported `aarch64-linux` intrinsics. NVCC is now unable to compile any CUDA file for `aarch64-linux` because it does not support these intrinsics: https://forums.developer.nvidia.com/t/nvcc-fails-to-build-with-arm-neon-instructions-cpp-vs-cu/248355/2.

I'll be submitting the same patch I've made for Nixpkgs.
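
Because a front end can define `__GNUC__` without being GCC, code that wants to tell them apart has to test the more specific macros first. Below is a small C sketch of that ordering. `__NVCC__` and `__CUDACC__` are macros nvcc defines and `__clang__` is defined by clang. The enum and function names are hypothetical and only serve the illustration.

```c
#include <assert.h>

enum front_end { FE_UNKNOWN, FE_GCC_COMPATIBLE, FE_CLANG, FE_NVCC };

/* Classify the front end compiling this translation unit. The most specific
   macros are tested first, since nvcc and clang also define __GNUC__. */
enum front_end detect_front_end(void)
{
#if defined(__NVCC__) || defined(__CUDACC__)
    return FE_NVCC;
#elif defined(__clang__)
    return FE_CLANG;
#elif defined(__GNUC__)
    return FE_GCC_COMPATIBLE;
#else
    return FE_UNKNOWN;
#endif
}

/* Human-readable name for a front_end value. */
const char *front_end_name(enum front_end fe)
{
    static const char *const names[] = {
        "unknown", "gcc-compatible", "clang", "nvcc",
    };

    assert((int)fe >= 0 && (int)fe <= (int)FE_NVCC);
    return names[fe];
}
```

A check that only asks `defined(__GNUC__)` lands in the GCC branch for all three front ends, which is the situation the header's version gate runs into.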

Launchpad Janitor (janitor) wrote on 2023-11-18:

#20

This bug was fixed in the package glibc - 2.38-3ubuntu1

---------------
glibc (2.38-3ubuntu1) noble; urgency=medium

  * debian/patches/git-updates.diff: update from upstream stable branch
    Dropped changes, superseded by the upstream git updates:
    - debian/patches/CVE-2023-4911.patch: terminate immediately if end of
      input is reached in elf/dl-tunables.c.
    - d/p/u/0001-Fix-leak-in-getaddrinfo-introduced-by-the-fix-for-CV:
      Cherry-picked to fix a regression in one of the previous CVE fixes
  * Merge 2.38-3 from Debian experimental
    Dropped changes, included in Debian:
    - debian/patches/hurd-i386/git-powerpc-longjmp.diff: Fix build after chk
      hidden builtin fix.
  * Drop d/p/lp2032624.patch as advised by upstream.
    Downstream users will have to actually implement those types or stop
    pretending they're GCC. (LP: #2032624)
  * d/p/lp2031495.patch: fix test suite on armhf for -prof variant
    (LP: #2031495)
  * d/control.in/i386: fix math-vector-fortran.h file move (LP: #2039234)

-- Simon Chopin <email address hidden> Mon, 23 Oct 2023 18:54:07 +0200

Changed in glibc (Ubuntu):
status:	In Progress → Fix Released

Cory Bloor (slavik81) wrote on 2024-04-10:

#21

All HIP language libraries have been FTBFS on arm64 since the vector types were added, so this issue has been blocking the libraries from syncing for several months. The only ones that have been able to update have been the ones that had always been broken on arm64 for other reasons.

I've opened a merge request for the glibc package that fixes the issue for rocm-hipamd, using the following patch:

```
--- glibc.orig/sysdeps/aarch64/fpu/bits/math-vector.h
+++ glibc/sysdeps/aarch64/fpu/bits/math-vector.h
@@ -101,7 +101,8 @@ typedef __attribute__ ((__neon_vector_ty
 typedef __attribute__ ((__neon_vector_type__ (2))) double __f64x2_t;
 #endif
 
-#if __GNUC_PREREQ(10, 0) || __glibc_clang_prereq(11, 0)
+#if (__GNUC_PREREQ(10, 0) || __glibc_clang_prereq(11, 0)) \
+ && !defined(__HIP_DEVICE_COMPILE__)
 # define __SVE_VEC_MATH_SUPPORTED
 typedef __SVFloat32_t __sv_f32_t;
 typedef __SVFloat64_t __sv_f64_t;
```

I think the only real alternative would be to remove the arm64 build of rocm-hipamd from the archive. The existing ROCm libraries in noble all FTBFS on arm64, but the versions that successfully built on arm64 with older copies of glibc are blocking transitions from proposed to release.

Simon Chopin (schopin) wrote on 2024-04-12:

#22

I'm much more OK with removing the binary than uploading that patch to glibc.

System headers are just that: headers that reflect the system they're installed in. The fact that you can sometimes get away with using system headers when cross-compiling to a different environment is just an accident, not a feature. Your compiling environment should be providing its own *complete* set of headers that matches the target environment.

Simon Chopin (schopin) wrote on 2024-04-12:

#23

(and of course not include the system headers in its include paths)

Graham Inggs (ginggs) wrote on 2024-04-23:

#24

rocm-hipamd's arm64 binaries were removed in LP: #2061048

Changed in rocm-hipamd (Ubuntu):
status:	New → Fix Released
Changed in glibc (Ubuntu):
status:	Fix Released → Won't Fix

Graham Inggs (ginggs) on 2024-04-23

Changed in mumax3 (Ubuntu):
status:	New → Fix Released

Graham Inggs (ginggs) wrote on 2024-04-23:

#25

aspectc++, cbmc and nvidia-nccl have new versions in noble-proposed which are unable to migrate due to missing builds on arm64.

Please remove the following binaries from noble:

The only reverse-dependency is gloo-cuda, which has never built on arm64.

Łukasz Zemczak (sil2100) wrote on 2024-04-23:

#26

$ remove-package -m "Remove the arm64 binaries to unblock new proposed packages (LP: #2032624)" -s noble -a arm64 -b aspectc++ libpuma-dev cbmc libnccl-dev libnccl2
Removing packages from noble:
aspectc++ 1:2.3+git20230726-1 in noble arm64
libpuma-dev 1:2.3+git20230726-1 in noble arm64
cbmc 5.12-5 in noble arm64
libnccl-dev 2.18.3-1-1 in noble arm64
libnccl2 2.18.3-1-1 in noble arm64
Comment: Remove the arm64 binaries to unblock new proposed packages (LP: #2032624)
Remove [y|N]? y
5 packages successfully removed.

Graham Inggs (ginggs) on 2024-04-23

Changed in cbmc (Ubuntu):
status:	New → Fix Released
affects:	aspectc++ (Ubuntu) → ubuntu
Changed in ubuntu:
status:	New → Fix Released
Changed in nvidia-nccl (Ubuntu):
status:	New → Fix Released

Bug Watch Updater (bug-watch-updater) on 2024-05-05

Changed in cbmc (Debian):
status:	Unknown → New
Changed in aspectc++ (Debian):
status:	Unknown → New
Changed in cxref (Debian):
status:	Unknown → New
Changed in rocm-hipamd (Debian):
status:	Unknown → New

Graham Inggs (ginggs) wrote on 2024-05-05:

#27

pyvkfft seems fixed in 2024.1.2+ds1-1

Changed in pyvkfft (Ubuntu):
status:	New → Fix Released

Bug Watch Updater (bug-watch-updater) on 2024-05-12

Changed in cbmc (Debian):
status:	New → Confirmed

Bug Watch Updater (bug-watch-updater) on 2024-05-19

Changed in cxref (Debian):
status:	New → Fix Released

Graham Inggs (ginggs) wrote on 2024-05-19:

#28

cxref was fixed in version 1.6e-9

Changed in cxref (Ubuntu):
status:	New → Fix Released

Remote bug watches

debbugs #1070441
[forwarded serious ftbfs]
debbugs #1070443
[open serious ftbfs]
debbugs #1070444
[done important ftbfs sid trixie]
debbugs #1070446
[open important ftbfs]
sourceware-bugs #30909
[UNCONFIRMED]
auto-github-insightsoftwareconsortium-itk #1959
[closed type:Bug]
