Comment 0 for bug 2007993

Cory Bloor (slavik81) wrote :

# System Information:
Description: Ubuntu 22.04.2 LTS
Release: 22.04

# Package Version:
libhsa-runtime64-1:
  Installed: 5.0.0-1
  Source: rocr-runtime

# What was done:

    # on Ubuntu 22.04 or 22.10 with an AMD GPU installed
    apt install rocminfo kmod
    rocminfo

# What was seen:

    ROCk module is loaded
    Segmentation fault (core dumped)

Note that the rocminfo utility does not try to initialize libhsa-runtime64 unless an AMD GPU is installed, so an AMD GPU is required to reproduce this problem.

After some debugging, I concluded that this is a null pointer dereference in libhsa-runtime64. The order of static initialization differs between the Debian build and the Ubuntu build of the rocr-runtime package, so the package works on Debian but crashes when it is rebuilt for Ubuntu. A couple of static variables are copied before they are initialized, which leads to a null pointer dereference later in the program.
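
For readers unfamiliar with this class of bug, here is a minimal C++ sketch of the general pattern and one common way to avoid it (construct on first use). It is an illustration only, not the Debian patch and not code from rocr-runtime; `Table`, `default_table`, `cached_table`, and `cached_label` are hypothetical names chosen for the example.

```cpp
#include <string>

// Hypothetical type and names for illustration; none come from rocr-runtime.
struct Table {
    std::string label;
};

// Construct-on-first-use: a function-local static is initialized the first
// time control passes through its declaration, so the pointer returned here
// is never null, even when it is requested while another static is still
// being initialized.
const Table* default_table() {
    static const Table table{"default"};
    return &table;
}

// This namespace-scope static copies the pointer during its own dynamic
// initialization. If the source were instead a dynamically initialized
// namespace-scope pointer defined in another translation unit, the copy
// could run first and capture the zero-initialized (null) value.
static const Table* const cached_table = default_table();

// Dereferences the cached copy; safe because the copy was taken from an
// object that was already constructed.
const std::string& cached_label() {
    return cached_table->label;
}
```

The order of dynamic initialization across translation units is unspecified in C++, which is consistent with the same source behaving differently under two distributions' toolchains. Routing access through a function removes the dependence on that order.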

# What was expected:
rocminfo should not crash.

# Debian Bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1031089

# Debian Patch:
https://salsa.debian.org/rocm-team/rocr-runtime/-/blob/debian/5.2.3-3/debian/patches/0003-fix-static-initialization-order.patch

The patch applied to the Debian package has fixed this bug in Ubuntu 23.04. It would be great if the fix could also be applied to Ubuntu 22.04 LTS. There's not a lot of ROCm functionality in Jammy, but fixing this bug would at least get the basics like rocminfo working.