dgesdd gets stuck for size 40+

Bug #1875181 reported by Dirk Toewe
This bug affects 2 people
Affects          Status     Importance  Assigned to  Milestone
lapack (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

On Ubuntu 20.04 (Focal Fossa), the following code:

https://pastebin.com/A0bTqAAJ

gets stuck in the LAPACKE_dgesdd call when using liblapacke-dev (3.9.0-1build1). It seems to get stuck for every matrix size N >= 40. I have tried the following variations, none of which made a difference:

  * Different compilers (gcc-9, gcc-10, clang-10)
  * Linking to different BLAS implementations (libopenblas-dev, libblas-dev)
  * With and without various optimization flags (-ffast-math, -march=native, -O3)

The CPU is an AMD Ryzen9 3900X (in case that's relevant).

I then built LAPACK(E) 3.9.0 directly from source, using cmake and the "make.inc.example" file included in the LAPACK source tree. That self-compiled version works just fine!

Maybe the compiler used for the Ubuntu package performs a bad optimization?

My C/C++ (let alone Fortran) knowledge is quite limited, which means I am going to need some help if I am to provide further debug information.
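
One way to gather more information about which sizes hang, without the test program itself getting stuck, is to run each attempt in a child process under a time limit. The following C sketch is not the code from the pastebin above. The names run_with_timeout, attempt_fn and fake_dgesdd are hypothetical, and fake_dgesdd is only a stand-in that imitates the reported behaviour (it returns for N < 40 and spins on sched_yield otherwise). In a real test the callback would build an N x N matrix and call LAPACKE_dgesdd instead.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: performs one attempt for matrix size n. */
typedef void (*attempt_fn)(int n);

/* Runs fn(n) in a child process with a time limit.
 * Returns  0 if the child exited normally within timeout_s seconds,
 *          1 if the child was killed by the alarm (it got stuck),
 *         -1 on any other failure. */
int run_with_timeout(attempt_fn fn, int n, unsigned timeout_s)
{
    assert(fn != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure SIGALRM terminates the process, then arm it. */
        signal(SIGALRM, SIG_DFL);
        alarm(timeout_s);
        fn(n);
        _exit(0);
    }

    /* Parent: wait for the child and classify how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Stand-in for the real call (assumption, for illustration only):
 * returns immediately for n < 40 and busy-waits forever otherwise. */
void fake_dgesdd(int n)
{
    if (n >= 40) {
        for (;;)
            sched_yield();
    }
}
```

Calling run_with_timeout(fake_dgesdd, n, 5) for increasing n and printing every n for which it returns 1 would show where the threshold lies. A child process is used so that a stuck call can be killed without taking the test driver down with it.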

Dirk Toewe (dtuo) wrote :

Here's the GDB stack trace:

#0 0x00007ffff3aec70b in sched_yield () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007ffff4d749a5 in exec_blas_async_wait () from /lib/x86_64-linux-gnu/liblapack.so.3
#2 0x00007ffff4d74a7c in exec_blas () from /lib/x86_64-linux-gnu/liblapack.so.3
#3 0x00007ffff4c42392 in dtrmv_thread_NUN () from /lib/x86_64-linux-gnu/liblapack.so.3
#4 0x00007ffff4c19c7a in dtrmv_ () from /lib/x86_64-linux-gnu/liblapack.so.3
#5 0x00007ffff486e74b in dlarft_ () from /lib/x86_64-linux-gnu/liblapack.so.3
#6 0x00007ffff48aa758 in dormqr_ () from /lib/x86_64-linux-gnu/liblapack.so.3
#7 0x00007ffff48a83a6 in dormbr_ () from /lib/x86_64-linux-gnu/liblapack.so.3
#8 0x00007ffff47ff9a6 in dgesdd_ () from /lib/x86_64-linux-gnu/liblapack.so.3
#9 0x00007ffff43eb179 in LAPACKE_dgesdd_work () from /lib/x86_64-linux-gnu/liblapacke.so.3
#10 0x00007ffff43eac28 in LAPACKE_dgesdd () from /lib/x86_64-linux-gnu/liblapacke.so.3
#11 0x00000000004c78e5 in main () at src/lapacke_error.cpp:40

Christophe (chatelai) wrote :

I experienced the same problem with the routine LAPACKE_dsyevd on a 6x6 matrix. I tried to check the eigenvalues with GNU Octave, and it got stuck as well! The code was working a few days ago on Ubuntu 19.10, before the upgrade to 20.04.

The backtrace is:
#0 0x00007ffff599f70b in sched_yield () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007ffff614cfa5 in exec_blas_async_wait () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#2 0x00007ffff614d07c in exec_blas () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#3 0x00007ffff5f693cc in dsymv_thread_U () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#4 0x00007ffff5f251ae in dsymv_ () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#5 0x00007ffff7ad09f5 in dsytd2_ () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#6 0x00007ffff7ad536d in dsytrd_ () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#7 0x00007ffff7ac66cc in dsyevd_ () from /usr/lib/x86_64-linux-gnu/libopenblas.so.0
#8 0x00007ffff5cd76e0 in LAPACKE_dsyevd_work () from /usr/lib/x86_64-linux-gnu/liblapacke.so.3
#9 0x00007ffff5cd70b7 in LAPACKE_dsyevd () from /usr/lib/x86_64-linux-gnu/liblapacke.so.3

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lapack (Ubuntu):
status: New → Confirmed