libscalapack-mpi shared libraries incorrectly include thread/shared-memory support

Bug #816206 reported by Anthony Costa
This bug affects 1 person
Affects: scalapack (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

The libscalapack-mpi1 shared libraries differ significantly between 11.04 (natty) and 10.10 (maverick). Linking against the natty libraries and running under ltrace -S reveals many calls to SYS_clone issued directly after calls to scalapack functions. This does not happen with the same libraries from the maverick repositories.

From ltrace -S on maverick, linking against libscalapack-openmpi.so:

pdgeqrf_(0x7fffa5d59a9c, 0x7fffa5d59a9c, 0x12f0f40, 0x6234b0, 0x6234b0) = 0x7f1d6020b604
memcpy(0x013206d0, "D\245\t\222+\006\360?\264\013b{P\037\304?\035&B\330i|\257?U6\251\355.(\236?"..., 64800) = 0x013206d0
pdorgqr_(0x7fffa5d59a9c, 0x7fffa5d59a9c, 0x7fffa5d59a9c, 0x12f0f40, 0x6234b0) = 32

[...]

By contrast, from ltrace -S on natty, linking against libscalapack-openmpi.so:

pdgeqrf_(0x7fff2b767c9c, 0x7fff2b767c9c, 0xe539e0, 0x61f4b0, 0x61f4b0 <unfinished ...>
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22327
SYS_clone(0x3d0f00, 0x7fd045ff0fd0, 0x7fd045ff19d0, 0x7fd045ff19d0, 0x7fd045ff1700) = 22328
SYS_futex(0x7fd0457f09d0, 0, 22327, 0, 0) = -11
SYS_clone(0x3d0f00, 0x7fd045ff0fd0, 0x7fd045ff19d0, 0x7fd045ff19d0, 0x7fd045ff1700) = 22329
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22330
SYS_futex(0x7fd045ff19d0, 0, 22329, 0, 0) = 0
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22331

[...]

memcpy(0x00e83170, "\207\nD\354\312", 64800) = 0x00e83170
pdorgqr_(0x7fff2b767c9c, 0x7fff2b767c9c, 0x7fff2b767c9c, 0xe539e0, 0x61f4b0 <unfinished ...>
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22391
SYS_clone(0x3d0f00, 0x7fd045ff0fd0, 0x7fd045ff19d0, 0x7fd045ff19d0, 0x7fd045ff1700) = 22392
SYS_futex(0x7fd045ff19d0, 0, 22392, 0, 0) = 0
SYS_clone(0x3d0f00, 0x7fd045ff0fd0, 0x7fd045ff19d0, 0x7fd045ff19d0, 0x7fd045ff1700) = 22393
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22394
SYS_futex(0x7fd045ff19d0, 0, 22393, 0, 0) = -11
SYS_clone(0x3d0f00, 0x7fd0457effd0, 0x7fd0457f09d0, 0x7fd0457f09d0, 0x7fd0457f0700) = 22395

[...]

MPI scalapack should *not* be calling SYS_clone. This interferes with the explicit parallelization of most users' MPI codes and causes serious performance degradation, especially on clusters where every processor is already occupied by an MPI task. It suggests that the MPI versions of the scalapack shared libraries in natty were incorrectly built with thread/shared-memory support.

As a check, I disabled the calls to scalapack functions in my own MPI code, which eliminated the problem entirely.
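
For anyone who wants to repeat the comparison on their own traces, here is a minimal Python sketch that counts how many SYS_clone lines appear while a library call is still open (marked "<unfinished ...>") in saved ltrace -S output. The function name count_clones_per_call and the regular expressions are my own illustration, not part of ltrace or scalapack, and the parsing assumes the line format shown in the traces above.

```python
import re
from collections import Counter

# Start of a call line, e.g. "pdgeqrf_(0x7fff..., ..." or "SYS_clone(0x3d0f00, ..."
CALL_RE = re.compile(r"^([A-Za-z_]\w*)\(")
# Line that closes an earlier unfinished call, e.g. "<... pdgeqrf_ resumed> ) = 0"
RESUMED_RE = re.compile(r"^<\.\.\. (\w+) resumed>")


def count_clones_per_call(trace_lines):
    """Count SYS_clone lines seen while a library call is still unfinished.

    Assumption: the input is text produced by `ltrace -S`, where a library
    call that has not yet returned ends with "<unfinished ...>".
    """
    counts = Counter()
    current = None  # name of the library call currently in progress, if any
    for raw in trace_lines:
        line = raw.strip()
        resumed = RESUMED_RE.match(line)
        if resumed:
            # Only the return of the open library call closes it.
            if resumed.group(1) == current:
                current = None
            continue
        call = CALL_RE.match(line)
        if call is None:
            continue  # elision markers, blank lines, anything unrecognized
        name = call.group(1)
        if name.startswith("SYS_"):
            if name == "SYS_clone" and current is not None:
                counts[current] += 1
        elif line.endswith("<unfinished ...>"):
            current = name  # library call entered but not yet returned
        else:
            current = None  # library call that completed on one line
    return counts
```

Feeding it the lines of a trace file, for example count_clones_per_call(open(path)) with path pointing at the saved ltrace output, should give an empty result for a trace like the maverick one above and non-zero counts for pdgeqrf_ and pdorgqr_ for a trace like the natty one.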
