
Regression: Crash when called from Rmpi

Reported by Steffen Neumann on 2008-04-01
This bug report is a duplicate of: Bug #234837: Binutils corrupts Open MPI.
Affects: openmpi (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hi,

we have three machines at different hardy patch levels:
one with openmpi1-1.2.5-1 (from around 2008-03-17)
and two current ones with 1.2.5-1ubuntu1. All are AMD64 architecture.

When initializing Rmpi I get a

library(Rmpi)
 *** caught segfault ***

which actually happens in

Rmpi.c:73 MPI_Init((void *)0,(void *)0);

Program received signal SIGSEGV, Segmentation fault.
0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0

#0 0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f7e9ffb6e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f7e9ff98bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f7e9fface2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f7e9ff99d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f7e9ff99e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f7ea0889723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f7ea08ab15f in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f7ea0aef866 in mpi_initialize () at Rmpi.c:73

a simple "mpirun -n 4 date" works fine on all machines.
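
For anyone who wants to try this without R: below is a minimal sketch that loads the MPI library at run time, roughly the way a host process pulls in a plugin, and then makes the same MPI_Init(NULL, NULL) call as Rmpi.c:73. It rests on the assumption that being loaded into a host process at run time is what matters here, which this report does not establish. The helper name try_mpi_init is made up for illustration; "libmpi.so.0" is the library name from the backtrace above.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signatures of the two MPI entry points that are looked up at run time. */
typedef int (*mpi_init_fn)(int *argc, char ***argv);
typedef int (*mpi_finalize_fn)(void);

/*
 * Hypothetical helper: load an MPI shared library with dlopen() and call
 * MPI_Init(NULL, NULL), as Rmpi.c does.
 * Returns -1 if the library cannot be loaded, -2 if a symbol is missing,
 * otherwise the return value of MPI_Init.
 */
int try_mpi_init(const char *libname)
{
    void *handle = dlopen(libname, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    mpi_init_fn init = (mpi_init_fn)dlsym(handle, "MPI_Init");
    if (init == NULL) {
        fprintf(stderr, "MPI_Init not found in %s\n", libname);
        dlclose(handle);
        return -2;
    }

    mpi_finalize_fn finalize = (mpi_finalize_fn)dlsym(handle, "MPI_Finalize");
    if (finalize == NULL) {
        fprintf(stderr, "MPI_Finalize not found in %s\n", libname);
        dlclose(handle);
        return -2;
    }

    /* On an affected machine the segfault would be expected inside this call. */
    int rc = init(NULL, NULL);
    if (rc == 0) {          /* 0 is MPI_SUCCESS in Open MPI */
        finalize();
    }
    return rc;
}
```

Calling try_mpi_init("libmpi.so.0") from a small main() (linked with -ldl on this glibc) and running it under gdb should show whether the crash also occurs outside of R.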

Although all three machines have completely different libc versions
(between 2.7-5ubuntu2 and 2.7-9ubuntu2), copying
/usr/lib/openmpi/lib from the machine with openmpi1-1.2.5-1
to the two other machines is sufficient to make the crash go away.
Rebuilding 1.2.5-1ubuntu1 on the newer machines doesn't help.

Since the only changes in 1ubuntu1 are fixed dangling pointers
and a maintainer field, I suspect the problem lies in the build process.

Yours,
Steffen

Andreas Klöckner (inform) wrote :

I'm seeing this, too, on released hardy. (1.2.5-1ubuntu1). Here's my stack:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f51cbec26e0 (LWP 27918)]
0x00007f51c932fb8b in _int_malloc () from /usr/lib/libopen-pal.so.0
(gdb) ]bt
Undefined command: "". Try "help".
(gdb) bt
#0 0x00007f51c932fb8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f51c9330e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f51c9312bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f51c9326e2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f51c9313d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f51c9313e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f51c99d7723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f51c99f90d6 in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f51ca11f5cb in boost::mpi::environment::environment () from /home/andreas/pool/lib/libboost_mpi-gcc42-mt-1_35.so.1.35.0
#9 0x00007f51ca5b80a2 in boost::mpi::python::mpi_init () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#10 0x00007f51ca5b8607 in boost::mpi::python::export_environment () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#11 0x00007f51ca5bdb8c in boost::mpi::python::init_module_mpi () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#12 0x00007f51c9c6abee in boost::function0<void, std::allocator<boost::function_base> >::operator() ()
   from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#13 0x00007f51c9c6a998 in boost::python::handle_exception_impl () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#14 0x00007f51c9c6b265 in boost::python::handle_exception<void (*)()> () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#15 0x00007f51c9c6af46 in boost::python::detail::init_module () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#16 0x00000000004a39c3 in _PyImport_LoadDynamicModule ()
#17 0x00000000004a1809 in ?? ()
#18 0x00000000004a1cdb in ?? ()
#19 0x00000000004a1f1a in ?? ()
#20 0x00000000004a23c5 in PyImport_ImportModuleLevel ()
#21 0x0000000000481a19 in ?? ()
#22 0x0000000000417e73 in PyObject_Call ()
#23 0x0000000000481fc2 in PyEval_CallObjectWithKeywords ()
#24 0x0000000000485b61 in PyEval_EvalFrameEx ()
#25 0x000000000048a376 in PyEval_EvalCodeEx ()
#26 0x000000000048a492 in PyEval_EvalCode ()
#27 0x00000000004a0a00 in PyImport_ExecCodeModuleEx ()
#28 0x00000000004a1230 in ?? ()
#29 0x00000000004a28c3 in ?? ()
#30 0x00000000004a1809 in ?? ()
#31 0x00000000004a1cdb in ?? ()
#32 0x00000000004a1f1a in ?? ()
#33 0x00000000004a23c5 in PyImport_ImportModuleLevel ()
#34 0x0000000000481a19 in ?? ()
#35 0x0000000000417e73 in PyObject_Call ()
#36 0x0000000000481fc2 in PyEval_CallObjectWithKeywords ()
#37 0x0000000000485b61 in PyEval_EvalFrameEx ()
#38 0x000000000048a376 in PyEval_EvalCodeEx ()
#39 0x000000000048a492 in PyEval_EvalCode ()
#40 0x00000000004ac459 in PyRun_InteractiveOneFlags ()
#41 0x00000000004ac664 in PyRun_InteractiveLoopFlags ()
#42 0x00000000004ac76a in PyRun_AnyFileExFlags ()
