Regression: Crash when called from Rmpi

Bug #210273 reported by Steffen Neumann
This bug report is a duplicate of: Bug #234837: Binutils corrupts Open MPI.
Affects           Status  Importance  Assigned to  Milestone
openmpi (Ubuntu)  New     Undecided   Unassigned

Bug Description

Hi,

We have three machines at different hardy patch levels:
one with openmpi1-1.2.5-1 (from around 2008-03-17)
and two current ones with 1.2.5-1ubuntu1. All are AMD64 architecture.

When initializing Rmpi I get a segfault:

library(Rmpi)
 *** caught segfault ***

which actually happens in

Rmpi.c:73 MPI_Init((void *)0,(void *)0);

Program received signal SIGSEGV, Segmentation fault.
0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0

#0 0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f7e9ffb6e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f7e9ff98bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f7e9fface2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f7e9ff99d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f7e9ff99e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f7ea0889723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f7ea08ab15f in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f7ea0aef866 in mpi_initialize () at Rmpi.c:73

A simple "mpirun -n 4 date" works fine on all machines.
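
The contrast described here (mpirun works, but MPI_Init crashes when libmpi is pulled in by a library loaded at run time) could perhaps be checked outside R. The following is only a minimal sketch, not part of the original report: the helper name try_mpi_init is hypothetical, the soname libmpi.so.0 is taken from the backtrace, and loading with RTLD_GLOBAL is an assumption made so that symbols from libmpi are globally visible.

```python
import ctypes


def try_mpi_init(libname="libmpi.so.0"):
    """Load libmpi at run time and call MPI_Init(NULL, NULL).

    Returns None if the library cannot be loaded, otherwise the return
    code of MPI_Init. On an affected system the process would presumably
    die with SIGSEGV inside MPI_Init instead of returning.
    """
    try:
        # Assumption: RTLD_GLOBAL, so libmpi's symbols are globally visible.
        lib = ctypes.CDLL(libname, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None
    rc = lib.MPI_Init(None, None)  # same call as Rmpi.c:73
    lib.MPI_Finalize()
    return rc


if __name__ == "__main__":
    print(try_mpi_init())
```

If this also segfaults in _int_malloc, the crash would not be specific to R or Rmpi.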

Although all three machines have completely different libc versions
(between 2.7-5ubuntu2 and 2.7-9ubuntu2), copying
/usr/lib/openmpi/lib from the machine with openmpi1-1.2.5-1
to the two other machines is sufficient to make the crash go away.
Rebuilding 1.2.5-1ubuntu1 on the newer machines doesn't help.

Since the only changes in 1.2.5-1ubuntu1 are fixed dangling pointers
and a maintainer field, I suspect the crash comes from the build process.

Yours,
Steffen

Andreas Klöckner (inform) wrote:

I'm seeing this, too, on released hardy (1.2.5-1ubuntu1). Here's my stack:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f51cbec26e0 (LWP 27918)]
0x00007f51c932fb8b in _int_malloc () from /usr/lib/libopen-pal.so.0
(gdb) ]bt
Undefined command: "". Try "help".
(gdb) bt
#0 0x00007f51c932fb8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f51c9330e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f51c9312bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f51c9326e2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f51c9313d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f51c9313e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f51c99d7723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f51c99f90d6 in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f51ca11f5cb in boost::mpi::environment::environment () from /home/andreas/pool/lib/libboost_mpi-gcc42-mt-1_35.so.1.35.0
#9 0x00007f51ca5b80a2 in boost::mpi::python::mpi_init () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#10 0x00007f51ca5b8607 in boost::mpi::python::export_environment () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#11 0x00007f51ca5bdb8c in boost::mpi::python::init_module_mpi () from /home/andreas/pool/lib/python2.5/site-packages/boost/mpi.so
#12 0x00007f51c9c6abee in boost::function0<void, std::allocator<boost::function_base> >::operator() ()
   from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#13 0x00007f51c9c6a998 in boost::python::handle_exception_impl () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#14 0x00007f51c9c6b265 in boost::python::handle_exception<void (*)()> () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#15 0x00007f51c9c6af46 in boost::python::detail::init_module () from /home/andreas/pool/lib/libboost_python-gcc42-mt-1_35.so.1.35.0
#16 0x00000000004a39c3 in _PyImport_LoadDynamicModule ()
#17 0x00000000004a1809 in ?? ()
#18 0x00000000004a1cdb in ?? ()
#19 0x00000000004a1f1a in ?? ()
#20 0x00000000004a23c5 in PyImport_ImportModuleLevel ()
#21 0x0000000000481a19 in ?? ()
#22 0x0000000000417e73 in PyObject_Call ()
#23 0x0000000000481fc2 in PyEval_CallObjectWithKeywords ()
#24 0x0000000000485b61 in PyEval_EvalFrameEx ()
#25 0x000000000048a376 in PyEval_EvalCodeEx ()
#26 0x000000000048a492 in PyEval_EvalCode ()
#27 0x00000000004a0a00 in PyImport_ExecCodeModuleEx ()
#28 0x00000000004a1230 in ?? ()
#29 0x00000000004a28c3 in ?? ()
#30 0x00000000004a1809 in ?? ()
#31 0x00000000004a1cdb in ?? ()
#32 0x00000000004a1f1a in ?? ()
#33 0x00000000004a23c5 in PyImport_ImportModuleLevel ()
#34 0x0000000000481a19 in ?? ()
#35 0x0000000000417e73 in PyObject_Call ()
#36 0x0000000000481fc2 in PyEval_CallObjectWithKeywords ()
#37 0x0000000000485b61 in PyEval_EvalFrameEx ()
#38 0x000000000048a376 in PyEval_EvalCodeEx ()
#39 0x000000000048a492 in PyEval_EvalCode ()
#40 0x00000000004ac459 in PyRun_InteractiveOneFlags ()
#41 0x00000000004ac664 in PyRun_InteractiveLoopFlags ()
#42 0x00000000004ac76a in PyRun_AnyFileExFlags ()
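
To make the overlap between the two reported stacks explicit, the innermost frames can be compared mechanically. This is a small sketch under stated assumptions: the function names frames and common_innermost are hypothetical, and only frames #0 to #8 of each backtrace are copied from this report.

```python
import re

# Matches gdb frames such as:
#   #0 0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
FRAME = re.compile(r"#\d+\s+0x[0-9a-f]+ in (.+?) \(\)(?: from (\S+))?")

RMPI_TRACE = """
#0 0x00007f7e9ffb5b8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f7e9ffb6e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f7e9ff98bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f7e9fface2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f7e9ff99d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f7e9ff99e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f7ea0889723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f7ea08ab15f in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f7ea0aef866 in mpi_initialize () at Rmpi.c:73
"""

BOOST_TRACE = """
#0 0x00007f51c932fb8b in _int_malloc () from /usr/lib/libopen-pal.so.0
#1 0x00007f51c9330e58 in malloc () from /usr/lib/libopen-pal.so.0
#2 0x00007f51c9312bfb in opal_class_initialize () from /usr/lib/libopen-pal.so.0
#3 0x00007f51c9326e2b in opal_malloc_init () from /usr/lib/libopen-pal.so.0
#4 0x00007f51c9313d97 in opal_init_util () from /usr/lib/libopen-pal.so.0
#5 0x00007f51c9313e76 in opal_init () from /usr/lib/libopen-pal.so.0
#6 0x00007f51c99d7723 in ompi_mpi_init () from /usr/lib/libmpi.so.0
#7 0x00007f51c99f90d6 in PMPI_Init () from /usr/lib/libmpi.so.0
#8 0x00007f51ca11f5cb in boost::mpi::environment::environment () from /home/andreas/pool/lib/libboost_mpi-gcc42-mt-1_35.so.1.35.0
"""


def frames(backtrace):
    """Return (function, library) pairs, innermost frame first."""
    matches = (FRAME.match(line.strip()) for line in backtrace.splitlines())
    return [m.groups() for m in matches if m]


def common_innermost(a, b):
    """Return the leading frames that both traces share."""
    shared = []
    for fa, fb in zip(a, b):
        if fa != fb:
            break
        shared.append(fa)
    return shared


shared = common_innermost(frames(RMPI_TRACE), frames(BOOST_TRACE))
for func, lib in shared:
    print(func, lib)
```

Both stacks share the same path from PMPI_Init down to _int_malloc and differ only in the caller (Rmpi versus Boost.MPI), which suggests the crash does not depend on the calling library.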
