Symbol in .so file cannot be resolved by dlopen on Mac OSX

Bug #1005329 reported by Erik Schnetter
This bug affects 1 person
Affects: pocl | Status: Invalid | Importance: Medium | Assigned to: Carlos Sánchez de La Lama

Bug Description

While building pocl on Mac OSX, I receive the error:

dyld: lazy symbol binding failed: Symbol not found: __ZN4llvm2cl3optISsLb0ENS0_6parserISsEEE4doneEv
  Referenced from: /Users/eschnett/src/pocl/lib/llvmopencl/.libs/llvmopencl.so
  Expected in: flat namespace

This leads to a segfault and a core dump. The build continues, but when I run "make check" afterwards, I see the same kinds of errors.

The complete error message is:

Making all in standalone
../../scripts/pocl-standalone -h standalone.h -o standalone.bc standalone.cl
dyld: lazy symbol binding failed: Symbol not found: __ZN4llvm2cl3optISsLb0ENS0_6parserISsEEE4doneEv
  Referenced from: /Users/eschnett/src/pocl/lib/llvmopencl/.libs/llvmopencl.so
  Expected in: flat namespace

dyld: Symbol not found: __ZN4llvm2cl3optISsLb0ENS0_6parserISsEEE4doneEv
  Referenced from: /Users/eschnett/src/pocl/lib/llvmopencl/.libs/llvmopencl.so
  Expected in: flat namespace

0 libLLVM-3.1.dylib 0x0000000109a2d872 _ZL15PrintStackTracePv + 34
1 libLLVM-3.1.dylib 0x0000000109a2dd89 _ZL13SignalHandleri + 697
2 libsystem_c.dylib 0x00007fff8b042cfa _sigtramp + 26
3 libsystem_c.dylib 0x00007fff68c257a1 _sigtramp + 18446744073134811841
4 libsystem_c.dylib 0x00007fff68c23948 _sigtramp + 18446744073134804072
5 libdyld.dylib 0x00007fff909df716 dyld_stub_binder_ + 13
6 llvmopencl.so 0x000000010a7431c0 _ZN4llvm9AttributeL8NoInlineE + 136184
7 llvmopencl.so 0x000000010a6e139d llvm::cl::opt<std::string, false, llvm::cl::parser<std::string> >::opt<char [7], llvm::cl::desc, llvm::cl::value_desc>(char const (&) [7], llvm::cl::desc const&, llvm::cl::value_desc const&) + 45
8 llvmopencl.so 0x000000010a6e12c3 __cxx_global_var_init + 67
9 llvmopencl.so 0x000000010a6e1369 global constructors keyed to a + 9
10 llvmopencl.so 0x00007fff68c2eda6 global constructors keyed to a + 140730481039942
11 llvmopencl.so 0x00007fff68c2eaf2 global constructors keyed to a + 140730481039250
12 llvmopencl.so 0x00007fff68c2c2e4 global constructors keyed to a + 140730481028996
13 llvmopencl.so 0x00007fff68c2d0b7 global constructors keyed to a + 140730481032535
14 llvmopencl.so 0x00007fff68c221b9 global constructors keyed to a + 140730480987737
15 llvmopencl.so 0x00007fff68c28657 global constructors keyed to a + 140730481013495
16 libdyld.dylib 0x00007fff909df95b dlopen + 57
17 libLLVM-3.1.dylib 0x0000000109a194ec llvm::sys::DynamicLibrary::getPermanentLibrary(char const*, std::string*) + 188
18 libLLVM-3.1.dylib 0x0000000109a29bdf llvm::PluginLoader::operator=(std::string const&) + 175
19 opt 0x000000010902910a llvm::cl::opt<llvm::PluginLoader, false, llvm::cl::parser<std::string> >::handleOccurrence(unsigned int, llvm::StringRef, llvm::StringRef) + 138
20 libLLVM-3.1.dylib 0x0000000109a0616c _ZL28CommaSeparateAndAddOccurencePN4llvm2cl6OptionEjNS_9StringRefES3_b + 652
21 libLLVM-3.1.dylib 0x0000000109a022df _ZL13ProvideOptionPN4llvm2cl6OptionENS_9StringRefES3_iPKPKcRi + 239
22 libLLVM-3.1.dylib 0x0000000109a0055a llvm::cl::ParseCommandLineOptions(int, char const* const*, char const*, bool) + 3850
23 opt 0x0000000109024d50 main + 208
24 opt 0x00000001090206e4 start + 52
Stack dump:
0. Program arguments: /opt/local/libexec/llvm-3.1/bin/opt -load=/Users/eschnett/src/pocl/lib/llvmopencl/.libs/llvmopencl.so -generate-header -disable-output -header=standalone.h ./.pocl34151/kernel.bc
../../scripts/pocl-standalone: line 81: 34157 Segmentation fault: 11 /opt/local/bin/opt -load=/Users/eschnett/src/pocl/lib/llvmopencl/.libs/llvmopencl.so -generate-header -disable-output -header=${header} ${kernel_bc}

I configured with

$ env CC=clang CXX=clang++ CLANG=clang ./configure --enable-debug --prefix=/Users/eschnett/pocl --disable-icd

I am surprised to see a .so file -- on a Mac, these should be .dylib files instead. Could it be that the .so file is produced in a non-standard manner (e.g. not via libtool), so that some information necessary on a Mac is not present?
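
The trace above shows the crash happening inside dlopen, while the plugin's global constructors trigger lazy binding of a missing symbol. As a diagnostic sketch (not part of pocl; `probe_plugin` is a hypothetical helper name), loading the module with RTLD_NOW asks the loader to resolve symbols during dlopen. Assuming the loader reports the failure instead of aborting, a missing symbol then shows up as a dlerror() string:

```cpp
#include <dlfcn.h>
#include <cassert>
#include <string>

// Hypothetical helper: try to load a module with all symbols bound up front.
// Returns an empty string on success, otherwise the loader's error text.
std::string probe_plugin(const std::string& path) {
    assert(!path.empty() && "plugin path must not be empty");
    // RTLD_NOW requests immediate binding instead of binding on first call
    // (the "lazy symbol binding" mentioned in the dyld message).
    void* handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* msg = dlerror();
        return msg != nullptr ? msg : "unknown dlopen error";
    }
    dlclose(handle);
    return std::string();
}
```

Note that symbols are resolved against whatever is loaded in the calling process, so this only mirrors what opt sees if the probe runs in a process that has the same LLVM code loaded.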

Pekka Jääskeläinen (pekka-jaaskelainen) wrote :

It seems to have correct-looking (libtool) linkage flags:
AM_LDFLAGS = -module -export-dynamic `@LLVM_CONFIG@ --ldflags`

although I do not know what -export-dynamic adds here.

Maybe you have static libraries of LLVM and the plugin just doesn't link against all the LLVM libraries required. As I do not recall touching any build files for the passes, I suppose this is a new problem from a later LLVM 3.1 rather than a pocl issue. Or there is a bug in the Mac OS X LLVM dynamic library creation and it does not link that symbol in for some reason. Vlado, do you have this issue on Mac?

Vladimir Guzma (vladimir-guzma) wrote :

I do not get the error from the first message; however, it seems you may have an LLVM problem:

c++filt __ZN4llvm2cl3optISsLb0ENS0_6parserISsEEE4doneEv
llvm::cl::opt<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, false, llvm::cl::parser<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::done()
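
The same demangling can be done programmatically. The following is a minimal sketch, assuming a toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>` (GCC and clang both do); `demangle_macho` is a hypothetical helper name. Mach-O symbol names carry one extra leading underscore, which is dropped before demangling. The exact spelling of the result (for example `std::string` versus the expanded `std::basic_string<...>`) depends on the demangler.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: demangle a symbol as printed by dyld on Mac OS X.
// Falls back to returning the input unchanged if demangling fails.
std::string demangle_macho(const std::string& symbol) {
    assert(!symbol.empty());
    std::string itanium = symbol;
    // Mach-O prefixes C-level names with '_', so "__ZN..." is "_ZN..." in
    // Itanium C++ ABI terms.
    if (itanium.compare(0, 3, "__Z") == 0) {
        itanium.erase(0, 1);
    }
    int status = 0;
    char* text = abi::__cxa_demangle(itanium.c_str(), nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : symbol;
    std::free(text);  // __cxa_demangle allocates with malloc
    return result;
}
```

Calling `demangle_macho("__ZN4llvm2cl3optISsLb0ENS0_6parserISsEEE4doneEv")` should yield the `llvm::cl::opt<...>::done()` member shown above.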

What command-line arguments did you use when building LLVM, and which OS X version are you using?

BTW, the .so files are indeed messed up: if you look, for example, into lib/CL/.libs, there are some .dylib files but also .so files, which I suspect are created by explicit link commands in some Makefile.am. They point to precisely nothing.

Vladimir Guzma (vladimir-guzma) wrote :

One more thing: can you try adding "-undefined dynamic_lookup" to your LD flags? That is how linking problems are solved in TCE. I am not sure, however, whether that is the correct solution here.

Carlos Sánchez de La Lama (csanchezdll) wrote :

The llvmopencl.so is actually built using libtool. But I remember there were some problems, as libtool detected that the extension was not .dylib and changed its behavior under Mac OS X. I remember changing this several times to make it work on OS X and Linux with the same .so extension (it is not a dynamic library but a module/plugin, after all).

Erik Schnetter (schnetter) wrote :

I found the solution -- things are working for me on Mac OSX again (at least somewhat):

ERROR: 36 tests were run,
15 failed (2 expected failures).
17 tests were skipped.

I need to build llvm 3.1 myself, as described in the pocl install instructions (--enable-shared, and REQUIRE_RTTI). I then need to configure pocl without --disable-icd.

That's deceptively simple; I don't understand why I didn't find this out before. Maybe something else changed, maybe I had leftover bits from previous attempts that caused problems.

My source tree differs from the pocl source tree mostly by having removed all -U__APPLE__, plus the changes (described elsewhere) to cl_platform.h, plus some unrelated changes because some of my patches have been applied in a slightly modified form.

Pekka Jääskeläinen (pekka-jaaskelainen) wrote :

Can this be closed?

Erik Schnetter (schnetter) wrote :

Yes, things work with the recipe above.

Changed in pocl:
status: New → Fix Committed
Changed in pocl:
assignee: nobody → Carlos Sánchez de La Lama (csanchezdll)
Carlos Sánchez de La Lama (csanchezdll) wrote :

I am getting a similar problem with symbol
__ZN4llvm10ModulePass17assignPassManagerERNS_7PMStackENS_15PassManagerTypeE
in pocl-0.6.

Using shared libraries solves all symbol problems because in that case all the LLVM API symbols are in the shlib, which is linked at runtime. The problem with static libraries is that if a symbol needed by the llvmopencl plugin is not in opt, the plugin needs to be linked with the LLVM libs, which will link in other symbols that *are* present in opt, giving duplicate symbols.

Some investigation in this case (__ZN4llvm10ModulePass17assignPassManagerERNS_7PMStackENS_15PassManagerTypeE):
- This symbol is in LLVMCore.a (PassManager.o), which cannot be linked into llvmopencl.so because it would cause duplicate symbols when opt loads the plugin.
- The symbol is present in opt after the LLVM build (Release+Asserts/bin/opt).
- The symbol is stripped from the installed opt.

I am trying to check where and why the stripping happens.
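
With "Expected in: flat namespace", the loader looks for the symbol in any image already loaded in the process, which is roughly the lookup that `dlsym(RTLD_DEFAULT, ...)` performs. A minimal sketch of such a check follows; `visible_in_process` is a hypothetical helper, and to say anything about opt it would have to run inside the opt process (for example from a small test plugin), which is an assumption and not something done in this report.

```cpp
#include <dlfcn.h>
#include <cassert>
#include <string>

// Hypothetical helper: report whether any image loaded in the current
// process exports `name` to the dynamic loader.
bool visible_in_process(const std::string& name) {
    assert(!name.empty());
    dlerror();  // clear any stale error state before the lookup
    // RTLD_DEFAULT searches the images loaded in the process in the
    // loader's default search order.
    return dlsym(RTLD_DEFAULT, name.c_str()) != nullptr;
}
```

On Mac OS X, dlsym expects the name without the extra leading underscore that Mach-O adds, so the symbol above would be passed starting with `_ZN4llvm10ModulePass...` rather than `__ZN4llvm10ModulePass...`. A stripped executable no longer exports the symbol, so the lookup would be expected to fail there even though the code is still present in the binary.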

Changed in pocl:
status: Fix Committed → In Progress
importance: Undecided → Medium
Carlos Sánchez de La Lama (csanchezdll) wrote :

I think this is finally fully diagnosed. There are 3 different problems (but they are easy to mix up because we are compiling a compiler which can be automatically used every time you recompile... :S)

1) The default clang in Mac OS X Lion generates external template instantiations with local visibility when compiling llvm-3.1. So when loading the pocl plugin, the symbols corresponding to the cl::opt<xxx> types used to parse command line parameters are not found. This happens with both static and dynamic builds of llvm. To solve this, compile llvm-3.1 with clang-3.1 or gcc.

2) Release builds of llvm strip opt when installing, so some symbols needed by the pocl plugin are not found. I have reported this to llvmdev (it can be easily fixed by adding KEEP_SYMBOLS := 1 to llvm/tools/opt/Makefile). This can be avoided with that patch or by using a Debug build of llvm.

3) Even when the first two problems are avoided, if llvm-3.1 is compiled with a previously built clang-3.1, an error occurs in the CommandLine library when parsing string parameters ("pointer being freed was not allocated"). I have googled a bit and it seems to have happened to others, at least in the past.
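
Problem 1 concerns how an explicitly instantiated template ends up (or does not end up) as an exported symbol. The following toy model is a sketch only and is not LLVM's code: `Option` and `EXAMPLE_EXPORT` are hypothetical names, and the visibility attribute assumes a GCC- or clang-compatible compiler. The `extern template` line plays the role of a header telling plugin code not to instantiate the class itself, and the `template class` line plays the role of the single definition inside the host. If that definition were emitted with local (hidden) visibility, the plugin's reference to `done()` would have nothing to bind to at load time.

```cpp
#include <cassert>
#include <string>

// Assumption: GCC/clang-style attribute; marks the class as exported.
#define EXAMPLE_EXPORT __attribute__((visibility("default")))

// Toy stand-in for a command line option template.
template <typename T>
class EXAMPLE_EXPORT Option {
public:
    void done() { finished_ = true; }
    bool finished() const { return finished_; }

private:
    T value_{};
    bool finished_ = false;
};

// "Header" side: users must not instantiate Option<std::string> themselves
// and instead reference the copy provided elsewhere.
extern template class Option<std::string>;

// "Host" side: the one explicit instantiation definition, which emits the
// member functions as symbols that other modules can bind to.
template class Option<std::string>;
```

In a single translation unit both lines coexist, as here. In the real setup they would live in different binaries, which is why the visibility of the emitted symbols matters.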

Summing up, to use pocl on Mac OS X, *use gcc to compile llvm-3.1*. You have to force it; otherwise configure uses clang if found:
../../src/llvm-3.1/configure <whatever> CC=gcc CXX=g++
I am successfully using MacPorts gcc-4.6.
Also, either patch the opt Makefile not to strip symbols, or use a shared llvm build (--enable-shared).

Closing the bug as it is not pocl-related.

Changed in pocl:
status: In Progress → Invalid