unit-tests built with clang crash in MirClientSurfaceTest.client_buffer_created_on_surface_creation

Bug #1157603 reported by Daniel van Vugt on 2013-03-20
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
LLVM
Confirmed
Medium
Mir
Invalid
Medium
Unassigned
clang (Ubuntu)
Medium
Unassigned

Bug Description

unit-tests built with clang crash in MirClientSurfaceTest.client_buffer_created_on_surface_creation.

cmake .. -DCMAKE_C_COMPILER=clang \
           -DCMAKE_CXX_COMPILER=clang++ \
           -DMIR_DISABLE_INPUT=ON
make
bin/unit-tests

...
[----------] 6 tests from MirClientSurfaceTest
[ RUN ] MirClientSurfaceTest.client_buffer_created_on_surface_creation
pure virtual method called
terminate called without an active exception
pure virtual method called
terminate called recursively
Aborted (core dumped)

In , Cmaloney (cmaloney) wrote :

$cat thread.cc
#include <thread>

void f(){}

int main()
{
   std::thread t(f);
   t.join();
}

$clang++ -v
clang version 3.2 (trunk)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$clang++ -std=c++11 thread.cc -o thread -pthread

$./thread
pure virtual method called
terminate called without an active exception
Aborted

In , Cmaloney (cmaloney) wrote :

I just updated clang to r156503, and it seems to work for me now. Closing.

In , Cmaloney (cmaloney) wrote :

Turns out it was just a fluke, and is indeed still broken for me. Not sure how/why it worked...

Created attachment 8832
Valgrind's output

I've cloned the svn repository and this problem persists. This is my clang version:

$ clang++ -v
clang version 3.2 (trunk 159725)
Target: x86_64-unknown-linux-gnu
Thread model: posix

It looks like the ref count of the shared_ptr that holds the internal representation of std::thread gets to 0, so the thread impl is getting destroyed. I can't figure out why this example does actually work on g++ 4.7, since both compilers use the same headers/libs. This could be a bug in libstdc++'s std::thread implementation, if this code didn't work fine on g++.

In , mokabar (tim-klingt) wrote :

same issue here:
clang version 3.2
Target: x86_64-unknown-linux-gnu
Thread model: posix

In , Zhouyan1014 (zhouyan1014) wrote :

Same problem with

clang version 3.3 (trunk 168286)
Target: x86_64-unknown-linux-gnu
Thread model: posix

libstdc++ come with GCC 4.7.2

I suspect this is a problem with some projects lifetime. Some objects' (possibly from libstdc++'s implementation of std::thread) member function gets called while it is being destroyed. I had seen similar problem with G++ before. But that is a long time ago using pthread without C++11. But the error message is suspiciously similar.

However debugger does little to help. I cannot figure out which member function has being called.

In , A-me-w (a-me-w) wrote :

Same problem with

clang version 3.2 (branches/release_32 170508)
Target: x86_64-unknown-linux-gnu
Thread model: posix
OS: Fedora 17
gcc version: 4.7.2 20120921 (Red Hat 4.7.2-2)

the backtrace:

#0 0x0000003e51a35935 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003e51a370e8 in __GI_abort () at abort.c:91
#2 0x0000003e5ce60dad in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x0000003e5ce5eea6 in __cxxabiv1::__terminate (handler=<optimized out>)
    at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:40
#4 0x0000003e5ce5eed3 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:50
#5 0x0000003e5ce5f9ef in __cxxabiv1::__cxa_pure_virtual () at ../../../../libstdc++-v3/libsupc++/pure.cc:50
#6 0x0000003e5ceb29e0 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/thread.cc:73
#7 0x0000003e51e07d14 in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:309
#8 0x0000003e51af168d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

62 namespace
63 {
64 extern "C" void*
(gdb) l
65 execute_native_thread_routine(void* __p)
66 {
67 thread::_Impl_base* __t = static_cast<thread::_Impl_base*>(__p);
68 thread::__shared_base_type __local;
69 __local.swap(__t->_M_this_ptr);
70
71 __try
72 {
73 __t->_M_run(); //abort here
74 }
(gdb) l
75 __catch(...)
76 {
77 std::terminate();
78 }
79
80 return 0;
81 }
82 }
83

it seems that __t->_M_run is a pure virtual function, so the process have been terminated.

Download full text (4.1 KiB)

The bug shows up as a pure virtual call, but the real error is use-after-free caused by apparent miscompilation of shared_ptr<>.

I've done deep repro on this, and what I find is that when creating a new thread, libstdc++ std:thread creates an impl object containing the virtual _M_run() function to run the thread's code. An odd detail is that this impl object contains a shared_ptr<> to itself; it's a shared_ptr<base> inside the base type, and the created impl object is always a type derived from this base. This approach, while weird, ensures that the impl object stays alive until the new thread starts running and uses it. Obviously since the impl object contains the virtual _M_run() that actually runs the thread's code, this is quite important.

But this goes wrong because Clang destroys the impl in the parent thread, before the child thread can use it.

The apparently miscompiled code is in <thread>; I quote here from libstdc++ 4.7.2:

129 template<typename _Callable, typename... _Args>
130 explicit
131 thread(_Callable&& __f, _Args&&... __args)
132 {
133 _M_start_thread(_M_make_routine(std::__bind_simple(
134 std::forward<_Callable>(__f),
135 std::forward<_Args>(__args)...)));
136 }

Clang destroys the impl with a backtrace that starts at shared_ptr::_M_destroy() ends at line 135 above. Remember, since the impl contains a shared_ptr to *itself*, the reference count should never fall below one until the thread exits, so this destruction is a mistake.

Running under valgrind confirms that there is a use-after-free here.

I don't know why Clang would do this.

Why does this show up as a pure virtual call? Simple: In the process of orderly destruction, the impl's vtable is reset to its base class vtable before the object is finally freed. That base vtable contains a pure virtual function, which is subsequently called. (Even though the memory was freed, the vtable pointer seems to remain unchanged.) I suppose this symptom would not occur if e.g. the base impl class had a pure virtual constructor.

Here's the backtrace, showing premature destruction. We can be sure that this destruction is premature because it's supposed to happen in the child thread, not in the parent.

(gdb) where
#0 std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*())()> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*())()> > >, (__gnu_cxx::_Lock_policy)1>::_M_destroy() (this=0x804e008)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:418
#1 0x08048f4f in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)1>::_M_release (this=0x804e008)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:164
#2 0x08048ecb in std::__shared_count<1>::~__shared_count (this=0xbffff464)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:558
#3 0x08048e97 in std::__shared_count<1>::~__shared_count (this=0xbffff464)
    at /usr/bin/../lib/gcc/i486-linux-gnu/4.7/../../../../include/c++/4.7/bits/shared_ptr_base.h:556
#4 0x08048fac in std::__shared_pt...

Read more...

Created attachment 9770
backtrace of premature destruction

I think the root problem is that GCC is defining the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros, whereas Clang is not.

In libstdc++, std::shared_ptr<T> is derived from
std::__shared_ptr<T, __gnu_cxx::__default_lock_policy>. __default_lock_policy is a compile-time constant defined in the include/ext/concurrence.h header. The
value varies depending upon which macros are defined:

  // Compile time constant that indicates prefered locking policy in
  // the current configuration.
  static const _Lock_policy __default_lock_policy =
#ifdef __GTHREADS
#if (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2) \
     && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4))
  _S_atomic;
#else
  _S_mutex;
#endif
#else
  _S_single;
#endif

With G++ on x86 and x86_64, the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros are defined by the compiler, in gcc/c-family/c-cppbuiltin.c, so __default_lock_policy is _S_atomic.

With Clang, the macros are not defined, so __default_lock_policy is _S_mutex.

(The built-in macros can be shown by passing -dM -E to gcc/clang. e.g.: gcc -dM -E - < /dev/null)

With the thread.c test case, large parts of libstdc++ are inlined and therefore compiled by Clang. The std::thread constructor calls std::thread::_M_make_routine, which creates an std::__shared_ptr<T, _S_mutex> object. The constructor passes this object to std::thread::_M_start_thread, which is an out-of-line function in libstdc++.so.6. std::thread::_M_start_thread expects an std::__shared_ptr<T, _S_atomic> object. I think this mismatch is what causes the premature destruction of the std::thread::_Impl object.

Here are the most relevant bits from the 4.7.2 include/std/thread header:

  class thread
  {
  public:
    ...
    struct _Impl_base;
    typedef shared_ptr<_Impl_base> __shared_base_type;
    ...
    template<typename _Callable, typename... _Args>
      explicit
      thread(_Callable&& __f, _Args&&... __args)
      {
        _M_start_thread(_M_make_routine(std::__bind_simple(
                std::forward<_Callable>(__f),
                std::forward<_Args>(__args)...)));
      }
    ...
  private:
    void
    _M_start_thread(__shared_base_type);

    template<typename _Callable>
      shared_ptr<_Impl<_Callable>>
      _M_make_routine(_Callable&& __f)
      {
 // Create and allocate full data structure, not base.
 return std::make_shared<_Impl<_Callable>>
            (std::forward<_Callable>(__f));
      }
  };

If I define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros when I build the test case with Clang, then the test case passes, and valgrind is happy.

This ABI issue resulted in an (invalid) gcc bug report,
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42734. There seems to be some acknowledgement there that the macros are expected to change the libstdc++ ABI.

I think the driver needs to define the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros for libstdc++ compatibility. There is an issue open regarding these macros, but it does not mention ABI compatibility,
http://llvm.org/bugs/show_bug.cgi?id=11174.

Ryan, you are a gentleman and a scholar. Thanks

Alexandros Frantzis (afrantzis) wrote :

This is caused by http://llvm.org/bugs/show_bug.cgi?id=12730

Defining the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros solves this problem, but I am not convinced that this is the right way forward.

Changed in mir:
status: New → Confirmed
Changed in mir:
status: Confirmed → Triaged
Changed in llvm:
importance: Unknown → Medium
status: Unknown → Confirmed

Fixed in r178816.

Changed in llvm:
status: Confirmed → Fix Released

Im getting the pure virtual method call as well when releasing the connection to the server:
http://paste.ubuntu.com/5695233/

Here is the video quit code (though its simple...):
http://paste.ubuntu.com/5711507/

And how I connect to the server:
this->hidden->connection = mir_connect_sync(NULL, __PRETTY_FUNCTION__);

tags: added: clang
Daniel van Vugt (vanvugt) wrote :

Fixed in saucy's clang: 3.2-7ubuntu1

Changed in mir:
status: Triaged → Invalid
Changed in clang (Ubuntu):
status: New → Fix Released
importance: Undecided → Medium
In , S-llvm (s-llvm) wrote :

@Michael Spencer: Is there a good reason why your fix adding the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros in SVN rev 178816 is x86 only?

If you look at http://sourceware.org/ml/libc-ports/2010-11/msg00005.html you'll see that on ARM the use of atomic builtins only happens if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 is set.

This rather implies the need for every architecture target, not just x86, to set __GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248] macros according to whether that target provides CAS ops of varying sizes.

FYI I just got bit on this bug but with an ARM target, hence me finding this report.

Niall

Changed in llvm:
status: Fix Released → Confirmed
In , Weimingz (weimingz) wrote :

I posted the patch for ARM to clang-commits maillist.

In , Weimingz (weimingz) wrote :

For x86, it looks "__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8" only apply to i586 or above. So we can't define the 4 macros for all targets.
I don't know if it is safe to define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4} for all targets.

And I can't find a way to compute the macros for each target either

In , S-llvm (s-llvm) wrote :

(In reply to comment #16)
> For x86, it looks "__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8" only apply to i586 or
> above. So we can't define the 4 macros for all targets.
> I don't know if it is safe to define
> __GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4} for all targets.
>
> And I can't find a way to compute the macros for each target either

ARMv7a has 64 bit atomics, so you can turn on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 for that and above.

Niall

In , Weimingz (weimingz) wrote :

Should it be tied to MaxAtomicPromoteWidth ?
However, the value of MaxAtomicPromoteWidth is Target specific, not CPU specific.

In , Weimingz (weimingz) wrote :

ARM part: patched in r191707

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.