[pulsesink] abort at pthread_mutex_unlock(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:118, function pa_mutex_unlock() with libc 2.21

Bug #1436162 reported by Ricardo Salveti on 2015-03-25
50
This bug affects 9 people
Affects Status Importance Assigned to Milestone
glibc (Ubuntu)
Undecided
Adam Conrad
gst-plugins-good1.0 (Ubuntu)
Undecided
Unassigned
pulseaudio (Ubuntu)
Undecided
Unassigned

Bug Description

current build number: 146
device name: m75
channel: ubuntu-touch/vivid-proposed
last update: 2015-03-24 13:57:03
version version: 146

After libc 2.21 landed, pulsesink fails constantly when pausing or seeking the active pipeline.

Abort message:
Assertion 'pthread_mutex_unlock(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:118, function pa_mutex_unlock(). Aborting.

This is basically happens when it tries to unlock a mutex that is already unlocked.

This issue doesn't happen with libc 2.19.

To reproduce the issue:
GST_DEBUG=*pulse*:5 gst-launch-1.0 playbin uri=file:///tmp/foobar.mp3

Then when playing, just hit control+c, which will then pause the pipeline, causing the crash. The crash happens with most of the times you pause or seek the pipeline.

affects: libc (Ubuntu) → glibc (Ubuntu)
description: updated
Ricardo Salveti (rsalveti) wrote :
Download full text (4.9 KiB)

The mutex in question is the one used by pulse's mainloop, which gets used by pa_threaded_mainloop_lock, pa_threaded_mainloop_unlock and pa_threaded_mainloop_wait.

As the pulseaudio calls are all asynchronous, gst (pulsesink element) calls pa_threaded_mainloop_wait right after calling pulseaudio, so it can block the thread until pulseaudio returns via callback (which then calls pa_threaded_mainloop_signal, unblocking the thread).

The mutex is also recursive, allowing many lock and unlock calls to happen in parallel.

Follows the log when I'm able to reproduce the crash (with the mutex recursive count):
Interrupt: Stopping pipeline ...
Execution ended after 0:00:06.840520615
Setting pipeline to PAUSED ...
(thread id: 3069599744) pa_threaded_mainloop_lock -> mutex: 0xb5250e00
(thread id: 3069599744) pa_mutex_lock -> m: 0xb5250e00 (count prev 0)
(thread id: 3069599744) pa_mutex_lock -> m: 0xb5250e00 (count new 1)
(thread id: 3069599744) pa_threaded_mainloop_unlock -> mutex: 0xb5250e00
(thread id: 3069599744) pa_mutex_unlock -> m: 0xb5250e00 (count prev 1)
(thread id: 3069599744) pa_mutex_unlock -> m: 0xb5250e00 (count new 0)
0:00:07.586470308 12439 0x15d1dc0 DEBUG pulse pulsesink.c:1272:gst_pulseringbuffer_pause:<pulsesink1> Pausing and corking, before calling mainloop lock
(thread id: 3069599744) pa_threaded_mainloop_lock -> mutex: 0xb5250e00
(thread id: 3069599744) pa_mutex_lock -> m: 0xb5250e00 (count prev 0)
(thread id: 3069599744) pa_mutex_lock -> m: 0xb5250e00 (count new 1)
0:00:07.586854770 12439 0x15d1dc0 DEBUG pulse pulsesink.c:1135:gst_pulsering_set_corked:<pulsesink1> Setting corked state to 1
0:00:07.587025308 12439 0x15d1dc0 DEBUG pulse pulsesink.c:1142:gst_pulsering_set_corked:<pulsesink1> Calling pa_threaded_mainloop_wait (PA_OPERATION_RUNNING)
(thread id: 3069599744) pa_threaded_mainloop_wait -> cond: 0xb52517b8, mutex: 0xb5250e00
(thread id: 3069599744) pa_cond_wait -> m: 0xb5250e00 (count before wait 1)
(thread id: 3018630240) poll_func lock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count prev 0)
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count new 1)
(thread id: 3018630240) poll_func unlock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5250e00 (count prev 1)
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5250e00 (count new 0)
(thread id: 3018630240) poll_func lock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count prev 0)
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count new 1)
(thread id: 3018630240) poll_func unlock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5250e00 (count prev 1)
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5250e00 (count new 0)
(thread id: 3018630240) poll_func lock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count prev 0)
(thread id: 3018630240) pa_mutex_lock -> m: 0xb5250e00 (count new 1)
(thread id: 3018630240) poll_func unlock -> mutex: 0xb5250e00
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5250e00 (count prev 1)
(thread id: 3018630240) pa_mutex_unlock -> m: 0xb5...

Read more...

Ricardo Salveti (rsalveti) wrote :

It seems the issue only happens when at least 2 threads are waiting for the wait condition (that gets triggered by pa_threaded_mainloop_signal). When I'm unable to make it to crash, it always shows only one thread waiting the signal.

So my current theory is that there might be issues when calling pthread_cond_wait on more than one thread at the same time (maybe when increasing/decreasing the mutex count).

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in glibc (Ubuntu):
status: New → Confirmed
Changed in gst-plugins-good1.0 (Ubuntu):
status: New → Confirmed
Changed in pulseaudio (Ubuntu):
status: New → Confirmed
Ricardo Salveti (rsalveti) wrote :

So this issue is a combination of the following patches:
https://sourceware.org/git/?p=glibc.git;a=commit;h=47c5adebd2c864a098c3af66e61e1147dc3cf0b4
https://sourceware.org/git/?p=glibc.git;a=commit;h=03d41216fe09fc71a2750946af8fe2892bf4feb5

Basically the main problem is that futex_atomic_cmpxchg_inatomic got disabled, even though we have that well supported in our kernels.

I just did another build (in canonical-arm-dev) including the patch bellow, and with that I'm unable to reproduce this bug:
Index: glibc-2.21/sysdeps/unix/sysv/linux/arm/kernel-features.h
===================================================================
--- glibc-2.21.orig/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ glibc-2.21/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -37,8 +37,10 @@
 /* The ARM kernel before 3.14.3 may or may not support
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration. */
+/*
 #if __LINUX_KERNEL_VERSION < 0x030E03
 # undef __ASSUME_FUTEX_LOCK_PI
 # undef __ASSUME_REQUEUE_PI
 # undef __ASSUME_SET_ROBUST_LIST
 #endif
+*/

Changed in gst-plugins-good1.0 (Ubuntu):
status: Confirmed → Invalid
Changed in pulseaudio (Ubuntu):
status: Confirmed → Invalid
Ricardo Salveti (rsalveti) wrote :

SMP support for futex_atomic_cmpxchg_inatomic was added in 2.6.38, but it required !CPU_USE_DOMAINS.

Adam Conrad (adconrad) on 2015-03-26
Changed in glibc (Ubuntu):
status: Confirmed → Fix Committed
assignee: nobody → Adam Conrad (adconrad)
tags: added: lt-blocker lt-category-visible
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.21-0ubuntu4

---------------
glibc (2.21-0ubuntu4) vivid; urgency=medium

  * debian/patches/arm/local-arm-futex.diff: Lie about the minimum
    kernel support for futex_atomic_cmpxchg_inatomic to restore the
    previous state and make pulsesink happier on ARM (LP: #1436162)
  * debian/debhelper.in/libc.preinst: Wipe ld.so.cache on upgrades.
 -- Adam Conrad <email address hidden> Wed, 25 Mar 2015 14:31:51 -0600

Changed in glibc (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers