Pulseaudio applications hang (Totem, GNOME Shell etc.)

Bug #1085342 reported by Doug McMahon on 2012-12-01
This bug affects 23 people
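Affects (status, importance, assigned to):
GLibC: Fix Released, importance Medium
eglibc (Debian): Fix Released, importance Unknown
eglibc (Ubuntu): importance Undecided, assigned to Adam Conrad
eglibc (Ubuntu Precise): importance Undecided, assigned to Adam Conrad
eglibc (Ubuntu Quantal): importance Undecided, assigned to Adam Conrad
totem (Ubuntu): importance Undecided, Unassigned
totem (Ubuntu Precise): importance Undecided, Unassigned
totem (Ubuntu Quantal): importance Undecided, Unassigned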

Bug Description

At least that is the case here.
The best way to see it is to use totem for a while; the window usually becomes unresponsive fairly soon.
Possible test case(s):
Add a couple of music tracks to the playlist, then switch from one to the other & back again with the next & previous buttons (may cause a lockup).

Close Totem (Videos), re-open it & add back the same tracks, and/or try them from the "Movie" menu dropdown.
Usually here they won't play and/or won't be switchable, etc.

Add a couple of tracks to the playlist, then click on track names a few times to start playback/switch tracks.

It doesn't seem to matter whether totem is started normally (no app menu in the Ubuntu session), with the app menu (Exec=env UBUNTU_MENUPROXY=0 totem), or from the sound menu.

With the pulseaudio plugin removed & the gst alsa plugin installed, totem behaves much better, though on very rare occasions it will still become unresponsive.

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: totem 3.6.3-0ubuntu1
ProcVersionSignature: Ubuntu 3.7.0-4.12-generic 3.7.0-rc7
Uname: Linux 3.7.0-4-generic x86_64
ApportVersion: 2.6.3-0ubuntu2
Architecture: amd64
Date: Sat Dec 1 01:49:22 2012
InstallationDate: Installed on 2012-10-23 (38 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017)
MarkForUpload: True
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: totem
UpgradeStatus: No upgrade log present (probably fresh install)

Since commit c5a0802a682dba23f92d47f0f99775aebfbe2539 (Handle EAGAIN from FUTEX_WAIT_REQUEUE_PI), pulseaudio apps often hang.

Example backtrace from gnome-shell and libcanberra:

Thread 1 (Thread 0x7f975cfe8900 (LWP 2367)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:144
#1 0x00007f97558a4aa0 in pa_threaded_mainloop_wait (m=0x21bdf00) at pulse/thread-mainloop.c:206
#2 0x00007f97460aaa0b in pulse_driver_play (c=0x21b5280, id=<optimized out>, proplist=0x704c9c0, cb=<optimized out>, userdata=<optimized out>) at pulse.c:1085
#3 0x00007f975955624e in ca_context_play_full (c=c@entry=0x21b5280, id=id@entry=1, p=0x704c9c0, cb=cb@entry=0x0, userdata=userdata@entry=0x0) at common.c:522
#4 0x00007f97595565cf in ca_context_play (c=0x21b5280, id=1) at common.c:462
...

Most distributions (Fedora, Debian, openSUSE) revert this patch.

Quoting an email from Jeff below
http://sourceware.org/ml/libc-alpha/2012-01/msg00002.html

An FYI, this patch:

commit c5a0802a682dba23f92d47f0f99775aebfbe2539
Author: Andreas Schwab <email address hidden>
Date: Mon Nov 28 13:38:19 2011 +0100

    Handle EAGAIN from FUTEX_WAIT_REQUEUE_PI

Has been reported as causing numerous problems in Fedora & Debian. I
don't think anyone has done any serious analysis of the issue, but the
patch has been pulled from both distributions because of the
instability it's introduced.

https://bugzilla.redhat.com/show_bug.cgi?id=769421
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=651899

It might be worth your time to dig further into the change or pull it
from 2.15 pending a deeper investigation.

This is going to be a tricky issue to resolve.

We don't have a smaller test case than "stuff stops working well", but clearly the range of failures means that the changes in libc are broken.

Have you found a smaller test case that uses just pthreads?

In the meantime I do think the right solution is to revert the patch. We'll see if we can do that for 2.17.
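
A pthreads-only stress test along the lines asked for above might look like the sketch below. It is not known to reproduce the hang. It assumes the affected path is a condition-variable wait on a priority-inheritance mutex (the commit subject mentions FUTEX_WAIT_REQUEUE_PI), so it hands a token back and forth between two threads through one condition variable guarded by a PTHREAD_PRIO_INHERIT mutex, and it uses a timed wait so that a lost wakeup shows up as a short count instead of a hang. The names pingpong, wait_turn, helper and run_pingpong are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stress sketch: two threads pass a token through one
   condition variable protected by a priority-inheritance mutex. */
struct pingpong {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int turn;      /* 0: main thread's turn, 1: helper's turn */
    long rounds;   /* hand-offs each side should complete */
    int stalled;   /* set when a wait times out without progress */
};

/* Wait (lock held) until p->turn == want, giving up after 5 seconds.
   Returns 0 on success, -1 if either side declared a stall. */
static int wait_turn(struct pingpong *p, int want)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 5;
    while (p->turn != want && !p->stalled) {
        int rc = pthread_cond_timedwait(&p->cond, &p->lock, &deadline);
        if (rc == ETIMEDOUT && p->turn != want) {
            p->stalled = 1;                    /* looks like a lost wakeup */
            pthread_cond_broadcast(&p->cond);  /* release the other side */
        }
    }
    return p->stalled ? -1 : 0;
}

static void *helper(void *arg)
{
    struct pingpong *p = arg;
    pthread_mutex_lock(&p->lock);
    for (long i = 0; i < p->rounds; i++) {
        if (wait_turn(p, 1) != 0)
            break;
        p->turn = 0;                           /* hand the token back */
        pthread_cond_signal(&p->cond);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Returns the number of completed round trips, or -1 on setup failure.
   A result smaller than rounds means a wait timed out. */
long run_pingpong(long rounds)
{
    struct pingpong p = { .turn = 0, .rounds = rounds, .stalled = 0 };
    pthread_mutexattr_t attr;
    pthread_t tid;
    long done;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0 ||
        pthread_mutex_init(&p.lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    pthread_cond_init(&p.cond, NULL);
    if (pthread_create(&tid, NULL, helper, &p) != 0) {
        pthread_cond_destroy(&p.cond);
        pthread_mutex_destroy(&p.lock);
        return -1;
    }

    pthread_mutex_lock(&p.lock);
    for (done = 0; done < rounds; done++) {
        p.turn = 1;                            /* give the token to the helper */
        pthread_cond_signal(&p.cond);
        if (wait_turn(&p, 0) != 0)
            break;
    }
    pthread_mutex_unlock(&p.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&p.cond);
    pthread_mutex_destroy(&p.lock);
    return done;
}
```

Calling run_pingpong with a large round count in a loop, and comparing the return value with the requested count, would be one way to look for lost wakeups on a suspect libc build.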

Olivier,
Thanks for opening the report. It's been on my TODO list for a while.

Everyone else,
If someone has a good understanding of the core pthread_cond_t structure, particularly the relationships between the _seq fields, documenting them might be a significant help.

One of the things that was incredibly frustrating trying to debug this issue was the inadequate documentation of key data structures.
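
As a starting point for that kind of documentation, here is a rough single-threaded model of how the three sequence counters appear to relate. It is a simplified reading from memory of the NPTL condition variable code of that era, not a copy of it: the field names follow the __total_seq, __wakeup_seq and __woken_seq members, while cond_model and the model_* functions are hypothetical, and broadcast, timeouts, cancellation and the futex word are left out entirely.

```c
#include <stdbool.h>

/* Toy model only: no locking, no futexes, no broadcast handling. */
struct cond_model {
    unsigned long long total_seq;   /* waiters that have ever started waiting */
    unsigned long long wakeup_seq;  /* wakeups issued by signal so far */
    unsigned long long woken_seq;   /* wakeups already consumed by a waiter */
};

/* A waiter registers itself and remembers wakeup_seq as it was on entry. */
unsigned long long model_wait_begin(struct cond_model *c)
{
    c->total_seq++;
    return c->wakeup_seq;
}

/* signal issues a wakeup only while more waiters have registered than
   wakeups have been issued; otherwise it does nothing. */
bool model_signal(struct cond_model *c)
{
    if (c->total_seq > c->wakeup_seq) {
        c->wakeup_seq++;
        return true;
    }
    return false;
}

/* A waiter that returns from its futex wait may leave only if a wakeup
   was issued after it arrived and that wakeup is not yet consumed.
   Otherwise it goes back to sleep. */
bool model_try_leave(struct cond_model *c, unsigned long long seq_at_entry)
{
    if (c->wakeup_seq == seq_at_entry || c->woken_seq == c->wakeup_seq)
        return false;
    c->woken_seq++;
    return true;
}
```

In this model, two waiters followed by one signal let exactly one waiter leave; the second one finds woken_seq equal to wakeup_seq and keeps waiting. That is the mechanism intended to filter out spurious wakeups.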

Carlos: sorry, I don't have a more specific test case. I just noticed the pulseaudio issue after we upgraded from glibc 2.14 to 2.16 in Mageia, and that all other mainstream distributions revert this commit.

pulseaudio is the current best known way to reproduce the problem.

AUDIODRIVER=pulseaudio play -n -c1 synth whitenoise band -n 100 20 \
        band -n 50 20 gain +25 fade h 1 864000 1

Fails once or twice every ten attempts within the first few seconds. I was able to make it fail regularly with a strategic breakpoint in the low-level pthread code after the locks are released (details fade, but clearly it depends on arranging for the kernel to return EAGAIN to the futex call).
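
For readers unfamiliar with where EAGAIN comes from: a futex wait fails immediately with EAGAIN when the futex word no longer holds the value the caller expected, which is what happens if another thread changes it between the userspace check and the system call. The sketch below shows this with a plain FUTEX_WAIT on Linux. It does not exercise FUTEX_WAIT_REQUEUE_PI, and the helper name futex_wait_once is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue one FUTEX_WAIT on *word, expecting it to contain `expected`.
   Returns 0 if the wait completed normally, otherwise the errno value
   reported by the kernel. Linux only. */
int futex_wait_once(uint32_t *word, uint32_t expected)
{
    long rc = syscall(SYS_futex, word, FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
                      expected, NULL, NULL, 0);
    return rc == -1 ? errno : 0;
}
```

With a word that holds 1 and an expected value of 0, futex_wait_once returns EAGAIN without blocking. A breakpoint placed between the read of the futex word and the system call widens the window in which another thread can change the word, which fits the description of how the failure was made regular.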

I would hesitate to try fixing this without also looking at and possibly fixing bug #12875. The whole sequence number wakeup thing for condition variables is fundamentally broken in NPTL and needs to be fixed. Basically the issue is that the current code is over-engineered to avoid spurious wakeups, but in the process it suppresses some non-spurious wakeups...

(In reply to comment #6)
> I would hesitate to try fixing this without also looking at and possibly fixing
> bug #12875. The whole sequence number wakeup thing for condition variables is
> fundamentally broken in NPTL and needs to be fixed.

How so? You seem to see issues that go beyond bug #12875 (which I don't see as being a bug currently). If so, please link to them here.

> Basically the issue is that
> the current code is over-engineered to avoid spurious wakeups, but in the
> process it suppresses some non-spurious wakeups...

I'm not aware of any lost wake-ups for non-PI cond vars. If you have provided an alternative implementation proposal in the past, could you link to it when making such comments please? If you haven't but have a proposal now, please link to it here or post to libc-alpha.

Doug McMahon (mc3man) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in totem (Ubuntu):
status: New → Confirmed
Doug McMahon (mc3man) wrote :

If I build & install totem-3.4.3 & use it in 13.04 with gst-0.10, the same thing happens if gstreamer0.10-pulseaudio is installed: it constantly becomes unresponsive.

Roman Yepishev (rye) wrote :

Backtrace (with some bits missing) taken while totem is hanging. It actually looks like a pulseaudio failure, or the gstreamer pulseaudio plugin does something out of order while seeking/playing that makes pulseaudio hang.

#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:144
#1 0x00007fbfcc70f180 in pa_threaded_mainloop_wait (m=0x7fbfc8024740) at pulse/thread-mainloop.c:206
#2 0x00007fbfcc951504 in gst_pulsering_set_corked (pbuf=pbuf@entry=0x7fbfc8028070, corked=corked@entry=1, wait=wait@entry=1) at pulsesink.c:1053
#3 0x00007fbfcc952d3a in gst_pulseringbuffer_pause (buf=0x7fbfc8028070) at pulsesink.c:1172
#4 0x00007fbff717a65e in ?? () from /usr/lib/x86_64-linux-gnu/libgstaudio-1.0.so.0
#5 0x00007fbff717d700 in gst_audio_ring_buffer_pause () from /usr/lib/x86_64-linux-gnu/libgstaudio-1.0.so.0
#6 0x00007fbff71967c6 in ?? () from /usr/lib/x86_64-linux-gnu/libgstaudio-1.0.so.0
#7 0x00007fbfcc9544f8 in gst_pulsesink_change_state (element=0x7fbfc8025a10, transition=GST_STATE_CHANGE_PLAYING_TO_PAUSED) at pulsesink.c:2916
#8 0x00007fbff6c7faec in gst_element_change_state (element=element@entry=0x7fbfc8025a10, transition=<optimized out>) at gstelement.c:2594
#9 0x00007fbff6c804c1 in gst_element_set_state_func (element=0x7fbfc8025a10, state=GST_STATE_PAUSED) at gstelement.c:2550
#10 0x00007fbff6c682ac in gst_bin_element_set_state (next=GST_STATE_PAUSED, current=GST_STATE_PLAYING, start_time=1361340000, base_time=6136832000, element=0x7fbfc8025a10, bin=0x7fbffb36f700)
    at gstbin.c:2308

Roman Yepishev (rye) on 2013-01-15
summary: - Totem window constantly becomes unresponsive with
- gstreamer1.0-pulseaudio installed
+ Pulseaudio applications are hanging (Totem, GNOME Shell etc.)
summary: - Pulseaudio applications are hanging (Totem, GNOME Shell etc.)
+ Pulseaudio applications hang (Totem, GNOME Shell etc.)
Changed in glibc:
importance: Unknown → Medium
status: Unknown → Fix Released
Adam Conrad (adconrad) on 2013-01-16
affects: glibc (Debian) → eglibc (Debian)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Adam Conrad (adconrad) on 2013-01-16
affects: glibc (Ubuntu) → eglibc (Ubuntu)
Changed in eglibc (Ubuntu):
assignee: nobody → Adam Conrad (adconrad)
status: New → Fix Committed
Changed in eglibc (Ubuntu):
status: New → Confirmed
Changed in eglibc (Debian):
status: Unknown → Fix Committed
Doug McMahon (mc3man) wrote :

Possibly this other bug I have, specific to gnome-shell & closing any open window, is related to this & a dupe?
Bug 1094571

Doug McMahon (mc3man) wrote :

I am for the moment marking the above-mentioned bug as a dupe of this one, based on a rebuild of the current eglibc source
(reverted the Debian local-pthread_cond_wait patches, applied the upstream diff, etc.)
With the rebuilt libc6 package installed I don't see any hangs in totem, nor system hangs when closing windows in gnome-shell; after reverting back to the current package(s) the hangs come right back.

Iain Lane (laney) on 2013-01-24
Changed in totem (Ubuntu):
status: Confirmed → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eglibc - 2.17-0ubuntu1

---------------
eglibc (2.17-0ubuntu1) raring; urgency=low

  * Merge with Debian, bringing in a new upstream and many small fixes:
    - patches/any/cvs-malloc-deadlock.diff: Dropped, merged upstream.
    - patches/ubuntu/lddebug-scopes.diff: Rebase for upstream changes.
    - patches/ubuntu/local-CVE-2012-3406.diff: Rebased against upstream.
    - patches/ubuntu/no-asm-mtune-i686.diff: Fixed in recent binutils.
  * This upstream merge fixes a nasty hang in pulseaudio (LP: #1085342)
  * Bump MIN_KERNEL_SUPPORTED to 2.6.32 on ARM, now that we no longer
    have to support shonky 2.6.31 kernels on imx51 babbage builders.
  * Drop patches/ubuntu/local-disable-nscd-host-caching.diff, as these
    issues were apparently resolved upstream a while ago (LP: #613662)
  * Fix the compiled-in bug URL to point to launchpad.net, not Debian.

eglibc (2.17-0experimental0) experimental; urgency=low

  [ Adam Conrad ]
  * New upstream release: version 2.17, orig tarball built at SVN r22169:
    - Restricts ld.so self-loading checks to normal mode (LP: #1088677)
    - debian/rules.d/tarball.mk: ports is no longer external to libc.
    - debian/*: Update all 2.16 occurences to 2.17 for upgrades/deps.
    - patches/localedata/supported.diff: Rebased against new upstream.
    - patches/localedata/locale-ia.diff: Dropped, merged upstream.
    - patches/localedata/submitted-es_MX-decimal_point.diff: Rebased.
    - patches/amd64/local-pthread_cond_wait.diff: Dropped, fixed upstream.
    - patches/i386/local-pthread_cond_wait.diff: Dropped (closes: #694962)
    - patches/arm64/cvs-ldconfig-cache-abi.diff: Dropped, merged upstream.
    - patches/arm64/submitted-aarch64-support.diff: Merged upstream.
    - patches/arm/cvs-ldconfig-cache-abi.diff: Dropped, merged upstream.
    - patches/arm/local-atomic.diff: Dropped, fixed differently upstream.
    - patches/arm/unsubmitted-armhf-linker.diff: Dropped, not needed.
    - patches/arm/unsubmitted-ldconfig-cache-abi.diff: Rewritten slightly.
    - patches/hppa/submitted-nptl-carlos.diff: Rebased against upstream.
    - patches/hppa/local-stack-grows-up.diff: Rebased against upstream.
    - patches/hurd-i386/local-enable-ldconfig.diff: dl-cache.c dropped.
    - patches/hurd-i386/tg-tls.diff: Rebase and drop powerpc support.
    - patches/hurd-i386/tg-regenerate_errno.h.diff: Merged upstream.
    - patches/hurd-i386/tg-extern_inline.diff: Drop powerpc support.
    - patches/hurd-i386/tg-elfosabi_gnu.diff: Drop powerpc support.
    - patches/hurd-i386/tg-grantpt.diff: Rebased against new upstream.
    - patches/hurd-i386/unsubmitted-pthread_posix-option.diff: Rebased.
    - patches/hurd-i386/submitted-getgroups.diff: Dropped, merged upstream.
    - patches/hurd-i386/submitted-getlogin_r.diff: Dropped, fixed upstream.
    - patches/hurd-i386/submitted-ptsname.diff: Dropped, merged upstream.
    - patches/hurd-i386/submitted-sendto.diff: Dropped, fixed upstream.
    - patches/hurd-i386/cvs-add-missing-includes.diff: Merged upstream.
    - patches/hurd-i386/cvs-mach-check-local-headers.sh.diff: Merged.
    - patches/hurd-i386/cvs-lremovexattr.diff: Dropped, merged upstrea...

Changed in eglibc (Ubuntu):
status: Fix Committed → Fix Released
madbiologist (me-again) wrote :

I'm pretty sure I have been getting a lot of this on Quantal and Precise. I can also confirm that this statement from the bug description applies on Quantal and Precise: "With the pulseaudio plugin removed & the gst alsa plugin installed totem behaves much better though on very rare occasion will become unresponsive."

Any chance of SRUs for Quantal and Precise?

Changed in eglibc (Debian):
status: Fix Committed → Fix Released
Adam Conrad (adconrad) wrote :

A friend of mine has been seeing this on precise, which means I now have a test case for testing a backport of the fix. I'll prep a precise upload after the current SRU is through.

Changed in totem (Ubuntu Precise):
status: New → Invalid
Changed in totem (Ubuntu Quantal):
status: New → Invalid
Changed in eglibc (Ubuntu Precise):
assignee: nobody → Adam Conrad (adconrad)
Changed in eglibc (Ubuntu Quantal):
assignee: nobody → Adam Conrad (adconrad)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in eglibc (Ubuntu Precise):
status: New → Confirmed
Changed in eglibc (Ubuntu Quantal):
status: New → Confirmed
Rolf Leggewie (r0lf) wrote :

Quantal has reached the end of its life and is no longer receiving any updates. Marking the Quantal task for this ticket as "Won't Fix".

Changed in eglibc (Ubuntu Quantal):
status: Confirmed → Won't Fix