[SRU] Deadlocks in main loop

Bug #887946 reported by Ante Karamatić
This bug affects 1 person
Affects            Status        Importance  Assigned to  Milestone
glib2.0 (Ubuntu)   Invalid       Undecided   Unassigned
  Lucid            Fix Released  Medium      Unassigned
  Maverick         Won't Fix     Low         Unassigned

Bug Description

Some applications deadlock when using glib. One example is lrmadmin, which hangs when connecting to lrmd. Upstream has fixed this bug, and Ubuntu 11.04 and newer are not affected.

Ubuntu 10.04 and 10.10 do not contain the fix. Upstream fix:
  https://mail.gnome.org/archives/commits-list/2010-November/msg01816.html

[Impact]
Remove potential and demonstrable deadlocks in glib code.

[Development Fix]
The bug is fixed in Ubuntu 11.04 and later because these releases ship a newer version of glib2.0 that contains the fix.

[Stable Fix]
The proposed fix is cherry-picked from what was, at the time, upstream's latest version of glib. This part of the code has not changed since then.

[Test Case]
 * Install lucid
 * Install python-software-properties:
   apt-get install python-software-properties
 * add ubuntu-ha-maintainers lucid ppa and update repo:
   apt-add-repository ppa:ubuntu-ha-maintainers/ppa ; apt-get update
 * Install pacemaker:
   apt-get -y install pacemaker
 * Enable corosync (/etc/default/corosync) and start it:
   sed -i -e 's/START=no/START=yes/' /etc/default/corosync
   service corosync start
 * Open two client->server connections:
   lrmadmin -C ; lrmadmin -C
   It deadlocks on second run (it actually never finishes the first run).

 * Kill lrmd and stop corosync:
   killall -9 lrmd ; service corosync stop
 * Install the fix:
   apt-get update && apt-get install libglib2.0-0
 * Start corosync:
   service corosync start
 * Run the test again:
   lrmadmin -C ; lrmadmin -C ; lrmadmin -C ; lrmadmin -C
 * It doesn't deadlock.
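
The manual check above relies on noticing that lrmadmin never returns. A minimal sketch of how the same check could be scripted follows, assuming the coreutils `timeout` command is available; the helper name `run_with_deadline` and the 10-second deadline are illustrative choices, not part of the original test case.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a deadline and report a hang.
# usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    deadline=$1
    shift
    # GNU timeout exits with status 124 when the deadline expires.
    timeout "$deadline" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${deadline}s" >&2
    fi
    return "$status"
}

# Only attempt the real check where lrmadmin is installed.
if command -v lrmadmin >/dev/null 2>&1; then
    for attempt in 1 2 3 4; do
        # The 10-second deadline is an assumption; adjust for slow machines.
        run_with_deadline 10 lrmadmin -C >/dev/null || exit 1
    done
    echo "no deadlock in 4 runs"
fi
```

With the unpatched library the loop would be expected to stop with a HANG message; with the fix installed all four runs should complete.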

[Regression Potential]
Regression potential should be very small. The change is still present, almost unmodified, in upstream code to this day (g_source_unref_internal in glib/gmain.c). The upstream trunk commit is at http://git.gnome.org/browse/glib/commit/?id=b358202856682e5cdefb0b4b8aaed3a45d9a85fa .

Ante Karamatić (ivoks) wrote :

Test case:

Install lucid
add ubuntu-ha-maintainers lucid ppa and update repo:
        apt-add-repository ppa:ubuntu-ha-maintainers/lucid-cluster ; apt-get update
Install pacemaker:
        apt-get -y install pacemaker
Enable corosync (/etc/default/corosync) and start it:
        sed -i -e 's/START=no/START=yes/' /etc/default/corosync ; \
        service corosync start
Open two client->server connections:
        lrmadmin -C ; lrmadmin -C
It deadlocks on second run (it actually never finishes the first run).

Kill lrmd and stop corosync:
        killall -9 lrmd ; service corosync stop
Add ppa:ivoks/glib (contains only the patch from the bzr branch):
        apt-add-repository ppa:ivoks/glib ; apt-get update ; apt-get -y upgrade
Start corosync:
        service corosync start
Run the test again:
        lrmadmin -C ; lrmadmin -C ; lrmadmin -C ; lrmadmin -C
It doesn't deadlock.

Evan Broder (broder) wrote :

I'm unsubscribing ubuntu-sponsors from this bug. Merge proposals requesting a review from ubuntu-sponsors or ubuntu-branches (the default) automatically show up in our queue for sponsorship. Additionally, subscribing ubuntu-sponsors creates an extra entry, so it should only be used when attaching a debdiff to a bug, not when opening a merge proposal.

Changed in glib2.0 (Ubuntu):
status: New → Invalid
Changed in glib2.0 (Ubuntu Maverick):
status: New → Won't Fix
importance: Undecided → Low
Scott Moser (smoser)
Changed in glib2.0 (Ubuntu Lucid):
importance: Undecided → Medium
status: New → Confirmed
Scott Moser (smoser)
description: updated
summary: - Deadlocks in main loop
+ [SRU] Deadlocks in main loop
Ante Karamatić (ivoks)
description: updated
Ante Karamatić (ivoks)
description: updated
description: updated
Ante Karamatić (ivoks)
description: updated
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Ante, or anyone else affected,

Accepted glib2.0 into lucid-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/glib2.0/2.24.1-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in glib2.0 (Ubuntu Lucid):
status: Confirmed → Fix Committed
tags: added: verification-needed
Ubuntu Foundations Team Bug Bot (crichton) wrote : Verification still needed

The fix for this bug has been awaiting testing feedback in the -proposed repository for lucid for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

Ante Karamatić (ivoks) wrote :

I've verified the fix:

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid
# lrmadmin -C ; lrmadmin -C
There are 5 RA classes supported:
ocf
heartbeat
stonith
upstart
lsb
^C
root@virtual:~# lrmadmin -C ; lrmadmin -C
^C
[add proposed archive]
# apt-get update
[...]
# apt-get install libglib2.0-0
[...]
Get:1 http://archive.ubuntu.com/ubuntu/ lucid-proposed/main libglib2.0-0 2.24.1-0ubuntu2 [1123kB]
[...]
# killall -9 lrmd
# /etc/init.d/corosync restart
 * Restarting corosync daemon corosync [ OK ]
# lrmadmin -C ; lrmadmin -C
There are 5 RA classes supported:
ocf
heartbeat
stonith
upstart
lsb
There are 5 RA classes supported:
ocf
heartbeat
stonith
upstart
lsb
# lrmadmin -C ; lrmadmin -C
There are 5 RA classes supported:
ocf
heartbeat
stonith
upstart
lsb
There are 5 RA classes supported:
ocf
heartbeat
stonith
upstart
lsb
#

tags: added: verification-done
removed: verification-needed
Brian Murray (brian-murray) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glib2.0 - 2.24.1-0ubuntu2

---------------
glib2.0 (2.24.1-0ubuntu2) lucid-proposed; urgency=low

  * debian/patches/90-context-unlock.patch (LP: #887946):
    - gmain: move finalization of GSource outside of context lock
 -- Ante Karamatic <email address hidden> Wed, 09 Nov 2011 10:33:37 +0100
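
Given the fixed version named in the changelog entry above, a Lucid machine can be checked for the update with a short script. This is only a sketch: the helper name `has_fix` is hypothetical, and the comparison is meaningful on Lucid only, since later releases carry higher version numbers whether or not they include the patch (Maverick is marked Won't Fix in this report).

```shell
#!/bin/sh
# Version that carries the fix on Lucid, taken from the changelog entry.
FIXED_VERSION="2.24.1-0ubuntu2"

# Hypothetical helper: succeeds if VERSION is at least FIXED_VERSION.
# usage: has_fix VERSION
has_fix() {
    dpkg --compare-versions "$1" ge "$FIXED_VERSION"
}

# Query the installed package only where dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    installed=$(dpkg-query -W -f='${Version}' libglib2.0-0 2>/dev/null || true)
    if [ -z "$installed" ]; then
        echo "libglib2.0-0 is not installed"
    elif has_fix "$installed"; then
        echo "libglib2.0-0 $installed: at or above $FIXED_VERSION"
    else
        echo "libglib2.0-0 $installed: older than $FIXED_VERSION"
    fi
fi
```

Processes that loaded the old library before the upgrade keep using it, which is why the test case kills lrmd and restarts corosync after installing the package.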

Changed in glib2.0 (Ubuntu Lucid):
status: Fix Committed → Fix Released