GNUbg crashes shortly after starting game

Bug #1393105 reported by Michael Petch on 2014-11-16
This bug affects 2 people
Affects          Importance  Assigned to
gnubg (Ubuntu)   High        Unassigned
Trusty           High        Unassigned
Utopic           High        Unassigned

Bug Description

[Preamble]
I am one of the primary upstream maintainers of the GNUbg project.

[Impact]
When a user starts a new match GNUbg will crash with this error (on the console):

    Attempt to unlock mutex that was not locked
    Aborted (core dumped)

This issue is a regression that appeared in Ubuntu 14.10; it did not occur in Ubuntu 14.04. It should be a high priority for this application because it renders the software completely unusable.

[Test Case]
To reproduce, launch GNUbg and click the "New" icon to start a new match. Click "Ok" on the match parameters screen (no need to modify the settings). The match should start. You may need to hit Control-R to roll the dice (alternatively, click the middle of the board on the right side). Shortly after doing so, the application will close down.

[Regression Potential]
The patch attached to this report should be reasonably low risk and should not impact other software on the user's system. The most likely regression, if the patch introduces any new problem, is that the analysis output may differ slightly.

Upstream official software releases (1.03.000 and 1.04.000) have included the fixes in this patch for a few months without any reported issues, so the upstream maintainers consider them well tested.

[Bug cause]
Newer versions of GLib (>= 2.41.2) have extra assertions that can reveal hidden bugs. These GLib changes exposed problems in how we were using mutexes (creation/usage/cleanup). Older versions of GLib lack these assertions, so the problem went undetected in previous releases, although the bug is still present there and may invoke undefined behaviour.
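
For illustration only (this is not code from GNUbg or from the attached patch): the class of bug described above, unlocking a mutex that was never locked, can be sketched with a POSIX error-checking mutex. This is an analogy that uses pthreads rather than GLib, and the function names are hypothetical. A default mutex leaves this misuse undefined and may appear to work, while an error-checking mutex reports it, much as the newer GLib assertions do.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: create an error-checking mutex.
 * Returns 0 on success or a pthread error code. */
static int make_errorcheck_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* The buggy pattern: unlock with no matching lock.
 * Returns the error code reported by pthread_mutex_unlock(). */
int unlock_without_lock(void)
{
    pthread_mutex_t mutex;
    int rc = make_errorcheck_mutex(&mutex);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_unlock(&mutex);   /* misuse: mutex is not locked */
    pthread_mutex_destroy(&mutex);
    return rc;
}

/* The correct pattern: every unlock is paired with a prior lock.
 * Returns 0 when both calls succeed. */
int balanced_lock_unlock(void)
{
    pthread_mutex_t mutex;
    int rc = make_errorcheck_mutex(&mutex);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&mutex);
    if (rc == 0)
        rc = pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return rc;
}
```

With an error-checking mutex, `unlock_without_lock()` is expected to return `EPERM`, while `balanced_lock_unlock()` returns 0. Build with `-pthread`.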

[Bug history]
The set of changes in this patch was applied to the upstream version of GNUbg in July 2014, based upon a bug report from a user who was using a PPA version of GLib in Ubuntu 14.04. That bug and further information about the nature of this issue can be found here: https://bugs.launchpad.net/ubuntu/+source/gnubg/+bug/1346567. Russ Allbery, the Debian maintainer, closed that bug as fixed by landing it in Vivid Vervet (15.04) with the release of version 1.03.001-1.

There was a user report of this bug shortly after 14.10 was released, but for some inexplicable reason it was deleted from Launchpad.

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: gnubg 1.02.000-2
ProcVersionSignature: Ubuntu 3.16.0-24.32-generic 3.16.4
Uname: Linux 3.16.0-24-generic x86_64
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
CurrentDesktop: Unity
Date: Sat Nov 15 17:08:35 2014
InstallationDate: Installed on 2014-10-24 (22 days ago)
InstallationMedia: Ubuntu 14.10 "Utopic Unicorn" - Release amd64 (20141022.1)
ProcEnviron:
 LANGUAGE=en_CA:en
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
SourcePackage: gnubg
UpgradeStatus: No upgrade log present (probably fresh install)

Michael Petch (mpetch) wrote :

I am the primary upstream maintainer for GNUbg. Attached is a patch to fix this bug. The patch is a small subset of the changes that went into GNUbg 1.03.000 to deal with this bug and two other fatal problems. These changes address improper mutex creation/usage/cleanup as well as a couple of instability fixes, and were brought about by changes in newer versions of GLib.

These fixes are also available if one were to upgrade Ubuntu to use GNUbg 1.03.000 or 1.04.000.

Michael Petch (mpetch) on 2014-11-16
description: updated

The attachment "patch to fix glib bugs" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Michael Petch (mpetch) on 2014-11-16
description: updated
Adam Conrad (adconrad) wrote :

Based on the description, looks like this should be SRUed to both trusty and utopic, despite utopic being the only one with a glib that exposes the bug.

Changed in gnubg (Ubuntu):
status: New → Fix Released
Michael Petch (mpetch) wrote :

Updated patch (debdiff) to resolve this bug. Updated changelog entry. Original patch didn't include a bug fix to a global variable (shared among threads) that was being optimized away leading to instabilities in the software.
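
For illustration only (not taken from the debdiff): when a plain global is read in a loop by one thread and written by another, the compiler may assume it never changes and keep the value in a register, so the update is never seen. The sketch below shows one way to avoid that using C11 atomics. The thread does not say how the GNUbg fix was actually implemented, and the names `stop_requested`, `worker` and `run_worker_until_stopped` are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag shared between threads. Declaring it atomic forces
 * every atomic_load() below to observe stores made by other threads. */
static atomic_int stop_requested;

/* Worker thread: spin until another thread sets the flag. */
static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested)) {
        /* real work would go here */
    }
    return NULL;
}

/* Start the worker, request a stop, and wait for it to exit.
 * Returns 0 on success and -1 if a pthread call fails. */
int run_worker_until_stopped(void)
{
    pthread_t tid;

    atomic_store(&stop_requested, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    atomic_store(&stop_requested, 1);    /* visible to the worker */

    if (pthread_join(tid, NULL) != 0)
        return -1;
    return 0;
}
```

If `stop_requested` were a plain `int`, this would be a data race, and an optimizing compiler could legally turn the worker loop into an infinite one. Build with `-pthread`.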

Adam Conrad (adconrad) wrote :

Given the following facts:

1) The package is at the same version in both trusty and utopic
2) The bug can only be reproduced on utopic, but obviously exists in both

I propose the following unorthodox SRU procedure:

1) The SRU is uploaded to trusty-proposed, so it builds with trusty level deps
2) The SRU is copied from trusty-proposed to utopic-proposed
3) The SRU is validated in utopic, where the bug can be properly reproduced and verified fixed
4) The SRU is released to both trusty-updates and utopic-updates

Since we'll be validating the identical binary, this should prevent us from having to validate on trusty, except to smoketest that it runs and doesn't appear to crash.

Michael Petch (mpetch) wrote :

Yes Adam, I concur that this issue is on both Trusty and Utopic. Thanks for amending the bug. I happened to update my original post while you were commenting, to say that on versions with an older GLib the issue is present and still invokes undefined behaviour. So Trusty, in my opinion, is potentially unstable, albeit with no known crash.

This issue with the mutexes hasn't existed for the entire lifetime of our project. This issue was introduced in May 2013 in preparation for our official v1.00.000 release. In particular this set of changes:

http://cvs.savannah.gnu.org/viewvc/gnubg/multithread.c?root=gnubg&r1=1.74&r2=1.75&sortby=date

The change was to accommodate some structural changes and avoid deprecated functionality in GLib >= 2.32.0. Unfortunately this issue slipped through the cracks. As a result, any official Ubuntu release that used a version of GNUbg < 1.0 (which is any Ubuntu release before Trusty) will not experience this problem. That leaves only the GNUbg releases used in Trusty and Utopic as candidates for this change.

I don't have commit privileges so need sponsorship on this bug. My understanding of the process is a bit lacking so you will have to bear with me.

Brian Murray (brian-murray) wrote :

I've uploaded this to the Trusty queue for review by the SRU team.

Changed in gnubg (Ubuntu Trusty):
status: New → In Progress
assignee: nobody → Brian Murray (brian-murray)
importance: Undecided → High
Changed in gnubg (Ubuntu Utopic):
status: New → Triaged
importance: Undecided → High

Hello Michael, or anyone else affected,

Accepted gnubg into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/gnubg/1.02.000-2ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in gnubg (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Chris J Arges (arges) wrote :

Was Utopic going to be fixed as well?
Thanks

Michael Petch (mpetch) wrote :

I originally identified the problem in Utopic, so I would hope it would be fixed there too. At least that was my wish.

Michael Petch (mpetch) wrote :

As an addendum, it manifests itself differently in Trusty. The application is usable, although it could in theory be unstable with multiple threads under the right conditions. It doesn't crash outright.

In Utopic the problem is fatal as it will crash right as you start playing which in effect renders the product unusable on that platform.

Brian Murray (brian-murray) wrote :

Once it builds in trusty-proposed the plan is to copy the package to utopic-proposed.

Michael Petch (mpetch) wrote :

I spent some time today testing 1.02.000-2ubuntu1 (amd64), playing against the computer and doing rollouts and analysis. The latter two are processor and thread intensive. I didn't observe any problems that would be related to this fix.

I happened to copy the gnubg and gnubg-data files to my utopic system. When I ran it and started a new match it crashed and I saw this:

    GNU Backgammon 1.02.000 Feb 11 2015
    Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004 by Gary Wong.
    Copyright (C) 2013 by Gary Wong and the AUTHORS; for details type `show version'.
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show warranty'.
    This is free software, and you are welcome to redistribute it under certain conditions; type `show copying' for details.
    (No game) set gui showids on
    (No game) save settings
    (No game) new match
    Attempt to unlock mutex that was not locked
    Aborted (core dumped)

The date is clearly correct (Feb 11 2015) but it failed. I then downloaded the source tarball, the debian.tar.xz and the dsc file, and built the package from source with:

    dpkg-source -x gnubg_1.02.000-2ubuntu1.dsc

and then

    dpkg-buildpackage -rfakeroot -b

Running this release failed as well. I'm going to have to revisit the patch and see if something was missed (in the patch) or if there is some other environment related problem (library change/update etc). At the time (a few months ago) it did work as expected and I had thoroughly tested it.

Michael Petch (mpetch) wrote :

If I manually apply the patch to the original source and do a build, it works. I noticed when I did the Ubuntu build from source that the patch didn't seem to get properly applied, and the code it built was essentially the unmodified (unpatched) code. The only thing I saw wrong was that the patch had some fuzz in one of the hunks. Not sure if that is related to the issue or not. If I get a chance tomorrow I'll regenerate a new patch.

Michael Petch (mpetch) on 2015-02-12
tags: added: verification-failed
removed: verification-needed
benjamin (jesuisbenjamin) wrote :

I experience the bug on Elementary OS Freya. How can I fix this?

Changed in gnubg (Ubuntu Trusty):
status: Fix Committed → Confirmed
assignee: Brian Murray (brian-murray) → nobody
Changed in gnubg (Ubuntu Utopic):
status: Triaged → Won't Fix
Changed in gnubg (Ubuntu):
importance: Undecided → High
Changed in gnubg (Ubuntu Trusty):
status: Confirmed → Triaged
Marc Deslauriers (mdeslaur) wrote :

Unsubscribing ubuntu-sponsors as there is no actionable item.

Once an updated debdiff is posted, please subscribe ubuntu-sponsors again. Thanks.

The version of gnubg in the proposed pocket of Trusty that was purported to fix this bug report has been removed because the bugs that were to be fixed by the upload were not verified in a timely (105 days) fashion.
