Cache is not safe for concurrent use (by processes or threads)

Bug #459418 reported by James Westby
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
lazr.restfulclient
Fix Released
High
Jonathan Lange
lazr.restfulclient (Ubuntu)
Fix Released
High
Unassigned
Precise
Fix Released
High
Unassigned
Quantal
Fix Released
High
Unassigned

Bug Description

[Impact]
Concurrency issues in lazr.restfulclient are affecting some processes used by various teams; including Foundations, PS integration, etc.
The most affected users are those who make heavy use of scheduled tasks around launchpadlib (see bug 513116); where different calls to various parts of launchpadlib (or the bzr launchpad plugin) will concurrently try to access cache, step on each others' cache files and mangle the files causing errors for further calls using lazr.restfulclient / launchpadlib.

[Test case]
1) Update package.
2) Run the following command:
for i in `seq 1 50` ; do python -c 'from launchpadlib.launchpad import Launchpad; lp = Launchpad.login_anonymously("test-cache", "production"); lp.distributions["ubuntu"]' & ; done

After a few runs, on an affected system the spawned processes will write out backtraces due to mangled files, such as the following error:
cElementTree.ParseError: not well-formed (invalid token): line 40599, column 18

On a patched system, all spawned processes will terminate successfully.

[Regression Potential]
Due to the changes in caching method, it may be possible that lazr.restfulclient fails to write cache files because of path size (which would likely increase due to the change), or otherwise fails to properly cache data increasing delays in retrieving information that may have already been retrieved.

----

httplib2 says:

class FileCache(object):
    """Uses a local directory as a store for cached files.
    Not really safe to use if multiple threads or processes are going to.
    be running on the same cache.
    """

launchpadlib sub-classes that, but doesn't change that fact.

This means that with multiple processes or threads you will
suddenly get strange error messages (bug 404204).

This is made worse by the fact that creating an object uses a
global cache, and login_with does this with no way to override it.

The best way to fix this is to just fix the cache implementation
rather than everyone learning to use a private cache and
fixing launchpadlib to make this possible with the convenience
methods.

This is a pre-requisite for fixing bug 513116, which is about the fact that launchpadlib is not thread-safe.

Related branches

Gary Poster (gary)
Changed in launchpadlib:
status: New → Triaged
importance: Undecided → High
Gary Poster (gary)
description: updated
Changed in launchpad-foundations:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Martin Pool (mbp) wrote :

Does the cache actually save significant time at the moment? When I last looked, it would never avoid round trips.

Revision history for this message
Leonard Richardson (leonardr) wrote :

We only serve the Cache-Control header for the service root (WADL and JSON), so those are the only resources for which we currently avoid round trips. But here are three reasons to like the cache anyway:

1. Those service root representations are pretty big fish to have caught. They're requested every time a Launchpad instance is created, and the WADL representation in particular is rather large. Taking launchpadlib down to one request per week for these resources was a big performance win.

2. The cache is where we store the ETags that let us make conditional GET and PATCH requests, even for resources who don't have Cache-Control headers.

3. At any time we can decide to be less conservative, and start sending Cache-Control for other documents. Then the client-side cache will be more useful. We just haven't made that decision yet.

Steve Magoun (smagoun)
tags: added: oem-services
summary: - Cache is broken with multiple processes
+ Cache is not safe for concurrent use (by processes or threads)
Jonathan Lange (jml)
Changed in launchpadlib:
assignee: nobody → Jonathan Lange (jml)
status: Triaged → In Progress
Jonathan Lange (jml)
affects: launchpadlib → lazr.restfulclient
Jonathan Lange (jml)
Changed in lazr.restfulclient:
status: In Progress → Fix Committed
Revision history for this message
Colin Watson (cjwatson) wrote :

Is there any chance this could be released at some point? We keep running into this on precise-based infrastructure, so it'd be great to have this fix in quantal and SRUed into precise.

Changed in lazr.restfulclient:
milestone: none → 0.14.0
milestone: 0.14.0 → 0.13.0
Changed in lazr.restfulclient:
milestone: 0.13.0 → 0.12.1
status: Fix Committed → Fix Released
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

Let's SRU this to Quantal and Precise; they're both 0.12.0, so just applying the particular diff should be good.

Changed in lazr.restfulclient (Ubuntu):
status: New → Fix Released
importance: Undecided → High
Changed in lazr.restfulclient (Ubuntu Precise):
importance: Undecided → High
Changed in lazr.restfulclient (Ubuntu Quantal):
importance: Undecided → High
Changed in lazr.restfulclient (Ubuntu Precise):
status: New → Confirmed
Changed in lazr.restfulclient (Ubuntu Quantal):
status: New → Confirmed
description: updated
description: updated
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

Uploaded SRU package to quantal-proposed...

Changed in lazr.restfulclient (Ubuntu Precise):
status: Confirmed → In Progress
William Grant (wgrant)
no longer affects: launchpad
Revision history for this message
Dave Walker (davewalker) wrote : Please test proposed package

Hello James, or anyone else affected,

Accepted lazr.restfulclient into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lazr.restfulclient/0.12.0-2ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in lazr.restfulclient (Ubuntu Quantal):
status: Confirmed → Fix Committed
tags: added: verification-needed
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello James, or anyone else affected,

Accepted lazr.restfulclient into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lazr.restfulclient/0.12.0-1ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in lazr.restfulclient (Ubuntu Precise):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote : [lazr.restfulclient/precise] verification still needed

The fix for this bug has been awaiting testing feedback in the -proposed repository for precise for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

tags: added: removal-candidate
Revision history for this message
Brian Murray (brian-murray) wrote :

I tried creating this bug on Precise using the test case provided and was unsuccessful.

Revision history for this message
Brian Murray (brian-murray) wrote :

I was able to recreate this on Quantal, and verify the fix, but I had to create a bash script with the test case and run it 2 or 3 times in a row.

tags: added: verification-done-quantal
removed: removal-candidate
Revision history for this message
Brian Murray (brian-murray) wrote :

I also got the test case working and the verification done in Precise.

tags: added: verification-done-precise
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lazr.restfulclient - 0.12.0-1ubuntu1.1

---------------
lazr.restfulclient (0.12.0-1ubuntu1.1) precise-proposed; urgency=low

  * debian/patches/concurrency_fixes_rev_122.patch: backport rev 122: Creates
    AtomicFileCache as a parent class for MultipleRepresentationCache to
    handle concurrent use issues. (LP: #459418)
 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 07 Jun 2013 08:48:59 -0400

Changed in lazr.restfulclient (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Scott Kitterman (kitterman) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regresssions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lazr.restfulclient - 0.12.0-2ubuntu0.1

---------------
lazr.restfulclient (0.12.0-2ubuntu0.1) quantal-proposed; urgency=low

  * debian/patches/concurrency_fixes_rev_122.patch: backport rev 122: Creates
    AtomicFileCache as a parent class for MultipleRepresentationCache to
    handle concurrent use issues. (LP: #459418)
 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 22 Feb 2013 15:01:50 +0000

Changed in lazr.restfulclient (Ubuntu Quantal):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Related questions

Remote bug watches

Bug watches keep track of this bug in other bug trackers.