Backend failures aren't being handled correctly

Bug #1252277 reported by Thomas Leaman
This bug affects 1 person
Affects: Glance
Status: Triaged
Importance: Low
Assigned to: Dharini Chandrasekar
Milestone: (none listed)

Bug Description

If a request to the backend fails partway through, the exception raised
may be treated as a problem in Glance itself. Any request to Glance that
fails in this way may therefore not be terminated correctly (which can
lead to Nova hanging). The exception should instead be passed up the
call stack so that the request terminates gracefully.

Changed in glance:
importance: Undecided → Low
status: New → Triaged
Mark Washenberger (markwash) wrote :

Some specific examples, especially leading to Nova hanging, would be fantastic here.

Thomas Leaman (thomas-leaman) wrote :

We've seen instances where a download from a Swift backend will drop the connection partway through (which I think raises an IOError in Glance) resulting in "Fetch of cache file failed ([Errno 104] Connection reset by peer)". That is, Glance is interpreting the error as relating to its caching of the image even though the error is actually in the connection to Swift.
If this occurs while Nova is trying to fetch an image from Glance, Glance will not report the error back to Nova and the connection between Nova and Glance will remain open (with no data flowing).

Mark Washenberger (markwash) wrote :

Very interesting.

This looks like something we need to fix in the caching layer. The problem seems to be that cache_tee_iter in glance/image_cache/__init__.py does not correctly distinguish between caching errors and other errors.

Maybe the best solution is to have open_for_write in glance/image_cache/__init__.py do a better job of distinguishing between exceptions that occur during cache_file.write(chunk) and exceptions that occur in the calling context (which would appear to occur in "yield chunk")?
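
To make the distinction concrete, here is a minimal sketch of a tee iterator that treats only failures from cache_file.write(chunk) as caching errors. It is not Glance's actual code: tee_to_cache, on_cache_error and failing_backend are hypothetical names, and it assumes Python 3, where IOError is an alias of OSError.

```python
import io


def tee_to_cache(source_iter, cache_file, on_cache_error=None):
    """Yield chunks from source_iter while copying them into cache_file.

    Hypothetical helper, not part of Glance. Only the write to the cache
    is wrapped in try/except, so errors raised by the backend iterator or
    thrown in at "yield chunk" are never mistaken for caching errors.
    """
    caching = True
    for chunk in source_iter:  # backend errors propagate from here unchanged
        if caching:
            try:
                cache_file.write(chunk)
            except OSError as exc:
                # Caching failed: stop caching but keep serving the client.
                caching = False
                if on_cache_error is not None:
                    on_cache_error(exc)
        yield chunk  # outside the try block on purpose


def failing_backend():
    """Stand-in for a backend download that drops partway through."""
    yield b"abc"
    raise ConnectionResetError(104, "Connection reset by peer")


if __name__ == "__main__":
    cache = io.BytesIO()
    try:
        for chunk in tee_to_cache(failing_backend(), cache):
            pass
    except ConnectionResetError as exc:
        print("backend error reached the caller:", exc)
```

With this shape, a dropped backend connection surfaces to whoever is iterating, so the request can be terminated, while a failed cache write only disables caching for that download. The caller would still need to discard the partially written cache file in both cases.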

Changed in glance:
assignee: nobody → Dharini Chandrasekar (dharini-chandrasekar)