Glance API layer doesn't catch some store exceptions leading to 500 errors

Bug #1264029 reported by David Koo
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
glance_store
Triaged
Medium
Dharini Chandrasekar

Bug Description

The Glance API layer currently doesn't catch some exceptions raised by the underlying store layer, resulting in a "500 Internal Server Error".

For e.g. when using the HTTP store, if the remote HTTP server is down, then an "image-download" operation will fail and the HTTP store will raise a BadStoreUri exception but the exception is not caught all the way to the top and results in the 500 error.

Again, if the remote server is up but the specified image file is not found then the remote HTTP server will return a 404 error but the HTTP store converts all 4XX errors to a BadStoreUri exception (HTTP store bug?) which again is not caught.

I think a) underlying stores should return appropriate error codes/exceptions so that the higher layers can take meaningful action or produce meaningful error messages and b) the API layer should catch all exceptions raised by the lower layers so that the user never gets to see a 500 error.

David Koo (kpublicmail)
Changed in glance:
assignee: nobody → David Koo (kpublicmail)
status: New → In Progress
David Koo (kpublicmail)
description: updated
Revision history for this message
Zhi Yan Liu (lzy-dev) wrote :

@David, your point #b is make sense to me, but for #a I think what current http store did is reasonable, convert backend storage exception to BadStoreUri is easy and enough, and this design can work for/cover more exception path for all store drivers currently.

Changed in glance:
importance: Undecided → Medium
Revision history for this message
David Koo (kpublicmail) wrote :

I was wrong about the first example - if the server is down then a socket.error (which is an IOError) is thrown by the underlying socket module, not a BadStoreUri error.

Regarding a catchall BadStoreUri, though it makes coding easier, I feel would it make debugging and troubleshooting more difficult. For e.g. Glance and the remote store are controlled by two different operators. Then, if the user encounters a failure for a Glance operation and only sees BadStoreUri, he doesn't know what to do and how to escalate the issue with the "store operator" - is it an authentication error, or that the object wasn't found or some permission error ... he doesn't know how to describe the problem to the store operator.

For that matter, he doesn't even know if it's a bug in the glance code logic that's giving rise to the problem. He has to guess what Glance is trying to do with the backend store and try to repeat that operation to see what error it was.

What do you think?

Revision history for this message
Zhi Yan Liu (lzy-dev) wrote :

@David, do you think backend store issue resolving/debug is a responsibility for cloud admin/operator or end user? I believe it is former, agree that? So I think in store drivers we can just write those backend storage related exceptions to Glance log only instead expose them to end user, since IMO a) those details is not valuable and make sensible to end user; and b) the exception probably can contains sensitive information, like Cloud internal deployment details and/or credentials.

Revision history for this message
David Koo (kpublicmail) wrote :

@Zhiyan: I was thinking something like the cloud admin/operator was one guy/company but the storage operator was another. But I admit, that was more a theoretical scenario - I have no proof or figures of how (un)common it is or would be in real deployments.

I think the middle road is good - we log those errors for the sake of debugging and diagnostics, but don't necessarily expose them to the end user.

And I agree - even though we log them, we must be careful not to expose sensitive information (even in the logs).

I'll upload the patchset on Monday when I get back to work. I hope you'll help out in the reviews :).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance (master)

Fix proposed to branch: master
Review: https://review.openstack.org/64409

Erno Kuvaja (jokke)
tags: added: propose-close
Changed in glance:
assignee: David Koo (kpublicmail) → nobody
status: In Progress → New
Revision history for this message
Ian Cordasco (icordasc) wrote :

Given the patchset sent by David seems to have been entirely contained within what is now glance-store, I've moved this over and we can determine if it still needs to be addressed there.

affects: glance → glance-store
Changed in glance-store:
importance: Medium → Undecided
Changed in glance-store:
assignee: nobody → Cindy Pallares (cindy-pallaresq)
Revision history for this message
Flavio Percoco (flaper87) wrote :

Cindy, are you working on this?

Changed in glance-store:
importance: Undecided → Medium
status: New → Triaged
Changed in glance-store:
assignee: Cindy Pallares (cpallares) → nobody
Changed in glance-store:
assignee: nobody → Dharini Chandrasekar (dharini-chandrasekar)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.