Failed retype with driver raised exception should set volume status to "error"

Bug #1305550 reported by Li Min Liu
This bug affects 2 people
Affects: Cinder
Importance: Undecided
Assigned to: Unassigned

Bug Description

When a volume is retyped and the driver does not support the operation, the driver returns False; cinder-volume then creates a new copy of the volume and copies the data to the new one.

If the driver raises an exception, cinder-volume should report the error and set the volume status to "error"; it should not fall back to the generic mechanism.
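A minimal sketch of the requested control flow, not Cinder's actual code: `FakeVolume`, `retype`, `driver_retype`, and `generic_migrate` are hypothetical names introduced only for illustration. It assumes the driver call returns True when it handled the retype, returns False when it does not support it, and raises on failure.

```python
class FakeVolume:
    """Hypothetical stand-in for a volume record; only the status matters here."""

    def __init__(self, status="retyping"):
        self.status = status


def retype(volume, new_type, driver_retype, generic_migrate):
    """Return "driver" or "generic" depending on which path did the retype.

    driver_retype and generic_migrate are assumed callables taking
    (volume, new_type); they are not real Cinder APIs.
    """
    try:
        handled = driver_retype(volume, new_type)
    except Exception:
        # The driver failed outright: report the error on the volume
        # and do NOT fall back to the generic copy mechanism.
        volume.status = "error"
        raise

    if handled:
        volume.status = "available"
        return "driver"

    # The driver returned False (unsupported): fall back to creating a
    # new volume and copying the data over.
    generic_migrate(volume, new_type)
    volume.status = "available"
    return "generic"
```

The point of the sketch is only the distinction between "returned False" (fall back) and "raised" (set "error" and stop).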

Li Min Liu (liminliu)
Changed in cinder:
assignee: nobody → Li Min Liu (liminliu)
Li Min Liu (liminliu) wrote :
description: updated
summary: - retype failed when driver rise exception
+ Failed retype with driver raise exception should set volume status to
+ "error"
summary: - Failed retype with driver raise exception should set volume status to
+ Failed retype with driver raised exception should set volume status to
"error"
Changed in cinder:
status: New → In Progress
John Griffith (john-griffith) wrote :

There was quite a bit of debate around this. We might want to consider a "list-valid-types" for a volume; otherwise this becomes a bit of a guessing game and we litter the user's environment with volumes in an "error" state that they can't use.

The other option is a sub-status (last operation status) since there are cases where the volume is "ok" but it couldn't complete an operation.

I'm not crazy about just changing the status to 'error' only because we don't give the end user a method to know what to do here.

Jay Bryant (jsbryant) wrote :

John,

Through the review above, both Winston-D and Avishay have indicated that they feel the volume status should go to error. Are you ok with this initial patch to set the error status, which I believe is better than what we are currently doing, and then adding a blueprint for implementing something like list-valid-types? It doesn't seem appropriate to put that change in with this patch, as it is more of a new function.

What do you think?

John Griffith (john-griffith) wrote :

Jay,
My point above was that we don't give an end user any way to "know" what valid retype combinations are. So having them just "guess" and then if they guess incorrectly putting their volume in an unusable state (error) seems like a bad end user experience. Keep in mind that the end user would have to contact an admin to troubleshoot what happened as well as reset the status on the volume to make it usable again. This seems like a really bad experience in my opinion.

Why not just set a special "new" status like "invalid-retype-request" that doesn't block the volume from being used going forward but still gives a reasonably informative status? Then clear it on any subsequent call. It would basically be treated just like "available".
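A rough sketch of how such an informational status could behave, assuming a volume is just a dict with a "status" key. The helper names (`record_invalid_retype`, `is_usable`, `begin_operation`) are hypothetical and do not exist in Cinder.

```python
# Statuses from which a new operation may start. "invalid-retype-request"
# is the status proposed above; treating it like "available" is the idea
# being illustrated, not existing behaviour.
USABLE_STATUSES = {"available", "invalid-retype-request"}


def record_invalid_retype(volume):
    """Mark the volume after a rejected retype without blocking further use."""
    volume["status"] = "invalid-retype-request"


def is_usable(volume):
    """True if the volume can accept a new operation."""
    return volume["status"] in USABLE_STATUSES


def begin_operation(volume, operation_status):
    """Start a new operation; this overwrites (clears) the informational status."""
    if not is_usable(volume):
        raise ValueError("volume is not usable in status %r" % volume["status"])
    volume["status"] = operation_status
```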

Mike Perez (thingee) wrote :

Can we stop trying to make the status field complicated? I'd rather just say "error" and have a more detailed field explaining why it's set to error, versus "this-volume-is-in-error-because-of-this". There are some headaches in having yet another field that needs to be updated, but I think we just need an interface for updating state rather than calling the db api.

Jay Bryant (jsbryant) wrote :

Mike,

I agree that the status field shouldn't be complicated. We, however, need to improve Cinder's ability to communicate to the user what has gone wrong when something does go wrong. If there was a relatively easy way to get the error information that currently just goes into the logs into a status field that the user can see, that would be a huge improvement.

If we don't change the status field, however, is there a way to address the concern that John raises above about the volume being left in a state that the average user cannot recover from? That issue also needs to be taken into account. Is there a way that we can leave the status field as error but still let users recover when the reason is something like 'Backend doesn't support the type you were trying to retype to'?

Mike Perez (thingee) wrote :

It shouldn't be up to the user to recover a volume. Cinder should just recover the volume and prepare it for a retry, either retrying itself or having the user try again (ideally itself). If a retry fails, there is something seriously wrong with what the user is trying to do: either user error or the environment itself. Keep the status field simple. Error means just stop; the volume can't be recovered and not even Cinder can help. Available means do something with it. When something is in a state that is more complicated than one word, more information can be found in the status description field.
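One possible reading of this proposal, sketched under assumptions: the `Volume` dataclass, its `status_description` field, and `retype_with_recovery` are hypothetical names, and the retry count of two is arbitrary. The status stays a single word and the detail goes into the description.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical volume record with a one-word status plus free-text detail."""
    status: str = "available"
    status_description: str = ""


def retype_with_recovery(volume, attempt, max_attempts=2):
    """Run attempt() up to max_attempts times; return True on success.

    attempt is an assumed zero-argument callable that raises on failure.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            attempt()
        except Exception as exc:
            # Recover: keep the volume usable and note what went wrong,
            # then let the loop retry.
            last_error = exc
            volume.status = "available"
            volume.status_description = "retype attempt failed: %s" % exc
            continue
        volume.status = "available"
        volume.status_description = ""
        return True

    # Every attempt failed: "error" means stop, with the reason recorded
    # in the description rather than encoded in the status itself.
    volume.status = "error"
    volume.status_description = "retype failed: %s" % last_error
    return False
```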

Li Min Liu (liminliu)
Changed in cinder:
status: In Progress → Opinion
Alan Pevec (apevec) wrote :

stable-maint script misdetected this bug reference in d17277d37d41072b113738d95c0a3bf3a6ab2549
"This change is necessary to enable resolving bug 1305550"

no longer affects: cinder/icehouse
Sean McGinnis (sean-mcginnis) wrote : Bug Assignee Expired

Unassigning due to no activity for > 6 months.

Changed in cinder:
assignee: Li Min Liu (liminliu) → nobody