Comment 8 for bug 1672624

Matthew Booth (mbooth-9) wrote :

Well, the operator hasn't done a local delete, they've just done a delete. Nova did a local delete because nova-compute wasn't running. We also wouldn't want to fence in this case because, in general, we don't want to kill running instances unless we really have to. A nova-compute outage doesn't qualify as 'really have to' under most circumstances, because the service can just be restarted: an HA response should focus on trying to get it back up again. I don't think we can reasonably blame the operator here.

As you say, though, this does mean that an attempt to delete the volume fails, because Ceph won't allow us to delete a volume that still has an active connection.

Presumably, when nova-compute eventually comes back up, it will clean up the running instance which was deleted, and at that point the Ceph volume can presumably be deleted too. In other words, if we wait a bit for normal maintenance to happen, this will resolve itself automatically. I think that's a cloudy 'working as expected', especially for a weird edge case like this.