Comment 2 for bug 982108

Mark McLoughlin (markmc) wrote :

Saw this recently in a real deployment - a nova-compute service locked up for 12 hours (looks like that was a kernel bug), and during that time a user tried to hard reboot their instance.

The reboot cast message was lost, so the instance stayed in task_state=REBOOTING_HARD. After the compute node came back, the user wasn't able to reboot the instance because:

    @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED,
                                    vm_states.RESCUED],
                          task_state=[None, task_states.REBOOTING])
    def reboot(self, context, instance, reboot_type):

i.e. reboot is not allowed while in task_state=REBOOTING_HARD, because only None and REBOOTING are in the permitted task_state list.
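To make the failure mode concrete, here is a minimal standalone sketch of how a state-checking decorator of this shape rejects the call. It is not nova's actual implementation: the exception class, the dict-based instance, and the string state values are all stand-ins I've made up for illustration.

```python
import functools


class InstanceInvalidState(Exception):
    """Hypothetical stand-in for the error raised on a disallowed state."""


def check_instance_state(vm_state=None, task_state=None):
    """Sketch of a decorator that only lets the call through when the
    instance's vm_state and task_state are in the permitted lists."""
    def outer(f):
        @functools.wraps(f)
        def inner(self, context, instance, *args, **kwargs):
            if vm_state is not None and instance["vm_state"] not in vm_state:
                raise InstanceInvalidState("vm_state=%s" % instance["vm_state"])
            if task_state is not None and instance["task_state"] not in task_state:
                raise InstanceInvalidState("task_state=%s" % instance["task_state"])
            return f(self, context, instance, *args, **kwargs)
        return inner
    return outer


class API(object):
    # Same shape as the snippet above, with plain strings as assumed values.
    @check_instance_state(vm_state=["active", "stopped", "rescued"],
                          task_state=[None, "rebooting"])
    def reboot(self, context, instance, reboot_type):
        return "reboot cast sent"


def try_reboot(task_state):
    """Return True if the reboot call is accepted for this task_state."""
    instance = {"vm_state": "active", "task_state": task_state}
    try:
        API().reboot(None, instance, "HARD")
        return True
    except InstanceInvalidState:
        return False
```

With these assumed values, try_reboot("rebooting_hard") is refused while try_reboot(None) and try_reboot("rebooting") go through, which is why an instance stuck in REBOOTING_HARD after a lost cast can't be rebooted again through the API.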

Looking at https://review.openstack.org/5090 and https://review.openstack.org/12368, I read Vish's comments as saying there can be problems if you attempt a hard reboot while a hard reboot is already in progress.

ISTM that if that's a concern, the compute manager should just take a lock on the instance while it's rebooting.
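Roughly what I have in mind, as a sketch only: one lock per instance, held for the duration of the reboot, so a second hard reboot for the same instance waits instead of overlapping. This uses plain stdlib threading rather than whatever locking helper the compute manager would really use, and lock_for, reboot_instance and do_reboot are hypothetical names, not existing nova functions.

```python
import threading
from collections import defaultdict

# One lock per instance uuid; the guard protects the registry itself.
_instance_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def lock_for(instance_uuid):
    """Return the lock for this instance, creating it on first use."""
    with _registry_guard:
        return _instance_locks[instance_uuid]


def reboot_instance(instance_uuid, do_reboot):
    """Run do_reboot(instance_uuid) while holding that instance's lock.

    A concurrent reboot of the same instance blocks until this one
    finishes; reboots of other instances are not affected.
    """
    with lock_for(instance_uuid):
        return do_reboot(instance_uuid)
```

With something like that in place, the API-level check could accept REBOOTING_HARD as well, since the compute side would serialize the reboots itself.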