Saw this recently in a real deployment - a nova-compute service locked up for 12 hours (looks like that was a kernel bug) and during that time a user tried to hard reboot their instance.
The reboot cast message was lost, so the instance stayed in task_state=REBOOTING_HARD. After the compute node came back, the user wasn't able to reboot the instance because:
    @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED,
                                    vm_states.RESCUED],
                          task_state=[None, task_states.REBOOTING])
    def reboot(self, context, instance, reboot_type):
i.e. reboot is not allowed while in task_state=REBOOTING_HARD, since only None and REBOOTING are accepted.
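To make the failure mode concrete, here's a rough sketch of how a state-checking decorator like this rejects the call. This is not nova's actual check_instance_state: the InstanceInvalidState class, the API class, the dict-based instance and the plain-string state values are all stand-ins assumed for illustration.

```python
import functools


class InstanceInvalidState(Exception):
    """Hypothetical stand-in for the error raised on a disallowed state."""


def check_instance_state(vm_state=None, task_state=None):
    """Simplified sketch: only run the method if the instance state is allowed."""
    def outer(f):
        @functools.wraps(f)
        def inner(self, context, instance, *args, **kwargs):
            if vm_state is not None and instance["vm_state"] not in vm_state:
                raise InstanceInvalidState("vm_state=%s" % instance["vm_state"])
            if task_state is not None and instance["task_state"] not in task_state:
                raise InstanceInvalidState("task_state=%s" % instance["task_state"])
            return f(self, context, instance, *args, **kwargs)
        return inner
    return outer


class API(object):
    """Hypothetical API class; states are plain strings for illustration."""

    @check_instance_state(vm_state=["active", "stopped", "rescued"],
                          task_state=[None, "rebooting"])
    def reboot(self, context, instance, reboot_type):
        return "reboot requested (%s)" % reboot_type


# An instance whose hard reboot cast was lost: task_state never got reset.
stuck = {"vm_state": "active", "task_state": "rebooting_hard"}
try:
    API().reboot(None, stuck, "HARD")
except InstanceInvalidState as exc:
    print("rejected: %s" % exc)
```

With the stuck instance above this prints "rejected: task_state=rebooting_hard", and nothing the user can do through reboot will clear it.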
Looking at https://review.openstack.org/5090 and https://review.openstack.org/12368 I'm reading into Vish's comments that there can be problems if you do attempt a hard reboot while a hard reboot is in progress.
ISTM that if that's a concern, the compute manager should just take a lock on the instance while it's rebooting.
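A minimal sketch of what I mean by a per-instance lock, using only the stdlib. The ComputeManager class here, its _instance_lock helper and the log list are made up for illustration and are not nova code; nova runs on greenthreads and has its own locking utilities, so a real patch would presumably use those rather than threading.Lock.

```python
import collections
import threading


class ComputeManager(object):
    """Hypothetical stand-in, not nova's ComputeManager."""

    def __init__(self):
        # One lock per instance UUID, created on first use.
        self._locks = collections.defaultdict(threading.Lock)
        # Guards the dict itself so two threads can't race creating a lock.
        self._locks_guard = threading.Lock()
        self.log = []

    def _instance_lock(self, instance_uuid):
        with self._locks_guard:
            return self._locks[instance_uuid]

    def reboot_instance(self, instance_uuid, reboot_type):
        # A second reboot of the same instance blocks here until the
        # first one finishes; other instances are unaffected.
        with self._instance_lock(instance_uuid):
            self.log.append((instance_uuid, reboot_type, "start"))
            # ... the virt driver reboot would happen here ...
            self.log.append((instance_uuid, reboot_type, "end"))


mgr = ComputeManager()
workers = [threading.Thread(target=mgr.reboot_instance, args=("uuid-1", "HARD"))
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

With the lock held for the whole reboot, overlapping requests get serialized in the manager, so the API-level check wouldn't need to exclude REBOOTING_HARD to stay safe.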