On 10/10/2013 08:00 AM, Jay Lau wrote:
> what about simply let nova compute report error directly if reboot a not
> running vm?
That's fine by me, though it would mean an incompatible change in
behavior. I don't know what the policy is on such things.
> How to enable "soft reboot attempt can do a better job of checking for
> failures"? Can you please give more detail?
I mean currently, I'm guessing the logic is something like this:
    soft_reboot()
    while 2 minutes have not elapsed {
        if instance has rebooted {
            return success
        }
    }
    hard_reboot()
When the soft reboot is attempted, there must be some error, because it
is logged (can't reboot a shutdown instance). But that error has absolutely
no bearing on the reboot procedure; the code just waits two minutes, even
though it's clear (from the error) that the soft reboot has failed. The
logic could be this:
    try {
        soft_reboot()
    } catch {
        hard_reboot()
    }
    while 2 minutes have not elapsed {
        if instance has rebooted {
            return success
        }
    }
    hard_reboot()
Or, if a change in behavior is acceptable, as you suggest, it could be:
    if instance is not running {
        return failure "can not reboot a non-running instance"
    }
    ...reboot as it is currently done
Either way, the confusion is eliminated.
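To make the two options concrete, here is a rough Python sketch of what I have in mind. It is not nova code: SoftRebootError, the reboot() helper, its callback parameters, and the dict-based instance are all made up for illustration, and the real driver interface may look quite different. One choice of mine to note: when the soft reboot raises, the sketch returns right after the fallback hard reboot instead of entering the wait loop.

```python
import time


class SoftRebootError(Exception):
    """Hypothetical error: the soft reboot could not even be started."""


def reboot(instance, soft_reboot, hard_reboot, has_rebooted,
           timeout=120.0, poll_interval=1.0, fail_if_not_running=False,
           clock=time.monotonic, sleep=time.sleep):
    """Reboot an instance and return "soft" or "hard".

    All parameters are illustrative assumptions: `instance` is a plain
    dict with a "running" key, and the three callables stand in for
    whatever the real driver provides.
    """
    # Option 2: refuse up front (an incompatible change in behavior).
    if fail_if_not_running and not instance.get("running", False):
        raise RuntimeError("can not reboot a non-running instance")

    # Option 1: let a soft reboot failure trigger the hard reboot at once
    # instead of silently waiting out the full timeout.
    try:
        soft_reboot(instance)
    except SoftRebootError:
        hard_reboot(instance)
        return "hard"

    # The soft reboot was accepted: poll until it finishes or time runs out.
    deadline = clock() + timeout
    while clock() < deadline:
        if has_rebooted(instance):
            return "soft"
        sleep(poll_interval)

    # Timed out, so fall back to a hard reboot as is done today.
    hard_reboot(instance)
    return "hard"
```

With fail_if_not_running left at False, a shut-down instance gets a hard reboot immediately; with it set to True, the caller gets the error right away instead.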