Error 504 when disabling a nova-compute service recently down
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
masakari | Triaged | High | Unassigned |
Bug Description
By default, the first task executed when receiving a notification for a host being down is disable_
When a host is down but still seen as up by the nova control plane (it can take up to 60 seconds for the failure to be reported), any attempt to disable the nova-compute service ends with a timeout and an error.
This bug was first reported in nova (see: https:/
As answered in the nova bug report, since Train the force-down API call is expected to be used instead of the disable service call (this is what the RH OSP instanceHA feature does: https:/
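To make the difference between the two calls concrete, here is a minimal sketch of the two requests, assuming compute API microversion 2.53 or later, where both go through `PUT /os-services/{service_id}` and differ only in the body. The helper names are hypothetical, the functions only build the request and send nothing, and this is not masakari's actual code.

```python
import json

# Assumption: microversion 2.53+, where services are addressed by UUID.
COMPUTE_API_VERSION = "compute 2.53"


def _build_service_update(service_id, payload):
    """Return (method, path, headers, body) for a service update request."""
    headers = {
        "Content-Type": "application/json",
        "OpenStack-API-Version": COMPUTE_API_VERSION,
    }
    return "PUT", f"/os-services/{service_id}", headers, json.dumps(payload)


def build_disable_request(service_id):
    """Disable call: the one that times out while the host looks up."""
    return _build_service_update(service_id, {"status": "disabled"})


def build_force_down_request(service_id):
    """Force-down call: marks the service as down in the control plane."""
    return _build_service_update(service_id, {"forced_down": True})
```

An authenticated client would then send the returned method, path, headers and body to the compute endpoint.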
This raises another question. As described in the nova api-ref (https:/
One option for now would be to drop the task and keep only the "wait 60 seconds" part, but I do not know the history of why this task exists.
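As a rough illustration of the "wait" part on its own, the sketch below polls until the control plane reports the service as down, or gives up after a timeout. `is_reported_down` is a hypothetical callable supplied by the caller (for example, a wrapper around a service listing), and the 60-second default only mirrors the delay mentioned above. This is not how masakari implements it.

```python
import time


def wait_until_reported_down(is_reported_down, timeout=60.0, interval=5.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll until is_reported_down() returns True or the timeout expires.

    Returns True if the service was reported down in time, False otherwise.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_reported_down():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Polling rather than sleeping for a fixed 60 seconds lets the workflow continue as soon as nova notices the failure, at the cost of a few extra API calls.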
Changed in masakari:
status: New → Triaged
importance: Undecided → High